The MIT Encyclopedia of the Cognitive Sciences

Edited by Robert A. Wilson and Frank C. Keil

A Bradford Book
The MIT Press
Cambridge, Massachusetts
London, England

© 1999 Massachusetts Institute of Technology

All rights reserved. No part of this book may be reproduced in any form by any electronic or mechanical means (including photocopying, recording, or information storage and retrieval) without permission in writing from the publisher.

Library of Congress Cataloging-in-Publication Data
The MIT encyclopedia of the cognitive sciences / edited by Robert A. Wilson, Frank C. Keil.
p. cm.
“A Bradford book.”
Includes bibliographical references and index.
ISBN 0-262-73124-X (pbk. : alk. paper)
1. Cognitive science--Encyclopedias I. Wilson, Robert A. (Robert Andrew) II. Keil, Frank C., 1952– .
BF311.M556 1999
153’.03--dc21 99-11115 CIP

To the memory of Henry Bradford Stanton (a.k.a. “Harry the hat”), 1921–1997, and to his wife Betty upon her retirement, after twenty-one years with Bradford Books. Harry and Betty were its cofounders and a major force in their own right in the flowering and cross-fertilization of the interdisciplinary cognitive sciences.

Contents

List of Entries
Preface
Philosophy, Robert A. Wilson
Psychology, Keith J. Holyoak
Neurosciences, Thomas D. Albright and Helen J. Neville
Computational Intelligence, Michael I. Jordan and Stuart Russell
Linguistics and Language, Gennaro Chierchia
Culture, Cognition, and Evolution, Dan Sperber and Lawrence Hirschfeld
The MIT Encyclopedia of the Cognitive Sciences
List of Contributors
Name Index
Subject Index

List of Entries

Acquisition, Formal Theories of
Adaptation and Adaptationism
Affordances
Aging and Cognition
Aging, Memory, and the Brain
AI and Education
Algorithm
Altruism
Ambiguity
Amygdala, Primate
Analogy
Anaphora
Animal Communication
Animal Navigation
Animal Navigation, Neural Networks
Animism
Anomalous Monism
Aphasia
Articulation
Artifacts and Civilization
Artificial Life
Attention
Attention in the Animal Brain
Attention in the Human Brain
Attribution Theory
Audition
Auditory Attention
Auditory Physiology
Auditory Plasticity
Autism
Automata
Automaticity
Autonomy of Psychology
Bartlett, Frederic Charles
Basal Ganglia
Bayesian Learning
Bayesian Networks
Behavior-Based Robotics
Behaviorism
Bilingualism and the Brain
Binding by Neural Synchrony
Binding Problem
Binding Theory
Blindsight
Bloomfield, Leonard
Boas, Franz
Bounded Rationality
Brentano, Franz
Broadbent, Donald E.
Broca, Paul
Cajal, Santiago Ramón y
Case-Based Reasoning and Analogy
Categorial Grammar
Categorization
Causal Reasoning
Causation
Cerebellum
Cerebral Cortex
Chess, Psychology of
Chinese Room Argument
Church-Turing Thesis
Codeswitching
Cognitive Anthropology
Cognitive Archaeology
Cognitive Architecture
Cognitive Artifacts
Cognitive Development
Cognitive Ergonomics
Cognitive Ethology
Cognitive Linguistics
Cognitive Maps
Cognitive Modeling, Connectionist
Cognitive Modeling, Symbolic
Color Categorization
Color, Neurophysiology of
Color Vision
Columns and Modules
Comparative Psychology
Compositionality
Computation
Computation and the Brain
Computational Complexity
Computational Learning Theory
Computational Lexicons
Computational Linguistics
Computational Neuroanatomy
Computational Neuroscience
Computational Psycholinguistics
Computational Theory of Mind
Computational Vision
Computing in Single Neurons
Concepts
Conceptual Change
Conditioning
Conditioning and the Brain
Connectionism, Philosophical Issues
Connectionist Approaches to Language
Consciousness
Consciousness, Neurobiology of
Constraint Satisfaction
Context and Point of View
Control Theory
Cooperation and Competition
Cortical Localization, History of
Creativity
Creoles
Cultural Consensus Theory
Cultural Evolution
Cultural Psychology
Cultural Relativism
Cultural Symbolism
Cultural Variation
Darwin, Charles
Decision Making
Decision Trees
Deductive Reasoning
Depth Perception
Descartes, René
Discourse
Dissonance
Distinctive Features
Distributed vs. Local Representation
Domain-Specificity
Dominance in Animal Social Groups
Dreaming
Dynamic Approaches to Cognition
Dynamic Programming
Dynamic Semantics
Dyslexia
Ebbinghaus, Hermann
Echolocation
Ecological Psychology
Ecological Validity
Economics and Cognitive Science
Education
Electrophysiology, Electric and Magnetic Evoked Fields
Eliminative Materialism
Emergentism
Emotion and the Animal Brain
Emotion and the Human Brain
Emotions
Epiphenomenalism
Episodic vs. Semantic Memory
Epistemology and Cognition
Essentialism
Ethics and Evolution
Ethnopsychology
Ethology
Evolution
Evolution of Language
Evolutionary Computation
Evolutionary Psychology
Expertise
Explanation
Explanation-Based Learning
Explanatory Gap
Extensionality, Thesis of
Eye Movements and Visual Attention
Face Recognition
Feature Detectors
Figurative Language
Focus
Folk Biology
Folk Psychology
Formal Grammars
Formal Systems, Properties of
Frame-Based Systems
Frame Problem
Frege, Gottlob
Freud, Sigmund
Functional Decomposition
Functional Role Semantics
Functionalism
Fuzzy Logic
Game-Playing Systems
Game Theory
Generative Grammar
Geschwind, Norman
Gestalt Perception
Gestalt Psychology
Gibson, James Jerome
Gödel’s Theorems
Golgi, Camillo
Grammar, Neural Basis of
Grammatical Relations
Greedy Local Search
Grice, H. Paul
Haptic Perception
Head-Driven Phrase Structure Grammar
Head Movement
Hebb, Donald O.
Helmholtz, Hermann Ludwig Ferdinand von
Hemispheric Specialization
Heuristic Search
Hidden Markov Models
High-Level Vision
Hippocampus
Human-Computer Interaction
Human Navigation
Human Universals
Hume, David
Illusions
Imagery
Imitation
Implicature
Implicit vs. Explicit Memory
Indexicals and Demonstratives
Individualism
Induction
Inductive Logic Programming
Infant Cognition
Information Theory
Informational Semantics
Innateness of Language
Intelligence
Intelligent Agent Architecture
Intentional Stance
Intentionality
Intersubjectivity
Introspection
Jakobson, Roman
James, William
Judgment Heuristics
Justification
Kant, Immanuel
Knowledge Acquisition
Knowledge-Based Systems
Knowledge Representation
Language Acquisition
Language and Communication
Language and Culture
Language and Gender
Language and Thought
Language Impairment, Developmental
Language, Neural Basis of
Language of Thought
Language Production
Language Variation and Change
Lashley, Karl S.
Learning
Learning Systems
Lévi-Strauss, Claude
Lexical Functional Grammar
Lexicon
Lexicon, Neural Basis
Lightness Perception
Limbic System
Linguistic Relativity Hypothesis
Linguistic Universals and Universal Grammar
Linguistics, Philosophical Issues
Literacy
Logic
Logic Programming
Logical Form in Linguistics
Logical Form, Origins of
Logical Omniscience, Problem of
Logical Reasoning Systems
Long-Term Potentiation
Luria, Alexander Romanovich
Machiavellian Intelligence Hypothesis
Machine Learning
Machine Translation
Machine Vision
Magic and Superstition
Magnetic Resonance Imaging
Malinowski, Bronislaw
Manipulation and Grasping
Marr, David
McCulloch, Warren S.
Meaning
Memory
Memory, Animal Studies
Memory, Human Neuropsychology
Memory Storage, Modulation of
Mental Causation
Mental Models
Mental Representation
Mental Retardation
Mental Rotation
Metacognition
Metaphor
Metaphor and Culture
Metareasoning
Metarepresentation
Meter and Poetry
Mid-Level Vision
Mind-Body Problem
Minimalism
Minimum Description Length
Mobile Robots
Modal Logic
Modeling Neuropsychological Deficits
Modularity and Language
Modularity of Mind
Moral Psychology
Morphology
Motion, Perception of
Motivation
Motivation and Culture
Motor Control
Motor Learning
Multiagent Systems
Multisensory Integration
Naive Mathematics
Naive Physics
Naive Sociology
Narrow Content
Nativism
Nativism, History of
Natural Kinds
Natural Language Generation
Natural Language Processing
Neural Development
Neural Networks
Neural Plasticity
Neuroendocrinology
Neuron
Neurotransmitters
Newell, Allen
Nonmonotonic Logics
Numeracy and Culture
Object Recognition, Animal Studies
Object Recognition, Human Neuropsychology
Oculomotor Control
Optimality Theory
Pain
Parameter-Setting Approaches to Acquisition, Creolization, and Diachrony
Parsimony and Simplicity
Pattern Recognition and Feed-Forward Networks
Penfield, Wilder
Perceptual Development
Phantom Limb
Phonetics
Phonological Rules and Processes
Phonology
Phonology, Acquisition of
Phonology, Neural Basis of
Physicalism
Piaget, Jean
Pictorial Art and Vision
Pitts, Walter
Planning
Polysynthetic Languages
Positron Emission Tomography
Possible Worlds Semantics
Poverty of the Stimulus Arguments
Pragmatics
Presupposition
Primate Cognition
Primate Language
Probabilistic Reasoning
Probability, Foundations of
Problem Solving
Production Systems
Propositional Attitudes
Prosody and Intonation
Prosody and Intonation, Processing Issues
Psychoanalysis, Contemporary Views
Psychoanalysis, History of
Psycholinguistics
Psychological Laws
Psychophysics
Qualia
Quantifiers
Radical Interpretation
Rational Agency
Rational Choice Theory
Rational Decision Making
Rationalism vs. Empiricism
Reading
Realism and Anti-Realism
Recurrent Networks
Reductionism
Reference, Theories of
Reinforcement Learning
Relational Grammar
Relevance and Relevance Theory
Religious Ideas and Practices
Retina
Robotics and Learning
Rules and Representations
Sapir, Edward
Saussure, Ferdinand de
Schemata
Scientific Thinking and Its Development
Self
Self-Knowledge
Self-Organizing Systems
Semantics
Semantics, Acquisition of
Semiotics and Cognition
Sensations
Sense and Reference
Sentence Processing
Sexual Attraction, Evolutionary Psychology of
Shape Perception
Sign Language and the Brain
Sign Languages
Signal Detection Theory
Similarity
Simulation vs. Theory-Theory
Single-Neuron Recording
Situated Cognition and Learning
Situatedness/Embeddedness
Situation Calculus
Sleep
Smell
Social Cognition
Social Cognition in Animals
Social Play Behavior
Sociobiology
Spatial Perception
Speech Perception
Speech Recognition in Machines
Speech Synthesis
Sperry, Roger Wolcott
Spoken Word Recognition
Statistical Learning Theory
Statistical Techniques in Natural Language Processing
Stereo and Motion Perception
Stereotyping
Stress
Stress, Linguistic
Structure from Visual Information Sources
Supervenience
Supervised Learning in Multilayer Neural Networks
Surface Perception
Syntax
Syntax, Acquisition of
Syntax-Semantics Interface
Taste
Technology and Human Evolution
Temporal Reasoning
Tense and Aspect
Teuber, Hans-Lukas
Texture
Thalamus
Thematic Roles
Theory of Mind
Time in the Mind
Tone
Top-Down Processing in Vision
Transparency
Turing, Alan Mathison
Tversky, Amos
Twin Earth
Typology
Uncertainty
Unity of Science
Unsupervised Learning
Utility Theory
Vagueness
Vision and Learning
Visual Anatomy and Physiology
Visual Cortex, Cell Types and Connections in
Visual Neglect
Visual Object Recognition, AI
Visual Processing Streams
Visual Word Recognition
Von Neumann, John
Vygotsky, Lev Semenovich
Walking and Running Machines
Wh-Movement
What-It’s-Like
Wiener, Norbert
Word Meaning, Acquisition of
Working Memory
Working Memory, Neural Basis of
Writing Systems
Wundt, Wilhelm
X-Bar Theory

Preface

The MIT Encyclopedia of the Cognitive Sciences (MITECS to its friends) has been four years in the making from conception to publication. It consists of 471 concise articles, nearly all of which include useful lists of references and further readings, preceded by six longer introductory essays written by the volume’s advisory editors. We see MITECS as being of use to students and scholars across the various disciplines that contribute to the cognitive sciences, including psychology, neuroscience, linguistics, philosophy, anthropology and the social sciences more generally, evolutionary biology, education, computer science, artificial intelligence, and ethology.

Although we prefer to let the volume speak largely for itself, it may help to provide some brief details about the aims and development of the project. One of the chief motivations for this undertaking was the sense that, despite a number of excellent works that overlapped with the ambit of cognitive science as it was traditionally conceived, there was no single work that adequately represented the full range of concepts, methods, and results derived and deployed in cognitive science over the last twenty-five years. Second, each of the various cognitive sciences differs in its focus and orientation; in addition, these have changed over time and will continue to do so in the future.
We see MITECS as aiming to represent the scope of this diversity, and as conveying a sense of both the history and future of the cognitive sciences. Finally, we wanted, through discussions with authors and as a result of editorial review, to highlight links across the various cognitive sciences so that readers from one discipline might gain a greater insight into relevant work in other fields. MITECS represents far more than an alphabetic list of topics in the cognitive sciences; it captures a good deal of the structure of the whole enterprise at this point in time, the ways in which ideas are linked together across topics and disciplines, as well as the ways in which authors from very different disciplines converge and diverge in their approaches to very similar topics. As one looks through the encyclopedia as a whole, one takes a journey through a rich and multidimensional landscape of interconnected ideas. Categorization is rarely just that, especially in the sciences. Ideas and patterns are related to one another, and the grounds for categorizations are often embedded in complex theoretical and empirical patterns. MITECS illustrates the richness and intricacy of this process and the immense value of cognitive science approaches to many questions about the mind.

All three of the motivations for MITECS were instrumental in the internal organization of the project. The core of MITECS is the 471 articles themselves, which were assigned to one of six fields that constitute the foundation of the cognitive sciences. One or two advisory editors oversaw the articles in each of these fields and contributed the introductory essays. The fields and the corresponding advisory editors are

Philosophy (Robert A. Wilson)
Psychology (Keith J. Holyoak)
Neurosciences (Thomas D. Albright and Helen J. Neville)
Computational Intelligence (Michael I. Jordan and Stuart Russell)
Linguistics and Language (Gennaro Chierchia)
Culture, Cognition, and Evolution (Dan Sperber and Lawrence Hirschfeld)

These editors advised us regarding both the topics and authors for the articles and assisted in overseeing the review process for each. Considered collectively, the articles represent much of the diversity to be found in the corresponding fields and indicate much of what has been, is, and might be of value for those thinking about cognition from one or another interdisciplinary perspective.

Each introduction has two broad goals. The first is to provide a road map through MITECS to the articles in the corresponding section. Because of the arbitrariness of assigning some articles to one section rather than another, and because of the interdisciplinary vision guiding the volume, the introductions mention not only the articles in the corresponding section but also others from overlapping fields. The second goal is to provide a perspective on the nature of the corresponding discipline or disciplines, particularly with respect to the cognitive sciences. Each introduction should stand as a useful overview of the field it represents. We also made it clear to the editors that their introductions did not have to be completely neutral and could clearly express their own unique perspectives. The result is a vibrant and engaging series of essays.

We have been fortunate in being able to enlist many of the world’s leading authorities as authors of the articles. Our directions to contributors were to write articles that are both representative of their topic and accessible to advanced undergraduates and graduate students in the field. The review process involved assigning two reviewers to each article, one an expert from within the same field, the other an outsider from another field represented in MITECS; nearly all reviewers were themselves contributors to MITECS.
In addition, every article was read by at least one of the general editors. Articles that did not seem quite right to either or both of us or to our reviewers were sometimes referred to the advisory editors. One might think that with such short articles (most being between 1,000 and 1,500 words in length), the multiple levels of review were unnecessary, but the selectivity that this brevity necessitated made such a review process all the more worthwhile. Relatedly, as more than one contributor noted in explaining his own tardiness: “This article would have been written sooner if it hadn’t been so short!”

Of course the content of the articles will be the chief source of their value to the reader, but given the imposed conciseness, an important part of their value is the guide that their references and further readings provide to the relevant literature. In addition, each article contains cross-references, indicated in SMALL CAPITALS, to related articles and a short list of “see also” cross-references at the end of the article. Responsibility for these cross-references lies ultimately with one of us (RAW), though we are thankful to those authors who took the time to suggest cross-references for their own articles.

We envisioned that many scholars would use MITECS as a frequent, perhaps even daily, tool in their research and have designed the references, readings, and cross-references with that use in mind. The electronic version will allow users to download relevant references into their bibliography databases along with considerable cross-classification information to aid future searches. Both of us are surprised at the extent to which we have already come to rely on drafts of articles in MITECS for these purposes in our own scholarly pursuits.
In the long list of people to thank, we begin with the contributors themselves, from whom we have learned much, both from their articles and their reviews of the articles of others, and to whom readers owe their first debt. Without the expertise of the advisory editors there is little chance that we would have arrived at a comprehensive range of topics or managed to identify and recruit many of the authors who have contributed to MITECS. And without their willingness to take on the chore of responding to our whims and fancies over a three-year period, and to write the section introductions, MITECS would have fallen short of its goals. Thanks Tom, Gennaro, Larry, Keith, Mike, Helen, Stuart, and Dan. At The MIT Press, we thank Amy Brand for her leadership and persistence, her able assistants Ed Sprague and Ben Bruening for their tech-know-how and hard work, and Sandra Minkkinen for editorial oversight of the process. Rob Wilson thanks his coterie of research assistants: Patricia Ambrose and Peter Piegaze while he was at Queen’s University; and Aaron Sklar, Keith Krueger, and Peter Asaro since he has been at the University of Illinois. His work on MITECS was supported, in part, by SSHRC Individual Three-Year Grant #410-96-0497, and a UIUC Campus Research Board Grant. Frank Keil thanks Cornell University for internal funds that were used to help support this project.

Philosophy
Robert A. Wilson

The areas of philosophy that contribute to and draw on the cognitive sciences are various; they include the philosophy of mind, science, and language; formal and philosophical logic; and traditional metaphysics and epistemology. The most direct connections hold between the philosophy of mind and the cognitive sciences, and it is with classical issues in the philosophy of mind that I begin this introduction (section 1).
I then briefly chart the move from the rise of materialism as the dominant response to one of these classic issues, the mind-body problem, to the idea of a science of the mind. I do so by discussing the early attempts by introspectionists and behaviorists to study the mind (section 2). Here I focus on several problems with a philosophical flavor that arise for these views, problems that continue to lurk backstage in the theater of contemporary cognitive science.

Between these early attempts at a science of the mind and today’s efforts lie two general, influential philosophical traditions, ordinary language philosophy and logical positivism. In order to bring out, by contrast, what is distinctive about the contemporary naturalism integral to philosophical contributions to the cognitive sciences, I sketch the approach to the mind in these traditions (section 3). And before getting to contemporary naturalism itself I take a quick look at the philosophy of science, in light of the legacy of positivism (section 4).

In sections 5 through 7 I get, at last, to the mind in cognitive science proper. Section 5 discusses the conceptions of mind that have dominated the contemporary cognitive sciences, particularly that which forms part of what is sometimes called “classic” cognitive science and that of its connectionist rival. Sections 6 and 7 explore two specific clusters of topics that have been the focus of philosophical discussion of the mind over the last 20 years or so, folk psychology and mental content. The final sections gesture briefly at the interplay between the cognitive sciences and logic (section 8) and biology (section 9).

1 Three Classic Philosophical Issues About the Mind

i. The Mental-Physical Relation

The relation between the mental and the physical is the deepest and most recurrent classic philosophical topic in the philosophy of mind, one very much alive today.
In due course, we will come to see why this topic is so persistent and pervasive in thinking about the mind. But to convey something of the topic’s historical significance let us begin with a classic expression of the puzzling nature of the relation between the mental and the physical, the MIND-BODY PROBLEM.

This problem is most famously associated with RENÉ DESCARTES, the preeminent figure of philosophy and science in the first half of the seventeenth century. Descartes combined a thorough-going mechanistic theory of nature with a dualistic theory of the nature of human beings that is still, in general terms, the most widespread view held by ordinary people outside the hallowed halls of academia. Although nature, including that of the human body, is material and thus completely governed by basic principles of mechanics, human beings are special in that they are composed both of material and nonmaterial or mental stuff, and so are not so governed. In Descartes’s own terms, people are essentially a combination of mental substances (minds) and material substances (bodies). This is Descartes’s dualism. To put it in more common-sense terms, people have both a mind and a body.

Although dualism is often presented as a possible solution to the mind-body problem, a possible position that one might adopt in explaining how the mental and physical are related, it serves better as a way to bring out why there is a “problem” here at all. For if the mind is one type of thing, and the body is another, how do these two types of things interact? To put it differently, if the mind really is a nonmaterial substance, lacking physical properties such as spatial location and shape, how can it be both the cause of effects in the material world, like making bodies move, and itself be causally affected by that world, as when a thumb slammed with a hammer (bodily cause) causes one to feel pain (mental effect)?
This problem of causation between mind and body has been thought to pose a largely unanswered challenge for Cartesian dualism.

It would be a mistake, however, to assume that the mind-body problem in its most general form is simply a consequence of dualism. For the general question as to how the mental is related to the physical arises squarely for those convinced that some version of materialism or PHYSICALISM must be true of the mind. In fact, in the next section, I will suggest that one reason for the resilience and relevance of the mind-body problem has been the rise of materialism over the last fifty years.

Materialists hold that all that exists is material or physical in nature. Minds, then, are somehow or other composed of arrangements of physical stuff. There have been various ways in which the “somehow or other” has been cashed out by physicalists, but even the view that has come closest to being a consensus view among contemporary materialists, that the mind supervenes on the body, remains problematic. Even once one adopts materialism, the task of articulating the relationship between the mental and the physical remains, because even physical minds have special properties, like intentionality and consciousness, that require further explanation. Simply proclaiming that the mind is not made out of distinctly mental substance, but is material like the rest of the world, does little to explain the features of the mind that seem to be distinctively if not uniquely features of physical minds.

ii. The Structure of the Mind and Knowledge

Another historically important cluster of topics in the philosophy of mind concerns what is in a mind. What, if anything, is distinctive of the mind, and how is the mind structured? Here I focus on two dimensions to this issue.

One dimension stems from the RATIONALISM VS. EMPIRICISM debate that reached a high point in the seventeenth and eighteenth centuries.
Rationalism and empiricism are views of the nature of human knowledge. Broadly speaking, empiricists hold that all of our knowledge derives from our sensory, experiential, or empirical interaction with the world. Rationalists, by contrast, hold the negation of this, that there is some knowledge that does not derive from experience.

Since at least our paradigms of knowledge (of our immediate environments, of common physical objects, of scientific kinds) seem obviously to be based on sense experience, empiricism has significant intuitive appeal. Rationalism, by contrast, seems to require further motivation: minimally, a list of knowables that represent a prima facie challenge to the empiricist’s global claim about the foundations of knowledge. Classic rationalists, such as Descartes, Leibniz, Spinoza, and perhaps more contentiously KANT, included knowledge of God, substance, and abstract ideas (such as that of a triangle, as opposed to ideas of particular triangles). Empiricists over the last three hundred years or so have either claimed that there was nothing to know in such cases, or sought to provide the corresponding empiricist account of how we could know such things from experience.

The different views of the sources of knowledge held by rationalists and empiricists have been accompanied by correspondingly different views of the mind, and it is not hard to see why. If one is an empiricist and so holds, roughly, that there is nothing in the mind that is not first in the senses, then there is a fairly literal sense in which ideas, found in the mind, are complexes that derive from impressions in the senses. This in turn suggests that the processes that constitute cognition are themselves elaborations of those that constitute perception, that is, that cognition and perception differ only in degree, not kind.
The most commonly postulated mechanisms governing these processes are association and similarity, from Hume’s laws of association to feature-extraction in contemporary connectionist networks. Thus, the mind tends to be viewed by empiricists as a domain-general device, in that the principles that govern its operation are constant across various types and levels of cognition, with the common empirical basis for all knowledge providing the basis for parsimony here.

By contrast, in denying that all knowledge derives from the senses, rationalists are faced with the question of what other sources there are for knowledge. The most natural candidate is the mind itself, and for this reason rationalism goes hand in hand with NATIVISM about both the source of human knowledge and the structure of the human mind. If some ideas are innate (and so do not need to be derived from experience), then it follows that the mind already has a relatively rich, inherent structure, one that in turn limits the malleability of the mind in light of experience. As mentioned, classic rationalists made the claim that certain ideas or CONCEPTS were innate, a claim occasionally made by contemporary nativists, most notably Jerry Fodor (1975) in his claim that all concepts are innate. However, contemporary nativism is more often expressed as the view that certain implicit knowledge that we have or principles that govern how the mind works (most notoriously, linguistic knowledge and principles) are innate, and so not learned. And because the types of knowledge that one can have may be endlessly heterogeneous, rationalists tend to view the mind as a domain-specific device, as one made up of systems whose governing principles are very different. It should thus be no surprise that the historical debate between rationalists and empiricists has been revisited in contemporary discussions of the INNATENESS OF LANGUAGE, the MODULARITY OF MIND, and CONNECTIONISM.
A second dimension to the issue of the structure of the mind concerns the place of CONSCIOUSNESS among mental phenomena. From WILLIAM JAMES’s influential analysis of the phenomenology of the stream of consciousness in his The Principles of Psychology (1890) to the renaissance that consciousness has experienced in the last ten years (if publication frenzies are anything to go by), consciousness has been thought to be the most puzzling of mental phenomena. There is now almost universal agreement that conscious mental states are a part of the mind. But how large and how important a part? Consciousness has sometimes been thought to exhaust the mental, a view often attributed to Descartes. The idea here is that everything mental is, in some sense, conscious or available to consciousness. (A version of the latter of these ideas has been recently expressed in John Searle’s [1992: 156] connection principle: “all unconscious intentional states are in principle accessible to consciousness.”)

There are two challenges to the view that everything mental is conscious or even available to consciousness. The first is posed by the unconscious. SIGMUND FREUD’s extension of our common-sense attributions of belief and desire, our folk psychology, to the realm of the unconscious played and continues to play a central role in PSYCHOANALYSIS. The second arises from the conception of cognition as information processing that has been and remains focal in contemporary cognitive science, because such information processing is mostly not available to consciousness. If cognition so conceived is mental, then most mental processing is not available to consciousness.

iii. The First- and Third-Person Perspectives

Occupying center stage with the mind-body problem in traditional philosophy of mind is the problem of other minds, a problem that, unlike the mind-body problem, has all but disappeared from philosophical contributions to the cognitive sciences.
The problem is often stated in terms of a contrast between the relatively secure way in which I “directly” know about the existence of my own mental states, and the far more epistemically risky way in which I must infer the existence of the mental states of others. Thus, although I can know about my own mental states simply by introspection and self-directed reflection, because this way of finding out about mental states is peculiarly first-person, I need some other type of evidence to draw conclusions about the mental states of others. Naturally, an agent’s behavior is a guide to what mental states he or she is in, but there seems to be an epistemic gap between this sort of evidence and the attribution of the corresponding mental states that does not exist in the case of self-ascription. Thus the problem of other minds is chiefly an epistemological problem, sometimes expressed as a form of skepticism about the justification that we have for attributing mental states to others.

There are two reasons for the waning attention to the problem of other minds qua problem that derive from recent philosophical thought sensitive to empirical work in the cognitive sciences. First, research on introspection and SELF-KNOWLEDGE has raised questions about how “direct” our knowledge of our own mental states and of the SELF is, and so called into question traditional conceptions of first-person knowledge of mentality. Second, explorations of the THEORY OF MIND, ANIMAL COMMUNICATION, and SOCIAL PLAY BEHAVIOR have begun to examine and assess the sorts of attribution of mental states that are actually justified in empirical studies, suggesting that third-person knowledge of mental states is not as limited as has been thought. Considered together, this research hints that the contrast between first- and third-person knowledge of the mental is not as stark as the problem of other minds seems to intimate.
Still, there is something distinctive about the first-person perspective, and it is in part as an acknowledgment of this, to return to an earlier point, that consciousness has become a hot topic in the cognitive sciences of the 1990s. For whatever else we say about consciousness, it seems tied ineliminably to the first-person perspective. It is a state or condition that has an irreducibly subjective component, something with an essence to be experienced, and which presupposes the existence of a subject of that experience. Whether this implies that there are QUALIA that resist complete characterization in materialist terms, or other limitations to a science of the mind, remain questions of debate.

See also ANIMAL COMMUNICATION; CONCEPTS; CONNECTIONISM, PHILOSOPHICAL ISSUES; CONSCIOUSNESS; CONSCIOUSNESS, NEUROBIOLOGY OF; DESCARTES, RENÉ; FREUD, SIGMUND; INNATENESS OF LANGUAGE; JAMES, WILLIAM; KANT, IMMANUEL; MIND-BODY PROBLEM; MODULARITY OF MIND; NATIVISM; NATIVISM, HISTORY OF; PHYSICALISM; PSYCHOANALYSIS, CONTEMPORARY VIEWS; PSYCHOANALYSIS, HISTORY OF; QUALIA; RATIONALISM VS. EMPIRICISM; SELF; SELF-KNOWLEDGE; SOCIAL PLAY BEHAVIOR; THEORY OF MIND

2 From Materialism to Mental Science

In raising issue i., the mental-physical relation, in the previous section, I implied that materialism was the dominant ontological view of the mind in contemporary philosophy of mind. I also suggested that, if anything, general convergence on this issue has intensified interest in the mind-body problem. For example, consider the large and lively debate over whether contemporary forms of materialism are compatible with genuine MENTAL CAUSATION, or, alternatively, whether they commit one to EPIPHENOMENALISM about the mental (Kim 1993; Heil and Mele 1993; Yablo 1992).
Likewise, consider the fact that despite the dominance of materialism, some philosophers maintain that there remains an EXPLANATORY GAP between mental phenomena such as consciousness and any physical story that we are likely to get about the workings of the brain (Levine 1983; cf. Chalmers 1996). Both of these issues, very much alive in contemporary philosophy of mind and cognitive science, concern the mind-body problem, even if they are not always identified in such old-fashioned terms.

I also noted that a healthy interest in the first-person perspective persists within this general materialist framework. By taking a quick look at the two major initial attempts to develop a systematic, scientific understanding of the mind—late nineteenth-century introspectionism and early twentieth-century behaviorism—I want to elaborate on these two points and bring them together.

Introspectionism was widely held to fall prey to a problem known as the problem of the homunculus. Here I argue that behaviorism, too, is subject to a variation on this very problem, and that both versions of this problem continue to nag at contemporary sciences of the mind.

Students of the history of psychology are familiar with the claim that the roots of contemporary psychology can be dated from 1879, with the founding of the first experimental laboratory devoted to psychology by WILHELM WUNDT in Leipzig, Germany. As an experimental laboratory, Wundt’s laboratory relied on the techniques introduced and refined in physiology and psychophysics over the preceding fifty years by HELMHOLTZ, Weber, and Fechner that paid particular attention to the report of SENSATIONS. What distinguished Wundt’s as a laboratory of psychology was his focus on the data reported in consciousness via the first-person perspective; psychology was to be the science of immediate experience and its most basic constituents.
Yet we should remind ourselves of how restricted this conception of psychology was, particularly relative to contemporary views of the subject.

First, Wundt distinguished between mere INTROSPECTION, first-person reports of the sort that could arise in the everyday course of events, and experimentally manipulable self-observation of the sort that could only be triggered in an experimental context. Although Wundt is often thought of as the founder of an introspectionist methodology that led to a promiscuous psychological ontology, in disallowing mere introspection as an appropriate method for a science of the mind he shared at least the sort of restrictive conception of psychology with both his physiological predecessors and his later behaviorist critics.

Second, Wundt thought that the vast majority of ordinary thought and cognition was not amenable to acceptable first-person analysis, and so lay beyond the reach of a scientific psychology. Wundt thought, for example, that belief, language, personality, and SOCIAL COGNITION could be studied systematically only by detailing the cultural mores, art, and religion of whole societies (hence his four-volume Völkerpsychologie of 1900–1909). These studies belonged to the humanities (Geisteswissenschaften) rather than the experimental sciences (Naturwissenschaften), and were undertaken by anthropologists inspired by Wundt, such as BRONISLAW MALINOWSKI.

Wundt himself took one of his early contributions to be a solution of the mind-body problem, for that is what the data derived from the application of the experimental method to distinctly psychological phenomena gave one: correlations between the mental and the physical that indicated how the two were systematically related. The discovery of psychophysical laws of this sort showed how the mental was related to the physical.
Yet with the expansion of the domain of the mental amenable to experimental investigation over the last 150 years, the mind-body problem has taken on a more acute form: just how do we get all that mind-dust from merely material mechanics? And it is here that the problem of the homunculus arises for introspectionist psychology after Wundt.

The problem, put in modern guise, is this. Suppose that one introspects, say, in order to determine the location of a certain feature (a cabin, for example) on a map that one has attempted to memorize (Kosslyn 1980). Such introspection is typically reported in terms of exploring a mental image with one’s mind’s eye. Yet we hardly want our psychological story to end there, because it posits a process (introspection) and a processor (the mind’s eye) that themselves cry out for further explanation. The problem of the homunculus is the problem of leaving undischarged homunculi (“little men” or their equivalents) in one’s explanantia, and it persists as we consider an elaboration on our initial introspective report. For example, one might well report forming a mental image of the map, and then scanning around the various features of the map, zooming in on them to discern more clearly what they are to see if any of them is the sought-after cabin. To take this introspective report seriously as a guide to the underlying psychological mechanisms would be to posit, minimally, an imager (to form the initial image), a scanner (to guide your mind’s eye around the image), and a zoomer (to adjust the relative sizes of the features on the map). But here again we face the problem of the homunculus, because such “mechanisms” themselves require further psychological decomposition.

To be faced with the problem of the homunculus, of course, is not the same as to succumb to it. We might distinguish two understandings of just what the “problem” is here.
First, the problem of the homunculus could be viewed as a problem specifically for introspectionist views of psychology, a problem that was never successfully met and that was principally responsible for the abandonment of introspectionism. As such, the problem motivated BEHAVIORISM in psychology. Second, the problem of the homunculus might simply be thought of as a challenge that any view that posits internal mental states must respond to: to show how to discharge all of the homunculi introduced in a way that is acceptably materialistic. So construed, the problem remains one that has been with us more recently, in disputes over the psychological reality of various forms of GENERATIVE GRAMMAR (e.g., Stabler 1983); in the nativism that has been extremely influential in post-Piagetian accounts of COGNITIVE DEVELOPMENT (Spelke 1990; cf. Elman et al. 1996); and in debates over the significance of MENTAL ROTATION and the nature of IMAGERY (Kosslyn 1994; cf. Pylyshyn 1984: ch. 8).

With Wundt’s own restrictive conception of psychology and the problem of the homunculus in mind, it is with some irony that we can view the rise and fall of behaviorism as the dominant paradigm for psychology subsequent to the introspectionism that Wundt founded. For here was a view so deeply indebted to materialism and the imperative to explore psychological claims only by reference to what was acceptably experimental that, in effect, in its purest form it appeared to do away with the distinctively mental altogether! That is, because objectively observable behavioral responses to objectively measurable stimuli are all that could be rigorously explored, experimental psychological investigations would need to be significantly curtailed, relative to those of introspectionists such as Wundt and Titchener. As J. B.
Watson said in his early, influential “Psychology as the Behaviorist Views It” in 1913, “Psychology as behavior will, after all, have to neglect but few of the really essential problems with which psychology as an introspective science now concerns itself. In all probability even this residue of problems may be phrased in such a way that refined methods in behavior (which certainly must come) will lead to their solution” (p. 177).

Behaviorism brought with it not simply a global conception of psychology but specific methodologies, such as CONDITIONING, and a focus on phenomena, such as that of LEARNING, that have been explored in depth since the rise of behaviorism. Rather than concentrate on these sorts of contribution to the interdisciplinary sciences of the mind that behaviorists have made, I want to focus on the central problem that faced behaviorism as a research program for reshaping psychology.

One of the common points shared by behaviorists in their philosophical and psychological guises was a commitment to an operational view of psychological concepts and thus a suspicion of any reliance on concepts that could not be operationally characterized. Construed as a view of scientific definition (as it was by philosophers), operationalism is the view that scientific terms must be defined in terms of observable and measurable operations that one can perform. Thus, an operational definition of “length,” as applied to ordinary objects, might be: “the measure we obtain by laying a standard measuring rod or rods along the body of the object.” Construed as a view of scientific methodology (as it was by psychologists), operationalism claims that the subject matter of the sciences should be objectively observable and measurable, by itself a view without much content.
The real bite of the insistence on operational definitions and methodology for psychology came via the application of operationalism to unobservables, for the various feelings, sensations, and other internal states reported by introspection, themselves unobservable, proved difficult to operationalize adequately. Notoriously, the introspective reports from various psychological laboratories produced different listings of the basic feelings and sensations that made up consciousness, and the lack of agreement here generated skepticism about the reliability of introspection as a method for revealing the structure of the mind. In psychology, this led to a focus on behavior, rather than consciousness, and to its exploration through observable stimulus and response: hence, behaviorism. But I want to suggest that this reliance on operationalism itself created a version of the problem of the homunculus for behaviorism. This point can be made in two ways, each of which offers a reinterpretation of a standard criticism of behaviorism. The first of these criticisms targets what is usually called “philosophical behaviorism,” the attempt to provide conceptual analyses of mental state terms exclusively in terms of behavior; the second targets “psychological behaviorism,” the research program of studying objective and observable behavior, rather than subjective and unobservable inner mental episodes.

First, as Geach (1957: chap. 4) pointed out with respect to belief, behaviorist analyses of individual folk psychological states are bound to fail, because it is only in concert with many other propositional attitudes that any given such attitude has behavioral effects.
Thus, to take a simple example, we might characterize the belief that it is raining as the tendency to utter “yes” when asked, “Do you believe that it is raining?” But one reason this would be inadequate is that one will engage in this verbal behavior only if one wants to answer truthfully, and only if one hears and understands the question asked, where each of the italicized terms above refers to some other mental state. Because the problem recurs in every putative analysis, this implies that a behavioristically acceptable construal of folk psychology is not possible. This point would seem to generalize beyond folk psychology to representational psychology more generally.

So, in explicitly attempting to do without internal mental representations, behaviorists themselves are left with mental states that must simply be assumed. Here we are not far from those undischarged homunculi that were the bane of introspectionists, especially once we recognize that the metaphorical talk of “homunculi” refers precisely to internal mental states and processes that themselves are not further explained.

Second, as Chomsky (1959: esp. p. 54) emphasized in his review of Skinner’s Verbal Behavior, systematic attempts to operationalize psychological language invariably smuggle in a reference to the very mental processes they are trying to do without. At the most general level, the behavior of interest to the linguist, Skinner’s “verbal behavior,” is difficult to characterize adequately without at least an implicit reference to the sorts of psychological mechanism that generate it. For example, linguists are not interested in mere noises that have the same physical properties—“harbor” may be pronounced so that its first syllable has the same acoustic properties as an exasperated grunt—but in parts of speech that are taxonomized at least partially in terms of the surrounding mental economy of the speaker or listener.
The same seems true for all of the processes introduced by behaviorists—for example, stimulus control, reinforcement, conditioning—insofar as they are used to characterize complex, human behavior that has a natural psychological description (making a decision, reasoning, conducting a conversation, issuing a threat). What marks off their instances as behaviors of the same kind is not exclusively their physical or behavioral similarity, but, in part, the common, internal psychological processes that generate them, and that they in turn generate. Hence, the irony: behaviorists, themselves motivated by the idea of reforming psychology so as to generalize about objective, observable behavior and so avoid the problem of the homunculus, are faced with undischarged homunculi, that is, irreducibly mental processes, in their very own alternative to introspectionism.

The two versions of the problem of the homunculus are still with us as a Scylla and Charybdis for contemporary cognitive scientists to steer between. On the one hand, theorists need to avoid building the very cognitive abilities that they wish to explain into the models and theories they construct. On the other, in attempting to side-step this problem they also run the risk of masking the ways in which their “objective” taxonomic categories presuppose further internal psychological description of precisely the sort that gives rise to the problem of the homunculus in the first place.
See also BEHAVIORISM; COGNITIVE DEVELOPMENT; CONDITIONING; EPIPHENOMENALISM; EXPLANATORY GAP; GENERATIVE GRAMMAR; HELMHOLTZ, HERMANN; IMAGERY; INTROSPECTION; LEARNING; MALINOWSKI, BRONISLAW; MENTAL CAUSATION; MENTAL ROTATION; SENSATIONS; SOCIAL COGNITION; SOCIAL COGNITION IN ANIMALS; WUNDT, WILHELM

3 A Detour Before the Naturalistic Turn

Given the state of philosophy and psychology in the early 1950s, it is surprising that within twenty-five years there would be a thriving and well-focused interdisciplinary unit of study, cognitive science, to which the two are central. As we have seen, psychology was dominated by behaviorist approaches that were largely skeptical of positing internal mental states as part of a serious, scientific psychology. And Anglo-American philosophy featured two distinct trends, each of which made philosophy more insular with respect to other disciplines, and each of which served to reinforce the behaviorist orientation of psychology.

First, ordinary language philosophy, particularly in Great Britain under the influence of Ludwig Wittgenstein and J. L. Austin, demarcated distinctly philosophical problems as soluble (or dissoluble) chiefly by reference to what one would ordinarily say, and tended to see philosophical views of the past and present as the result of confusions in how philosophers and others come to use words that generally have a clear sense in their ordinary contexts. This approach to philosophical issues in the post-war period has recently been referred to by Marjorie Grene (1995: 55) as the “Bertie Wooster season in philosophy,” a characterization I suspect would seem apt to many philosophers of mind interested in contemporary cognitive science (and in P. G. Wodehouse). Let me illustrate how this approach to philosophy served to isolate the philosophy of mind from the sciences of the mind with perhaps the two most influential examples pertaining to the mind in the ordinary language tradition.
In The Concept of Mind, Gilbert Ryle (1949: 17) attacked a view of the mind that he referred to as “Descartes’ Myth” and “the dogma of the Ghost in the Machine”—basically, dualism—largely through a repeated application of the objection that dualism consisted of an extended category mistake: it “represents the facts of mental life as if they belonged to one logical type or category . . . when they actually belong to another.” Descartes’ Myth represented a category mistake because in supposing that there was a special, inner theater on which mental life is played out, it treated the “facts of mental life” as belonging to a special category of facts, when they were simply facts about how people can, do, and would behave in certain circumstances. Ryle set about showing that for the range of mental concepts that were held to refer to private, internal mental episodes or events according to Descartes’ Myth—intelligence, the will, emotion, self-knowledge, sensation, and imagination—an appeal to what one would ordinarily say both shows the dogma of the Ghost in the Machine to be false, and points to a positive account of the mind that was behaviorist in orientation. To convey why Ryle’s influential views here turned philosophy of mind away from science rather than towards it, consider the opening sentences of The Concept of Mind: “This book offers what may with reservations be described as a theory of the mind. But it does not give new information about minds. We possess already a wealth of information about minds, information which is neither derived from, nor upset by, the arguments of philosophers. The philosophical arguments which constitute this book are intended not to increase what we know about minds, but to rectify the logical geography of the knowledge which we already possess” (Ryle 1949: 9).
The “we” here refers to ordinary folk, and the philosopher’s task in articulating a theory of mind is to draw on what we already know about the mind, rather than on arcane, philosophical views or on specialized, scientific knowledge.

The second example is Norman Malcolm’s Dreaming, which, like The Concept of Mind, framed the critique it wished to deliver as an attack on a Cartesian view of the mind. Malcolm’s (1959: 4) target was the view that “dreams are the activity of the mind during sleep,” and associated talk of DREAMING as involving various mental acts, such as remembering, imagining, judging, thinking, and reasoning. Malcolm argued that such dream-talk, whether it be part of commonsense reflection on dreaming (How long do dreams last?; Can you work out problems in your dreams?) or a contribution to more systematic empirical research on dreaming, was a confusion arising from the failure to attend to the proper “logic” of our ordinary talk about dreaming. Malcolm’s argument proceeded by appealing to how one would use various expressions and sentences that contained the word “dreaming.” (In looking back at Malcolm’s book, it is striking that nearly every one of the eighteen short chapters begins with a paragraph about words and what one would say with or about them.)

Malcolm’s central point was that there was no way to verify any given claim about such mental activity occurring while one was asleep, because the commonsense criteria for the application of such concepts were incompatible with saying that a person was asleep or dreaming. And because there was no way to tell whether various attributions of mental states to a sleeping person were correct, such attributions were meaningless. These claims not only could be made without an appeal to any empirical details about dreaming or SLEEP, but implied that the whole enterprise of investigating dreaming empirically itself represented some sort of logical muddle.
Malcolm’s point became more general than one simply about dreaming (or the word “dreaming”). As he said in a preface to a later work, written after “the notion that thoughts, ideas, memories, sensations, and so on ‘code into’ or ‘map onto’ neural firing patterns in the brain” had become commonplace: “I believe that a study of our psychological concepts can show that [such] psycho-physical isomorphism is not a coherent assumption” (Malcolm 1971: x). Like Ryle’s straightening of the logical geography of our knowledge of minds, Malcolm’s appeal to the study of our psychological concepts could be conducted without any knowledge gleaned from psychological science (cf. Griffiths 1997: chap. 2 on the emotions).

Quite distinct from the ordinary language tradition was a second general perspective that served to make philosophical contributions to the study of the mind “distinctive” from those of science. This was logical positivism or empiricism, which developed in Europe in the 1920s and flourished in the United States through the 1930s and 1940s with the immigration to the United States of many of its leading members, including Rudolf Carnap, Hans Reichenbach, Herbert Feigl, and Carl Hempel. The logical empiricists were called “empiricists” because they held that it was via the senses and observation that we came to know about the world, deploying this empiricism with the logical techniques that had been developed by Gottlob Frege, Bertrand Russell, and Alfred Whitehead. Like empiricists in general, the logical positivists viewed the sciences as the paradigmatic repository of knowledge, and they were largely responsible for the rise of philosophy of science as a distinct subdiscipline within philosophy.
As part of their reflection on science they articulated and defended the doctrine of the UNITY OF SCIENCE, the idea that the sciences are, in some sense, essentially unified, and their empiricism led them to appeal to PARSIMONY AND SIMPLICITY as grounds for both theory choice within science and for preferring theories that were ontological Scrooges. This empiricism came with a focus on what could be verified, and with it skepticism about traditional metaphysical notions, such as God, CAUSATION, and essences, whose instances could not be verified by an appeal to the data of sense experience. This emphasis on verification was encapsulated in the verification theory of meaning, which held that the meaning of a sentence was its method of verification, implying that sentences without any such method were meaningless. In psychology, this fueled skepticism about the existence of internal mental representations and states (whose existence could not be objectively verified), and offered further philosophical backing for behaviorism.

In contrast to the ordinary language philosophers (many of whom would have been professionally embarrassed to have been caught knowing anything about science), the positivists held that philosophy was to be informed about and sensitive to the results of science. The distinctive task of the philosopher, however, was not simply to describe scientific practice, but to offer a rational reconstruction of it, one that made clear the logical structure of science. Although the term “rational reconstruction” was used first by Carnap in his 1928 book The Logical Construction of the World, quite a general epistemological tract, the technique to which it referred came to be applied especially to scientific concepts and theories.
This played out in the frequent appeal to the distinction between the context of discovery and the context of justification, drawn as such by Reichenbach in Experience and Prediction (1938) but with a longer history in the German tradition. To consider an aspect of a scientific view in the context of discovery was essentially to raise psychological, sociological, or historical questions about how that view originated, was developed, or came to be accepted or rejected. But properly philosophical explorations of science were to be conducted in the context of justification, raising questions and making claims about the logical structure of science and the concepts it used. Rational reconstruction was the chief way of divorcing the relevant scientific theory from its mere context of discovery.

A story involving Feigl and Carnap nicely illustrates the divorce between philosophy and science within positivism. In the late 1950s, Feigl visited the University of California, Los Angeles, to give a talk to the Department of Philosophy, of which Carnap was a member. Feigl’s talk was aimed at showing that a form of physicalism, the mind-brain identity theory, faced an empirical problem, since science had little, if anything, to say about the “raw feel” of consciousness, the WHAT-IT’S-LIKE of experience. During the question period, Carnap raised his hand, and was called on by Feigl. “Your claim that current neurophysiology tells us nothing about raw feels is wrong! You have overlooked the discovery of alpha-waves in the brain,” exclaimed Carnap. Feigl, who was familiar with what he thought was the relevant science, looked puzzled: “Alpha-waves? What are they?” Carnap replied: “My dear Herbert.
You tell me what raw feels are, and I will tell you what alpha-waves are.”

Of the multiple readings that this story invites (whose common denominator is surely Carnap’s savviness and wit), consider those that take Carnap’s riposte to imply that he thought that one could defend materialism by, effectively, making up the science to fit whatever phenomena critics could rustle up. A rather extreme form of rational reconstruction, but it suggests one way in which the positivist approach to psychology could be just as a priori and so divorced from empirical practice as that of Ryle and Malcolm.

See also CAUSATION; DREAMING; PARSIMONY AND SIMPLICITY; SLEEP; UNITY OF SCIENCE; WHAT-IT’S-LIKE

4 The Philosophy of Science

The philosophy of science is integral to the cognitive sciences in a number of ways. We have already seen that positivists held views about the overall structure of science and the grounds for theory choice in science that had implications for psychology. Here I focus on three functions that the philosophy of science plays vis-à-vis the cognitive sciences: it provides a perspective on the place of psychology among the sciences; it raises questions about what any science can tell us about the world; and it explores the nature of knowledge and how it is known. I take these in turn.

One classic way in which the sciences were viewed as being unified, according to the positivists, was via reduction. REDUCTIONISM, in this context, is the view that intuitively “higher-level” sciences can be reduced, in some sense, to “lower-level” sciences. Thus, to begin with the case perhaps of most interest to MITECS readers, psychology was held to be reducible in principle to biology, biology to chemistry, chemistry to physics.
This sort of reduction presupposed the existence of bridge laws, laws that exhaustively characterized the concepts of any higher-level science, and the generalizations stated using them, in terms of those concepts and generalizations at the next level down. And because reduction was construed as relating theories of one science to those of another, the advocacy of reductionism went hand-in-hand with a view of EXPLANATION that gave lower-level sciences at least a usurpatory power over their higher-level derivatives.

This view of the structure of science was opposed to EMERGENTISM, the view that the properties studied by higher-level sciences, such as psychology, were not mere aggregates of properties studied by lower-level sciences, and thus could not be completely understood in terms of them. Both emergentism and this form of reductionism were typically cast in terms of the relationship between laws in higher- and lower-level sciences, thus presupposing that there were, in the psychological case, PSYCHOLOGICAL LAWS in the first place. One well-known position that denies this assumption is Donald Davidson’s ANOMALOUS MONISM, which claims that while mental states are strictly identical with physical states, our descriptions of them as mental states are neither definitionally nor nomologically reducible to descriptions of them as physical states. This view is usually expressed as denying the possibility of the bridge laws required for the reduction of psychology to biology.

Corresponding to the emphasis on scientific laws in views of the relations between the sciences is the idea that these laws state relations between NATURAL KINDS. The idea of a natural kind is that of a type or kind of thing that exists in the world itself, rather than a kind or grouping that exists because of our ways of perceiving, thinking about, or interacting with the world.
Paradigms of natural kinds are biological kinds, namely species such as the domestic cat (Felis domesticus), and chemical kinds, such as silver (Ag) and gold (Au). Natural kinds can be contrasted with artifactual kinds (such as chairs), whose members are artifacts that share common functions or purposes relative to human needs or designs; with conventional kinds (such as marriage vows), whose members share some sort of conventionally determined property; and with purely arbitrary groupings of objects, whose members have nothing significant in common save that they belong to the category. Views of what natural kinds are, of how extensively science traffics in them, and of how we should characterize the notion of a natural kind vis-à-vis other metaphysical notions, such as essence, intrinsic property, and causal power, all remain topics of debate in contemporary philosophy of science (e.g., van Fraassen 1989; Wilson 1999).

There is an intuitive connection between the claims that there are natural kinds, and that the sciences strive to identify them, and scientific realism, the view that the entities in mature sciences, whether they are observable or not, exist and our theories about them are at least approximately true. For realists hold that the sciences strive to “carve nature at its joints,” and natural kinds are the pre-existing joints that one’s scientific carving tries to find. The REALISM AND ANTIREALISM issue is, of course, more complicated than suggested by the view that scientific realists think there are natural kinds, and antirealists deny this, not least because there are a number of ways either to deny this realist claim or to diminish its significance. But such a perspective provides one starting point for thinking about the different views one might have of the relationship between science and reality.
Apart from raising issues concerning the relationships between psychology and other sciences and their respective objects of study, and questions about the relation between science and reality, the philosophy of science is also relevant to the cognitive sciences as a branch of epistemology or the theory of knowledge, studying a particular type of knowledge, scientific knowledge. A central notion in the general theory of knowledge is JUSTIFICATION, because being justified in what we believe is at least one thing that distinguishes knowledge from mere belief or a lucky guess. Since scientific knowledge is a paradigm of knowledge, views of justification have often been developed with scientific knowledge in mind.

The question of what it is for an individual to have a justified belief, however, has remained contentious in the theory of knowledge. Justified beliefs are those that we are entitled to hold, ones for which we have reasons, but how should we understand such entitlement and such reasons? One dichotomy here is between internalists about justification, who hold that having justified belief exclusively concerns facts that are “internal” to the believer, facts about his or her internal cognitive economy; and externalists about justification, who deny this. A second dichotomy is between naturalists, who hold that what cognitive states are justified may depend on facts about cognizers or about the world beyond cognizers that are uncovered by empirical science; and rationalists, who hold that justification is determined by the relations between one’s cognitive states that the agent herself is in a special position to know about. Clearly part of what is at issue between internalists and externalists, as well as between naturalists and rationalists, is the role of the first-person perspective in accounts of justification and thus knowledge (see also Goldman 1997).
These positions about justification raise some general questions about the relationship between EPISTEMOLOGY AND COGNITION, and interact with views of the importance of first- and third-person perspectives on cognition itself. They also suggest different views of RATIONAL AGENCY, of what it is to be an agent who acts on the basis of justified beliefs. Many traditional views of rationality imply that cognizers have LOGICAL OMNISCIENCE, that is, that they believe all the logical consequences of their beliefs. Since clearly we are not logically omniscient, there is a question of how to modify one’s account of rationality to avoid this result.

See also ANOMALOUS MONISM; EMERGENTISM; EPISTEMOLOGY AND COGNITION; EXPLANATION; JUSTIFICATION; LOGICAL OMNISCIENCE, PROBLEM OF; NATURAL KINDS; PSYCHOLOGICAL LAWS; RATIONAL AGENCY; REALISM AND ANTIREALISM; REDUCTIONISM

5 The Mind in Cognitive Science

At the outset, I said that the relation between the mental and physical remains the central, general issue in contemporary, materialist philosophy of mind. In section 2, we saw that the behaviorist critiques of Cartesian views of the mind and behaviorism themselves introduced a dilemma that derived from the problem of the homunculus that any mental science would seem to face. And in section 3 I suggested how a vibrant skepticism about the scientific status of a distinctively psychological science and philosophy’s contribution to it was sustained by two dominant philosophical perspectives. It is time to bring these three points together as we move to explore the view of the mind that constituted the core of the developing field of cognitive science in the 1970s, what is sometimes called classic cognitive science, as well as its successors.

If we were to pose questions central to each of these three issues (the mental-physical relation, the problem of the homunculus, and the possibility of a genuinely cognitive science), they might be:

a. What is the relation between the mental and the physical?
b. How can psychology avoid the problem of the homunculus?
c. What makes a genuinely mental science possible?

Strikingly, these questions received standard answers, in the form of three “isms,” from the nascent naturalistic perspective in the philosophy of mind that accompanied the rise of classic cognitive science. (The answers, so you don’t have to peek ahead, are, respectively, functionalism, computationalism, and representationalism.)

The answer to (a) is FUNCTIONALISM, the view, baldly put, that mental states are functional states. Functionalists hold that what really matters to the identity of types of mental states is not what their instances are made of, but how those instances are causally arranged: what causes them, and what they, in turn, cause. Functionalism represents a view of the mental-physical relation that is compatible with materialism or physicalism because even if it is the functional or causal role that makes a mental state the state it is, every occupant of any particular role could be physical. The role-occupant distinction, introduced explicitly by Armstrong (1968) and implicitly in Lewis (1966), has been central to most formulations of functionalism.

A classic example of something that is functionally identified or individuated is money: it’s not what it’s made of (paper, gold, plastic) that makes something money but, rather, the causal role that it plays in some broader economic system. Recognizing this fact about money is not to give up on the idea that money is material or physical. Even though material composition is not what determines whether something is money, every instance of money is material or physical: dollar bills and checks are made of paper and ink, coins are made of metal, even money that is stored solely as a string of digits in your bank account has some physical composition.
There are at least two related reasons why functionalism about the mind has been an attractive view to philosophers working in the cognitive sciences.

The first is that functionalism at least appears to support the AUTONOMY OF PSYCHOLOGY, for it claims that even if, as a matter of fact, our psychological states are realized in states of our brains, their status as psychological states lies in their functional organization, which can be abstracted from this particular material stuff. This is a nonreductive view of psychology. If functionalism is true, then there will be distinctively psychological natural kinds that cross-cut the kinds that are determined by a creature’s material composition. In the context of materialism, functionalism suggests that creatures with very different material organizations could not only have mental states, but have the same kinds of mental states. Thus functionalism makes sense of comparative psychological or neurological investigations across species.

The second is that functionalism allows for nonbiological forms of intelligence and mentality. That is, because it is the “form” not the “matter” that determines psychological kinds, there could be entirely artifactual creatures, such as robots or computers, with mental states, provided that they have the right functional organization. This idea has been central to traditional artificial intelligence (AI), where one ideal has been to create programs with a functional organization that not only allows them to behave in some crude way like intelligent agents but to do so in a way that instantiates at least some aspects of intelligence itself.

Both of these ideas have been criticized as part of attacks on functionalism.
For example, Paul and Patricia Churchland (1981) have argued that the “autonomy” of psychology that one gains from functionalism can be a cover for the emptiness of the science itself, and Jaegwon Kim (1993) has argued against the coherence of the nonreductive forms of materialism usually taken to be implied by functionalism. Additionally, functionalism and AI are the targets of John Searle’s much-discussed CHINESE ROOM ARGUMENT.

Consider (c), the question of what makes a distinctively mental science possible. Although functionalism gives one sort of answer to this in its basis for a defense of the autonomy (and so distinctness) of psychology, because there are more functional kinds than those in psychology (assuming functionalism), this answer does not explain what is distinctively psychological about psychology. A better answer to this question is representationalism, also known as the representational theory of mind. This is the view that mental states are relations between the bearers of those states and internal mental representations. Representationalism answers (c) by viewing psychology as the science concerned with the forms these mental representations can take, the ways in which they can be manipulated, and how they interact with one another in mediating between perceptual input and behavioral output.

A traditional version of representationalism, one cast in terms of Ideas, themselves often conceptualized as images, was held by the British empiricists John Locke, George Berkeley, and DAVID HUME. A form of representationalism, the LANGUAGE OF THOUGHT (LOT) hypothesis, has more recently been articulated and defended by Jerry Fodor (1975, 1981, 1987, 1994). The LOT hypothesis is the claim that we are able to cognize in virtue of having a mental language, mentalese, whose symbols are combined systematically by syntactic rules to form more complex units, such as thoughts.
Because these mental symbols are intentional or representational (they are about things), the states that they compose are representational; mental states inherit their intentionality from their constituent mental representations.

Fodor himself has been particularly exercised to use the language of thought hypothesis to chalk out a place for the PROPOSITIONAL ATTITUDES and our folk psychology within the developing sciences of the mind. Not all proponents of the representational theory of mind, however, agree with Fodor’s view that the system of representation underlying thought is a language, nor with his defense of folk psychology. But even forms of representationalism that are less committal than Fodor’s own provide an answer to the question of what is distinctive about psychology: psychology is not mere neuroscience because it traffics in a range of mental representations and posits internal processes that operate on these representations.

Representationalism, particularly in Fodoresque versions that see the language of thought hypothesis as forming the foundations for a defense of both cognitive psychology and our commonsense folk psychology, has been challenged within cognitive science by the rise of connectionism in psychology and NEURAL NETWORKS within computer science. Connectionist models of psychological processing might be taken as an existence proof that one does not need to assume what is sometimes called the RULES AND REPRESENTATIONS approach to understand cognitive functions: the language of thought hypothesis is no longer “the only game in town.”

Connectionist COGNITIVE MODELING of psychological processing, such as that of the formation of past tense (Rumelhart and McClelland 1986), face recognition (Cottrell and Metcalfe 1991), and VISUAL WORD RECOGNITION (Seidenberg and McClelland 1989), typically does not posit discrete, decomposable representations that are concatenated through the rules of some language of thought.
Rather, connectionists posit a COGNITIVE ARCHITECTURE made up of simple neuron-like nodes, with activity being propagated across the units proportional to the weights of the connection strength between them. Knowledge lies not in the nodes themselves but in the values of the weights connecting nodes. There seems to be nothing of a propositional form within such connectionist networks, no place for the internal sentences that are the objects of folk psychological states and other subpersonal psychological states posited in accounts of (for example) memory and reasoning.

The tempting idea that “classicists” accept, and connectionists reject, representationalism is too simple, one whose implausibility is revealed once one shifts one’s focus from folk psychology and the propositional attitudes to cognition more generally. Even when research in classical cognitive science (for example, that on KNOWLEDGE-BASED SYSTEMS and on BAYESIAN NETWORKS) is cast in terms of “beliefs” that a system has, the connection between “beliefs” and the beliefs of folk psychology has been underexplored. More importantly, the notion of representation itself has not been abandoned across-the-board by connectionists, some of whom have sought to salvage and adapt the notion of mental representation, as suggested by the continuing debate over DISTRIBUTED VS. LOCAL REPRESENTATION and the exploration of sub-symbolic forms of representation within connectionism (see Boden 1990; Haugeland 1997; Smolensky 1994).

What perhaps better distinguishes classic and connectionist cognitive science here is not the issue of whether some form of representationalism is true, but whether the question to which it is an answer needs answering at all. In classical cognitive science, what makes the idea of a genuinely mental science possible is the idea that psychology describes representation crunching.
But in starting with the idea that neural representation occurs from single neurons up through circuits to modules and more nebulous, distributed neural systems, connectionists are less likely to think that psychology offers a distinctive level of explanation that deserves some identifying characterization. This rejection of question (c) is clearest, I think, in related DYNAMIC APPROACHES TO COGNITION, since such approaches investigate psychological states as dynamic systems that need not posit distinctly mental representations. (As with connectionist theorizing about cognition, dynamic approaches encompass a variety of views of mental representation and its place in the study of the mind that make representationalism itself a live issue within such approaches; see Haugeland 1991; van Gelder 1998.)

Finally, consider (b), the question of how to avoid the problem of the homunculus in the sciences of the mind. In classic cognitive science, the answer to (b) is computationalism, the view that mental states are computational, an answer which integrates and strengthens functionalist materialism and representationalism as answers to our previous two questions. It does so in the way in which it provides a more precise characterization of the nature of the functional or causal relations that exist between mental states: these are computational relations between mental representations. The traditional way to spell this out is the COMPUTATIONAL THEORY OF MIND, according to which the mind is a digital computer, a device that stores symbolic representations and performs operations on them in accord with syntactic rules, rules that attend only to the “form” of these symbols.
This view of computationalism has been challenged not only by relatively technical objections (such as that based on the FRAME PROBLEM), but also by the development of neural networks and models of SITUATED COGNITION AND LEARNING, where (at least some) informational load is shifted from internal codes to organism-environment interactions (cf. Ballard et al. 1997).

The computational theory of mind avoids the problem of the homunculus because digital computers that exhibit some intelligence exist, and they do not contain undischarged homunculi. Thus, if we are fancy versions of such computers, then we can understand our intelligent capacities without positing undischarged homunculi. The way this works in computers is by having a series of programs and languages, each compiled by the one beneath it, with the most basic language directly implemented in the hardware of the machine. We avoid an endless series of homunculi because the capacities that are posited at any given level are typically simpler and more numerous than those posited at any higher level, with the lowest levels specifying instructions to perform actions that require no intelligence at all. This strategy of FUNCTIONAL DECOMPOSITION solves the problem of the homunculus if we are digital computers, assuming that it solves it for digital computers.

Like representationalism, computationalism has sometimes been thought to have been superseded by either (or both) the connectionist revolution of the 1980s, or the Decade of the Brain (the 1990s). But as with proclamations of the death of representationalism, this notice of the death of computationalism is premature. In part this is because the object of criticism is a specific version of computationalism, not computationalism per se (cf. representationalism), and in part it is because neural networks and the neural systems in the head they model are both themselves typically claimed to be computational in some sense.
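The strategy of functional decomposition described above can be sketched in a few lines of Python. This is only a toy under stated assumptions: the three function names are invented for illustration, inputs are assumed to be non-negative integers, and `increment` merely stands in for a primitive hardware operation.

```python
def increment(n):
    # Lowest level: stands in for a primitive machine operation that
    # requires no intelligence at all.
    return n + 1


def add(a, b):
    # Middle level: addition is nothing more than repeated incrementing.
    total = a
    for _ in range(b):
        total = increment(total)
    return total


def multiply(a, b):
    # Top level: multiplication is nothing more than repeated addition.
    total = 0
    for _ in range(b):
        total = add(total, a)
    return total
```

Each capacity is discharged by simpler capacities invoked more often: `multiply(6, 7)` calls `add` seven times and `increment` forty-two times, and nothing at the bottom level needs to understand multiplication.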
It is surprisingly difficult to find an answer within the cognitive science community to the question of whether there is a univocal notion of COMPUTATION that underlies the various different computational approaches to cognition on offer. The various types of AUTOMATA postulated in the 1930s and 1940s (particularly TURING machines and the “neurons” of MCCULLOCH and PITTS, which form the intellectual foundations, respectively, for the computational theory of mind and contemporary neural network theory) have an interwoven history, and many of the initial putative differences between classical and connectionist cognitive science have faded into the background as research in artificial intelligence and cognitive modeling has increasingly melded the insights of each approach into more sophisticated hybrid models of cognition (cf. Ballard 1997).

While dynamicists (e.g., Port and van Gelder 1995) have sometimes been touted as providing a noncomputational alternative to both classic and connectionist cognitive science (e.g., Thelen 1995: 70), as with claims about the nonrepresentational stance of such approaches, such a characterization is not well founded (see Clark 1997, 1998). More generally, the relationship between dynamical approaches and both classical and connectionist views remains a topic for further discussion (cf. van Gelder and Port 1995; Horgan and Tienson 1996; and Giunti 1997).

See also AUTOMATA; AUTONOMY OF PSYCHOLOGY; BAYESIAN NETWORKS; CHINESE ROOM ARGUMENT; COGNITIVE ARCHITECTURE; COGNITIVE MODELING, CONNECTIONIST; COGNITIVE MODELING, SYMBOLIC; COMPUTATION; COMPUTATIONAL THEORY OF MIND; DISTRIBUTED VS.
LOCAL REPRESENTATION; DYNAMIC APPROACHES TO COGNITION; FRAME PROBLEM; FUNCTIONAL DECOMPOSITION; FUNCTIONALISM; HUME, DAVID; KNOWLEDGE-BASED SYSTEMS; LANGUAGE OF THOUGHT; MCCULLOCH, WARREN S.; NEURAL NETWORKS; PITTS, WALTER; PROPOSITIONAL ATTITUDES; RULES AND REPRESENTATIONS; SITUATED COGNITION AND LEARNING; TURING, ALAN; VISUAL WORD RECOGNITION

6 A Focus on Folk Psychology

Much recent philosophical thinking about the mind and cognitive science remains preoccupied with the three traditional philosophical issues I identified in the first section: the mental-physical relation, the structure of the mind, and the first-person perspective. All three issues arise in one of the most absorbing discussions over the last twenty years, that over the nature, status, and future of what has been variously called commonsense psychology, the propositional attitudes, or FOLK PSYCHOLOGY.

The term folk psychology was coined by Daniel Dennett (1981) to refer to the systematic knowledge that we “folk” employ in explaining one another’s thoughts, feelings, and behavior; the idea goes back to Sellars’s Myth of Jones in “Empiricism and the Philosophy of Mind” (1956). We all naturally and without explicit instruction engage in psychological explanation by attributing beliefs, desires, hopes, thoughts, memories, and emotions to one another. These patterns of folk psychological explanation are “folk” as opposed to “scientific” since they require no special training and are manifest in everyday predictive and explanatory practice; and genuinely “psychological” because they posit the existence of various states or properties that seem to be paradigmatically mental in nature. To engage in folk psychological explanation is, in Dennett’s (1987) terms, to adopt the INTENTIONAL STANCE.

Perhaps the central issue about folk psychology concerns its relationship to the developing cognitive sciences.
ELIMINATIVE MATERIALISM, or eliminativism, is the view that folk psychology will find no place in any of the sciences that could be called “cognitive” in orientation; rather, the fortune of folk psychology will be like that of many other folk views of the world that have found themselves permanently out of step with scientific approaches to the phenomena they purport to explain, such as folk views of medicine, disease, and witchcraft.

Eliminativism is sometimes motivated by adherence to reductionism (including the thesis of EXTENSIONALITY) and the ideal of the unity of science, together with the recognition that the propositional attitudes have features that set them off in kind from the types of entity that exist in other sciences. For example, they are intentional or representational, and attributing them to individuals seems to depend on factors beyond the boundary of those individuals, as the TWIN EARTH arguments suggest. These arguments and others point to a prima facie conflict between folk psychology and INDIVIDUALISM (or internalism) in psychology (see Wilson 1995). The apparent conflict between folk psychology and individualism has provided one of the motivations for developing accounts of NARROW CONTENT, content that depends solely on an individual’s intrinsic, physical properties. (The dependence here has usually been understood in terms of the technical notion of SUPERVENIENCE; see Horgan 1993.)

There is a spin on this general motivation for eliminative materialism that appeals more directly to the issue of how the mind is structured. The claim here is that whether folk psychology is defensible will turn in large part on how compatible its ontology (its list of what we find in a folk psychological mind) is with the developing ontology of the cognitive sciences.
With respect to classical cognitive science, with its endorsement of both the representational and computational theories of mind, folk psychology is on relatively solid ground here. It posits representational states, such as belief and desire, and it is relatively easy to see how the causal relations between such states could be modeled computationally. But connectionist models of the mind, with what representation there is lying in patterns of activity rather than in explicit representations like propositions, seem to leave less room in the structure of the mind for folk psychology.

Finally, the issue of the place of the first-person perspective arises with respect to folk psychology when we ask how people deploy folk psychology. That is, what sort of psychological machinery do we folk employ in engaging in folk psychological explanation? This issue has been the topic of the SIMULATION VS. THEORY-THEORY debate, with proponents of the simulation view holding, roughly, a “first-person first” account of how folk psychology works, and theory-theory proponents viewing folk psychology as essentially a third-person predictive and explanatory tool. Two recent volumes by Davies and Stone (1995a, 1995b) have added to the literature on this debate, which has developmental and moral aspects, including implications for MORAL PSYCHOLOGY.

See also ELIMINATIVE MATERIALISM; EXTENSIONALITY, THESIS OF; FOLK PSYCHOLOGY; INDIVIDUALISM; INTENTIONAL STANCE; MORAL PSYCHOLOGY; NARROW CONTENT; SIMULATION VS. THEORY-THEORY; SUPERVENIENCE; TWIN EARTH

7 Exploring Mental Content

Although BRENTANO’s claim that INTENTIONALITY is the “mark of the mental” is problematic and has few adherents today, intentionality has been one of the flagship topics in philosophical discussion of the mental, and so at least a sort of mark of that discussion. Just what the puzzle about intentionality is and what one might say about it are topics I want to explore in more detail here.
To say that something is intentional is just to say that it is about something, or that it refers to something. In this sense, statements of fact are paradigmatically intentional, since they are about how things are in the world. Similarly, a highway sign with a picture of a gas pump on it is intentional because it conveys the information that there is a gas station ahead at an exit: it is, in some sense, about that state of affairs.

The beginning of chapter 4 of Jerry Fodor’s Psychosemantics provides one lively expression of the problem with intentionality:

I suppose that sooner or later the physicists will complete the catalogue they’ve been compiling of the ultimate and irreducible properties of things. When they do, the likes of spin, charm, and charge will perhaps appear upon their list. But aboutness surely won’t; intentionality simply doesn’t go that deep. It’s hard to see, in face of this consideration, how one can be a Realist about intentionality without also being, to some extent or other, a Reductionist. If the semantic and the intentional are real properties of things, it must be in virtue of their identity with (or maybe of their supervenience on?) properties that are themselves neither intentional nor semantic. If aboutness is real, it must be really something else. (p. 97, emphases in original)

Although there is much that one could take issue with in this passage, my reason for introducing it here is not to critique it but to try to capture some of the worries about intentionality that bubble up from it. The most general of these concerns the basis of intentionality in the natural order: given that only special parts of the world (like our minds) have intentional properties, what is it about those things that gives them (and not other things) intentionality?
Since not only mental phenomena are intentional (for example, spoken and written natural language and systems of signs and codes are as well), one might think that a natural way to approach this question would be as follows. Consider all of the various sorts of “merely material” things that at least seem to have intentional properties. Then proceed to articulate why each of them is intentional, either taking the high road of specifying something like the “essence of intentionality” (something that all and only things with intentional properties have) or taking the low road of doing so for each phenomenon, allowing these accounts to vary across disparate intentional phenomena.

Very few philosophers have explored the problem of intentionality in this way. I think this is chiefly because they do not view all things with intentional properties as having been created equally. A common assumption is that even if lots of the nonmental world is intentional, its intentionality is derived, in some sense, from the intentionality of the mental. So, to take a classic example, the sentences we utter and write are intentional all right (they are about things). But their intentionality derives from that of the corresponding thoughts that are their causal antecedents. To take another often-touted example, computers often produce intentional output (even photocopiers can do this), but whatever intentionality lies in such output is not inherent to the machines that produce it but is derivative, ultimately, from the mental states of those who design, program, and use them and their products.

Thus, there has been a focus on mental states as a sort of paradigm of intentional state, and a subsequent narrowing of the sorts of intentional phenomena discussed. Two points are perhaps worth making briefly in this regard.
First, the assumption that not all things with intentional properties are created equally is typically shared even by those who have not focused almost exclusively on mental states as paradigms of intentional states, but on languages and other public and conventional forms of representation (e.g., Horst 1996). It is just that their paradigm is different.

Second, even when mental states have been taken as a paradigm here, those interested in developing a “psychosemantics” (an account of the basis for the semantics of psychological states) have often turned to decidedly nonmental systems of representation in order to theorize about the intentionality of the mental. This focus on what we might think of as proto-intentionality has been prominent within both Fred Dretske’s (1981) informational semantics and the biosemantic approach pioneered by Ruth Millikan (1984, 1993).

The idea common to such views is to get clear about the grounds of simple forms of intentionality before scaling up to the case of the intentionality of human minds, an instance of a research strategy that has driven work in the cognitive sciences from early work in artificial intelligence on KNOWLEDGE REPRESENTATION and cognitive modeling through to contemporary work in COMPUTATIONAL NEUROSCIENCE. Exploring simplified or more basic intentional systems in the hope of gaining some insight into the more full-blown case of the intentionality of human minds runs the risk, of course, of focusing on cases that leave out precisely that which is crucial to full-blown intentionality. Some (for example, Searle 1992) would claim that consciousness and phenomenology are such features.

As I hinted at in my discussion of the mind in cognitive science in section 5, construed one way the puzzle about the grounds of intentionality has a general answer in the hypothesis of computationalism.
But there is a deeper problem about the grounds of intentionality concerning just how at least some mental stuff could be about other stuff in the world, and computationalism is of little help here. Computationalism does not even pretend to answer the question of what it is about specific mental states (say, my belief that trees often have leaves) that gives them the content that they have (for example, that makes them about trees). Even if we were complicated Turing machines, what would it be about my Turing machine table that implies that I have the belief that trees often have leaves? Talking about the correspondence between the semantic and syntactic properties that symbol structures in computational systems have, and of how the former are “inherited” from the latter is well and good. But it leaves open the “just how” question, and so fails to address what I am here calling the deeper problem about the grounds of intentionality. This problem is explored in the article on MENTAL REPRESENTATION, and particular proposals for a psychosemantics can be found in those on INFORMATIONAL SEMANTICS and FUNCTIONAL ROLE SEMANTICS.

It would be remiss in exploring mental content to fail to mention that much thought about intentionality has been propelled by work in the philosophy of language: on INDEXICALS AND DEMONSTRATIVES, on theories of REFERENCE and the propositional attitudes, and on the idea of RADICAL INTERPRETATION. Here I will restrict myself to some brief comments on theories of reference, which have occupied center stage in the philosophy of language for much of the last thirty years.

One of the central goals of theories of reference has been to explain in virtue of what parts of sentences of natural languages refer to the things they refer to. What makes the name “Miranda” refer to my daughter? In virtue of what does the plural noun “dogs” refer to dogs?
Such questions have a striking similarity to my above expression of the central puzzle concerning intentionality. In fact, the application of causal theories of reference (Putnam 1975, Kripke 1980) developed principally for natural languages has played a central role in disputes in the philosophy of mind that concern intentionality, including those over individualism, narrow content, and the role of Twin Earth arguments in thinking about intentionality. In particular, applying them not to the meaning of natural language terms but to the content of thought is one way to reach the conclusion that mental content does not supervene on an individual's physical properties, that is, that mental content is not individualistic.

GOTTLOB FREGE is a classic source for contrasting descriptivist theories of reference, according to which natural language reference is, in some sense, mediated by a speaker’s descriptions of the object or property to which she refers. Moreover, Frege’s notion of sense and the distinction between SENSE AND REFERENCE are often invoked in support of the claim that there is much to MEANING—linguistic or mental—that goes beyond the merely referential. Frege is also one of the founders of modern logic, and it is to the role of logic in the cognitive sciences that I now turn.

See also BRENTANO, FRANZ; COMPUTATIONAL NEUROSCIENCE; FREGE, GOTTLOB; FUNCTIONAL ROLE SEMANTICS; INDEXICALS AND DEMONSTRATIVES; INFORMATIONAL SEMANTICS; INTENTIONALITY; KNOWLEDGE REPRESENTATION; MEANING; MENTAL REPRESENTATION; RADICAL INTERPRETATION; REFERENCE, THEORIES OF; SENSE AND REFERENCE

8 Logic and the Sciences of the Mind

Although INDUCTION, like deduction, involves drawing inferences on the basis of one or more premises, it is deductive inference that has been the focus in LOGIC, what is often simply referred to as “formal logic” in departments of philosophy and linguistics.
The idea that it is possible to abstract away from deductive arguments given in natural language that differ in the content of their premises and conclusions goes back at least to Aristotle in the fourth century B.C. Hence the term “Aristotelian syllogisms” to refer to a range of argument forms containing premises and conclusions that begin with the words “every” or “all,” “some,” and “no.” This abstraction makes it possible to talk about argument forms that are valid and invalid, and allows one to describe two arguments as being of the same logical form. To take a simple example, we know that any argument of the form:

All A are B.
No B are C.
No A are C.

is formally valid, where the emphasis here serves to highlight reference to the preservation of truth from premises to conclusion, that is, the validity, solely in virtue of the forms of the individual sentences, together with the form their arrangement constitutes. Whatever plural noun phrases we substitute for “A,” “B,” and “C,” the resulting natural language argument will be valid: if the two premises are true, the conclusion must also be true. The same general point applies to arguments that are formally invalid, which makes it possible to talk about formal fallacies, that is, inferences that are invalid because of the forms they instantiate.

Given the age of the general idea of LOGICAL FORM, what is perhaps surprising is that it is only in the late nineteenth century that the notion was developed so as to apply to a wide range of natural language constructions through the development of the propositional and predicate logics. And it is only in the late twentieth century that the notion of logical form comes to be appropriated within linguistics in the study of SYNTAX. I focus here on the developments in logic.
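The claim that the syllogistic form above is valid whatever is substituted for “A,” “B,” and “C” can be checked mechanically. The following Python sketch is an illustration only, not part of the original text. It assumes the modern set-theoretic reading of the categorical sentences, on which “All A are B” counts as true when nothing is A (the traditional Aristotelian reading, with existential import, differs). The function names `all_are`, `no_are`, and `find_countermodel` are hypothetical. Because each sentence only constrains which combinations of membership in A, B, and C are instantiated, it is enough to try every set of such combinations, 256 in all.

```python
from itertools import combinations, product

# Each "type" records whether an individual is in A, in B, in C.
TYPES = list(product([False, True], repeat=3))
A, B, C = 0, 1, 2

def all_are(x, y):
    """'All X are Y': every instantiated type that is X is also Y."""
    return lambda model: all(t[y] for t in model if t[x])

def no_are(x, y):
    """'No X are Y': no instantiated type is both X and Y."""
    return lambda model: not any(t[x] and t[y] for t in model)

def find_countermodel(premises, conclusion):
    """Return a set of instantiated types that makes every premise true
    and the conclusion false, or None if no such set exists."""
    for size in range(len(TYPES) + 1):
        for model in combinations(TYPES, size):
            if all(p(model) for p in premises) and not conclusion(model):
                return model
    return None

# The form from the text: All A are B. No B are C. So, no A are C.
valid_form = find_countermodel([all_are(A, B), no_are(B, C)], no_are(A, C))

# A formal fallacy for contrast: All A are B. All C are B. So, all A are C.
fallacy = find_countermodel([all_are(A, B), all_are(C, B)], all_are(A, C))

print("countermodel to the valid form:", valid_form)
print("countermodel to the fallacy:", fallacy)
```

The search finds no countermodel for the first form and does find one for the second, which is what it means for the first to be valid and the second invalid in virtue of form alone.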
Central to propositional logic (sometimes called “sentential logic”) is the idea of a propositional or sentential operator, a symbol that acts as a function on propositions or sentences. The paradigmatic propositional operators are symbols for negation (“~”), conjunction (“&”), disjunction (“v”), and conditional (“→”). And with the development of formal languages containing these symbols comes an ability to represent a richer range of formally valid arguments, such as that manifest in the following thought:

If Sally invites Tom, then either he will say “no,” or cancel his game with Bill. But there’s no way he’d turn Sally down. So I guess if she invites him, he’ll cancel with Bill.

In predicate or quantificational logic, we are able to represent not simply the relations between propositions, as we can in propositional logic, but also the structure within propositions themselves through the introduction of QUANTIFIERS and the terms and predicates that they bind. One of the historically more important applications of predicate logic has been its widespread use in linguistics, philosophical logic, and the philosophy of language to formally represent increasingly larger parts of natural languages, including not just simple subjects and predicates, but adverbial constructions, tense, indexicals, and attributive adjectives (for example, see Sainsbury 1991).

These fundamental developments in logical theory have had perhaps the most widespread and pervasive effect on the foundations of the cognitive sciences of any contributions from philosophy or mathematics.
They also form the basis for much contemporary work across the cognitive sciences: in linguistic semantics (e.g., through MODAL LOGIC, in the use of POSSIBLE WORLDS SEMANTICS to model fragments of natural language, and in work on BINDING); in metalogic (e.g., on FORMAL SYSTEMS and results such as the CHURCH-TURING THESIS and GÖDEL’S THEOREMS); and in artificial intelligence (e.g., on LOGICAL REASONING SYSTEMS, TEMPORAL REASONING, and METAREASONING).

Despite their technical payoff, the relevance of these developments in logical theory for thinking more directly about DEDUCTIVE REASONING in human beings is, ironically, less clear. Psychological work on human reasoning, including that on JUDGMENT HEURISTICS, CAUSAL REASONING, and MENTAL MODELS, points to ways in which human reasoning may be governed by structures very different from those developed in formal logic, though this remains an area of continuing debate and discussion.

See also BINDING THEORY; CAUSAL REASONING; CHURCH-TURING THESIS; DEDUCTIVE REASONING; FORMAL SYSTEMS, PROPERTIES OF; GÖDEL’S THEOREMS; INDUCTION; JUDGMENT HEURISTICS; LOGIC; LOGICAL FORM IN LINGUISTICS; LOGICAL FORM, ORIGINS OF; LOGICAL REASONING SYSTEMS; MENTAL MODELS; METAREASONING; MODAL LOGIC; POSSIBLE WORLDS SEMANTICS; QUANTIFIERS; SYNTAX; TEMPORAL REASONING

9 Two Ways to Get Biological

By the late nineteenth century, both evolutionary theory and the physiological study of mental capacities were firmly entrenched. Despite this, these two paths to a biological view of cognition have only recently been re-explored in sufficient depth to warrant the claim that contemporary cognitive science incorporates a truly biological perspective on the mind. The neurobiological path, laid down by the tradition of physiological psychology that developed from the mid-nineteenth century, is certainly the better traveled of the two.
The recent widening of this path by those dissatisfied with the distinctly nonbiological approaches adopted within traditional artificial intelligence has, as we saw in our discussion of computationalism, raised new questions about COMPUTATION AND THE BRAIN, the traditional computational theory of the mind, and the rules and representations approach to understanding the mind. The evolutionary path, by contrast, has been taken only occasionally and half-heartedly over the last 140 years. I want to concentrate not only on why but on the ways in which evolutionary theory is relevant to contemporary interdisciplinary work on the mind.

The theory of EVOLUTION makes a claim about the patterns that we find in the biological world—they are patterns of descent—and a claim about the predominant cause of those patterns—they are caused by the mechanism of natural selection. None of the recent debates concerning evolutionary theory—from challenges to the focus on ADAPTATION AND ADAPTATIONISM in Gould and Lewontin (1979) to more recent work on SELF-ORGANIZING SYSTEMS and ARTIFICIAL LIFE—challenges the substantial core of the theory of evolution (cf. Kauffman 1993, 1995; Depew and Weber 1995). The vast majority of those working in the cognitive sciences accept the theory of evolution and so think that a large number of traits that organisms possess are adaptations to evolutionary forces, such as natural selection. Yet until the last ten years, the scattered pleas to apply evolutionary theory to the mind (such as those of Ghiselin 1969 and Richards 1987) have come largely from those outside of the psychological and behavioral sciences.
Within the last ten years, however, a distinctive EVOLUTIONARY PSYCHOLOGY has developed as a research program, beginning in Leda Cosmides’s (1989) work on human reasoning and the Wason selection task, and represented in the collection of papers The Adapted Mind (Barkow, Cosmides, and Tooby 1992) and, more recently and at a more popular level, by Steven Pinker’s How the Mind Works (1997). Evolutionary psychologists view the mind as a set of “Darwinian algorithms” designed by natural selection to solve adaptive problems faced by our hunter-gatherer ancestors. The claim is that this basic Darwinian insight can and should guide research into the cognitive architecture of the mind, since the task is one of discovering and understanding the design of the human mind, in all its complexity.

Yet there has been more than an inertial resistance to viewing evolution as central to the scientific study of human cognition. One reason is that evolutionary theory in general is seen as answering different questions than those at the core of the cognitive sciences. In terms of the well-known distinction between proximal and ultimate causes, appeals to evolutionary theory primarily allow one to specify the latter, and cognitive scientists are chiefly interested in the former: they are interested in the how rather than the why of the mind. Or to put it more precisely, central to cognitive science is an understanding of the mechanisms that govern cognition, not the various histories—evolutionary or not—that produced these mechanisms. This general perception of the concerns of evolutionary theory and the contrasting conception of cognitive science have both been challenged by evolutionary psychologists. The same general challenges have been issued by those who think that the relations between ETHICS AND EVOLUTION and those between cognition and CULTURAL EVOLUTION have not received their due in contemporary cognitive science.
Yet despite the skepticism about this direct application of evolutionary theory to human cognition, its implicit application is inherent in the traditional interest in the minds of other animals, from Aplysia to (nonhuman) apes. ANIMAL NAVIGATION, PRIMATE LANGUAGE, and CONDITIONING AND THE BRAIN, while certainly topics of interest in their own right, gain some added value from what their investigation can tell us about human minds and brains. This presupposes something like the following: that there are natural kinds in psychology that transcend species boundaries, such that there is a general way of exploring how a cognitive capacity is structured, independent of the particular species of organism in which it is instantiated (cf. functionalism). Largely on the basis of research with nonhuman animals, we know enough now to say, with a high degree of certainty, things like this: that the CEREBELLUM is the central brain structure involved in MOTOR LEARNING, and that the LIMBIC SYSTEM plays the same role with respect to at least some EMOTIONS.

This is by way of returning to (and concluding with) the neuroscientific path to biologizing the mind, and the three classic philosophical issues about the mind with which we began. As I hope this introduction has suggested, despite the distinctively philosophical edge to all three issues—the mental-physical relation, the structure of the mind, and the first-person perspective—discussion of each of them is elucidated and enriched by the interdisciplinary perspectives provided by empirical work in the cognitive sciences. It is not only a priori arguments but complexities revealed by empirical work (e.g., on the neurobiology of consciousness, or ATTENTION and animal and human brains) that show the paucity of the traditional philosophical “isms” (dualism, behaviorism, type-type physicalism) with respect to the mental-physical relation.
It is not simply general, philosophical arguments against nativism or against empiricism about the structure of the mind that reveal limitations to the global versions of these views, but ongoing work on MODULARITY AND LANGUAGE, on cognitive architecture, and on the innateness of language. And thought about introspection and self-knowledge, to take two topics that arise when one reflects on the first-person perspective on the mind, is both enriched by and contributes to empirical work on BLINDSIGHT, the theory of mind, and METAREPRESENTATION. With some luck, philosophers increasingly sensitive to empirical data about the mind will have paved a two-way street that encourages psychologists, linguists, neuroscientists, computer scientists, social scientists and evolutionary theorists to venture more frequently and more surely into philosophy.

See also ADAPTATION AND ADAPTATIONISM; ANIMAL NAVIGATION; ARTIFICIAL LIFE; ATTENTION IN THE ANIMAL BRAIN; ATTENTION IN THE HUMAN BRAIN; BLINDSIGHT; CEREBELLUM; COMPUTATION AND THE BRAIN; CONDITIONING AND THE BRAIN; CULTURAL EVOLUTION; EMOTIONS; ETHICS AND EVOLUTION; EVOLUTION; EVOLUTIONARY PSYCHOLOGY; LIMBIC SYSTEM; METAREPRESENTATION; MODULARITY AND LANGUAGE; MOTOR LEARNING; PRIMATE LANGUAGE; SELF-ORGANIZING SYSTEMS

Acknowledgments

I would like to thank Kay Bock, Bill Brewer, Alvin Goldman, John Heil, Greg Murphy, Stewart Saunders, Larry Shapiro, Sydney Shoemaker, Tim van Gelder, and Steve Wagner, as well as the PNP Group at Washington University, St. Louis, for taking time out to provide some feedback on earlier versions of this introduction. I guess the remaining idiosyncrasies and mistakes are mine.

References

Armstrong, D. M. (1968). A Materialist Theory of the Mind. London: Routledge and Kegan Paul.
Ballard, D. (1997). An Introduction to Natural Computation. Cambridge, MA: MIT Press.
Ballard, D., M. Hayhoe, P. Pook, and R. Rao. (1997). Deictic codes for the embodiment of cognition.
Behavioral and Brain Sciences 20: 723–767.
Barkow, J. H., L. Cosmides, and J. Tooby, Eds. (1992). The Adapted Mind. New York: Oxford University Press.
Boden, M., Ed. (1990). The Philosophy of Artificial Intelligence. Oxford: Oxford University Press.
Carnap, R. (1928). The Logical Construction of the World. Translated by R. George (1967). Berkeley: University of California Press.
Chalmers, D. (1996). The Conscious Mind: In Search of a Fundamental Theory. New York: Oxford University Press.
Chomsky, N. (1959). Review of B. F. Skinner's Verbal Behavior. Language 35: 26–58.
Churchland, P. M. (1979). Scientific Realism and the Plasticity of Mind. New York: Cambridge University Press.
Churchland, P. M., and P. S. Churchland. (1981). Functionalism, qualia, and intentionality. Philosophical Topics 12: 121–145.
Clark, A. (1997). Being There: Putting Brain, Body, and World Together Again. Cambridge, MA: MIT Press.
Clark, A. (1998). Twisted tales: Causal complexity and cognitive scientific explanation. Minds and Machines 8: 79–99.
Cosmides, L. (1989). The logic of social exchange: Has natural selection shaped how humans reason? Studies with the Wason Selection Task. Cognition 31: 187–276.
Cottrell, G., and J. Metcalfe. (1991). EMPATH: Face, Emotion, and Gender Recognition Using Holons. In R. Lippman, J. Moody, and D. Touretzky, Eds., Advances in Neural Information Processing Systems, vol. 3. San Mateo, CA: Morgan Kaufmann.
Davies, M., and T. Stone, Eds. (1995a). Folk Psychology: The Theory of Mind Debate. Oxford: Blackwell.
Davies, M., and T. Stone, Eds. (1995b). Mental Simulation: Evaluations and Applications. Oxford: Blackwell.
Dennett, D. C. (1981). Three kinds of intentional psychology. Reprinted in his 1987.
Dennett, D. C. (1987). The Intentional Stance. Cambridge, MA: MIT Press.
Depew, D., and B. Weber. (1995). Darwinism Evolving: Systems Dynamics and the Genealogy of Natural Selection. Cambridge, MA: MIT Press.
Dretske, F. (1981).
Knowledge and the Flow of Information. Cambridge, MA: MIT Press.
Elman, J., E. Bates, M. Johnson, A. Karmiloff-Smith, D. Parisi, and K. Plunkett, Eds. (1996). Rethinking Innateness. Cambridge, MA: MIT Press.
Fodor, J. A. (1975). The Language of Thought. Cambridge, MA: Harvard University Press.
Fodor, J. A. (1981). Representations: Philosophical Essays on the Foundations of Cognitive Science. Sussex: Harvester Press.
Fodor, J. A. (1987). Psychosemantics: The Problem of Meaning in the Philosophy of Mind. Cambridge, MA: MIT Press.
Fodor, J. A. (1994). The Elm and the Expert. Cambridge, MA: MIT Press.
Geach, P. (1957). Mental Acts. London: Routledge and Kegan Paul.
Ghiselin, M. (1969). The Triumph of the Darwinian Method. Berkeley: University of California Press.
Giunti, M. (1997). Computation, Dynamics, and Cognition. New York: Oxford University Press.
Goldman, A. (1997). Science, Publicity, and Consciousness. Philosophy of Science 64: 525–545.
Gould, S. J., and R. C. Lewontin. (1979). The spandrels of San Marco and the panglossian paradigm: A critique of the adaptationist programme. Reprinted in E. Sober, Ed., Conceptual Issues in Evolutionary Biology, 2nd ed. (1993). Cambridge, MA: MIT Press.
Grene, M. (1995). A Philosophical Testament. Chicago: Open Court.
Griffiths, P. E. (1997). What Emotions Really Are. Chicago: University of Chicago Press.
Haugeland, J. (1991). Representational genera. In W. Ramsey and S. Stich, Eds., Philosophy and Connectionist Theory. Hillsdale, NJ: Erlbaum.
Haugeland, J., Ed. (1997). Mind Design 2: Philosophy, Psychology, and Artificial Intelligence. Cambridge, MA: MIT Press.
Heil, J., and A. Mele, Eds. (1993). Mental Causation. Oxford: Clarendon Press.
Horgan, T. (1993). From supervenience to superdupervenience: Meeting the demands of a material world. Mind 102: 555–586.
Horgan, T., and J. Tienson. (1996). Connectionism and the Philosophy of Psychology. Cambridge, MA: MIT Press.
Horst, S. (1996).
Symbols, Computation, and Intentionality. Berkeley: University of California Press.
James, W. (1890). The Principles of Psychology. 2 vol. Dover reprint (1950). New York: Dover.
Kauffman, S. (1993). The Origins of Order. New York: Oxford University Press.
Kauffman, S. (1995). At Home in the Universe. New York: Oxford University Press.
Kim, J. (1993). Supervenience and Mind. New York: Cambridge University Press.
Kosslyn, S. (1980). Image and Mind. Cambridge, MA: Harvard University Press.
Kosslyn, S. (1994). Image and Brain. Cambridge, MA: MIT Press.
Kripke, S. (1980). Naming and Necessity. Cambridge, MA: Harvard University Press.
Levine, J. (1983). Materialism and qualia: The explanatory gap. Pacific Philosophical Quarterly 64: 354–361.
Lewis, D. K. (1966). An argument for the identity theory. Journal of Philosophy 63: 17–25.
Malcolm, N. (1959). Dreaming. London: Routledge and Kegan Paul.
Malcolm, N. (1971). Problems of Mind: Descartes to Wittgenstein. New York: Harper and Row.
Millikan, R. G. (1984). Language, Thought, and Other Biological Categories. Cambridge, MA: MIT Press.
Millikan, R. G. (1993). White Queen Psychology and Other Essays for Alice. Cambridge, MA: MIT Press.
Pinker, S. (1997). How the Mind Works. New York: Norton.
Port, R., and T. van Gelder, Eds. (1995). Mind as Motion: Explorations in the Dynamics of Cognition. Cambridge, MA: MIT Press.
Putnam, H. (1975). The meaning of “meaning.” Reprinted in Mind, Language, and Reality: Collected Papers, vol. 2. Cambridge: Cambridge University Press.
Pylyshyn, Z. (1984). Computation and Cognition. Cambridge, MA: MIT Press.
Reichenbach, H. (1938). Experience and Prediction. Chicago: University of Chicago Press.
Richards, R. (1987). Darwin and the Emergence of Evolutionary Theories of Mind and Behavior. Chicago: University of Chicago Press.
Rumelhart, D., and J. McClelland. (1986). On Learning the Past Tenses of English Verbs. In J. McClelland, D.
Rumelhart, and the PDP Research Group, Eds., Parallel Distributed Processing, vol. 2. Cambridge, MA: MIT Press.
Ryle, G. (1949). The Concept of Mind. New York: Penguin.
Sainsbury, M. (1991). Logical Forms. New York: Blackwell.
Searle, J. (1992). The Rediscovery of the Mind. Cambridge, MA: MIT Press.
Seidenberg, M. S., and J. L. McClelland. (1989). A distributed, developmental model of visual word recognition and naming. Psychological Review 96: 523–568.
Sellars, W. (1956). Empiricism and the philosophy of mind. In H. Feigl and M. Scriven, Eds., Minnesota Studies in the Philosophy of Science, vol. 1. Minneapolis: University of Minnesota Press.
Skinner, B. F. (1957). Verbal Behavior. New York: Appleton-Century-Crofts.
Smolensky, P. (1994). Computational models of mind. In S. Guttenplan, Ed., A Companion to the Philosophy of Mind. Cambridge, MA: Blackwell.
Spelke, E. (1990). Principles of object perception. Cognitive Science 14: 29–56.
Stabler, E. (1983). How are grammars represented? Behavioral and Brain Sciences 6: 391–420.
Thelen, E. (1995). Time-scale dynamics and the development of an embodied cognition. In R. Port and T. van Gelder, Eds., Mind as Motion: Explorations in the Dynamics of Cognition. Cambridge, MA: MIT Press.
van Fraassen, B. (1989). Laws and Symmetry. New York: Oxford University Press.
van Gelder, T. J. (1998). The dynamical hypothesis in cognitive science. Behavioral and Brain Sciences 21: 1–14.
van Gelder, T., and R. Port. (1995). It's about time: An overview of the dynamical approach to cognition. In R. Port and T. van Gelder, Eds., Mind as Motion: Explorations in the Dynamics of Cognition. Cambridge, MA: MIT Press.
Watson, J. B. (1913). Psychology as the behaviorist views it. Psychological Review 20: 158–177.
Wilson, R. A. (1995). Cartesian Psychology and Physical Minds: Individualism and the Sciences of the Mind. New York: Cambridge University Press.
Wilson, R. A., Ed. (1999). Species: New Interdisciplinary Essays.
Cambridge, MA: MIT Press.
Wundt, W. (1900–1909). Völkerpsychologie. Leipzig: W. Engelmann.
Yablo, S. (1992). Mental causation. Philosophical Review 101: 245–280.

Psychology

Keith J. Holyoak

Psychology is the science that investigates the representation and processing of information by complex organisms. Many animal species are capable of taking in information about their environment, forming internal representations of it, and manipulating these representations to select and execute actions. In addition, many animals are able to adapt to their environments by means of learning that can take place within the lifespan of an individual organism. Intelligent information processing implies the ability to acquire and process information about the environment in order to select actions that are likely to achieve the fundamental goals of survival and propagation. Animals have evolved a system of capabilities that collectively provide them with the ability to process information. They have sensory systems such as TASTE and HAPTIC PERCEPTION (touch), which provide information about the immediate environment with which the individual is in direct contact; proprioception, which provides information about an animal's own bodily states; and SMELL, AUDITION, and VISION, which provide information about more distant aspects of the environment. Animals are capable of directed, self-generated motion, including EYE MOVEMENTS and other motoric behaviors such as MANIPULATION AND GRASPING, which radically increase their ability to pick up sensory information and also to act upon their environments.

The central focus of psychology concerns the information processing that intervenes between sensory inputs and motoric outputs. The most complex forms of intelligence, observed in birds and mammals, and particularly primates (especially great apes and humans) require theories that deal with the machinery of thought and inner experience.
These animals have minds and EMOTIONS; their sensory inputs are interpreted to create perceptions of the external world, guided in part by selective ATTENTION; some of the products of perception are stored in MEMORY, and may in turn influence subsequent perception. Intellectually sophisticated animals perform DECISION MAKING and PROBLEM SOLVING, and in the case of humans engage in LANGUAGE AND COMMUNICATION. Experience coupled with innate constraints results in a process of COGNITIVE DEVELOPMENT as the infant becomes an adult, and also leads to LEARNING over the lifespan, so that the individual is able to adapt to its environment within a vastly shorter time scale than that required for evolutionary change. Humans are capable of the most complex and most domain-general forms of information processing of all species; for this reason (and because those who study psychology are humans), most of psychology aims directly or indirectly to understand the nature of human information processing and INTELLIGENCE. The most general characteristics of the human system for information processing are described as the COGNITIVE ARCHITECTURE.

See also ATTENTION; AUDITION; COGNITIVE ARCHITECTURE; COGNITIVE DEVELOPMENT; DECISION MAKING; EMOTIONS; EYE MOVEMENTS AND VISUAL ATTENTION; HAPTIC PERCEPTION; INTELLIGENCE; LANGUAGE AND COMMUNICATION; LEARNING; MANIPULATION AND GRASPING; MEMORY; PROBLEM SOLVING; SMELL; TASTE; VISION

1 The Place of Psychology within Cognitive Science

As the science of the representation and processing of information by organisms, psychology (particularly cognitive psychology) forms part of the core of cognitive science. Cognitive science research conducted in other disciplines generally has actual or potential implications for psychology. Not all research on intelligent information processing is relevant to psychology.
Some work in artificial intelligence, for example, is based on representations and algorithms with no apparent connection to biological intelligence. Even though such work may be highly successful at achieving high levels of competence on cognitive tasks, it does not fall within the scope of cognitive science. For example, the Deep Blue II program that defeated the human CHESS champion Garry Kasparov is an example of an outstanding artificial-intelligence program that has little or no apparent psychological relevance, and hence would not be considered to be part of cognitive science. In contrast, work on adaptive PRODUCTION SYSTEMS and NEURAL NETWORKS, much of which is conducted by computer scientists, often has implications for psychology. Similarly, a great deal of work in such allied disciplines as neuroscience, linguistics, anthropology, and philosophy has psychological implications. At the same time, work in psychology often has important implications for research in other disciplines. For example, research in PSYCHOLINGUISTICS has influenced developments in linguistics, and research in PSYCHOPHYSICS has guided neurophysiological research on the substrates of sensation and perception.

In terms of MARR’s tripartite division of levels of analysis (computational theory, representation and algorithm, and hardware implementation), work in psychology tends to concentrate on the middle level, emphasizing how information is represented and processed by humans and other animals. Although there are many important exceptions, psychologists generally aim to develop process models that specify more than the input-output functions that govern cognition (for example, also specifying timing relations among intervening mental processes), while abstracting away from the detailed neural underpinnings of behavior.
Nonetheless, most psychologists do not insist in any strict sense on the AUTONOMY OF PSYCHOLOGY, but rather focus on important interconnections with allied disciplines that comprise cognitive science. Contemporary psychology at the information-processing level is influenced by research in neuroscience that investigates the neural basis for cognition and emotion, by work on representations and algorithms in the fields of artificial intelligence and neural networks, and by work in social sciences such as anthropology that places the psychology of individuals within its cultural context. Research on the psychology of language (e.g., COMPUTATIONAL PSYCHOLINGUISTICS and LANGUAGE AND THOUGHT) is influenced by the formal analyses of language developed in linguistics. Many areas of psychology make close contact with classical issues in philosophy, especially in EPISTEMOLOGY (e.g., CAUSAL REASONING; INDUCTION; CONCEPTS).

The field of psychology has several major subdivisions, which have varying degrees of connection to cognitive science. Cognitive psychology deals directly with the representation and processing of information, with greatest emphasis on cognition in adult humans; the majority of the psychology entries that appear in this volume reflect work in this area. Developmental psychology deals with the changes in cognitive, social, and emotional functioning that occur over the lifespan of humans and other animals (see in particular COGNITIVE DEVELOPMENT, PERCEPTUAL DEVELOPMENT, and INFANT COGNITION). Social psychology investigates the cognitive and emotional factors involved in interactions between people, especially in small groups. One subarea of social psychology, SOCIAL COGNITION, is directly concerned with the manner in which people understand the minds, emotions, and behavior of themselves and others (see also THEORY OF MIND; INTERSUBJECTIVITY).
Personality psychology deals primarily with motivational and emotional aspects of human experience (see FREUD for discussion of the ideas of the famous progenitor of this area of psychology), and clinical psychology deals with applied issues related to mental health. COMPARATIVE PSYCHOLOGY investigates the commonalities and differences in cognition and behavior between different animal species (see PRIMATE COGNITION; ANIMAL NAVIGATION; CONDITIONING; and MOTIVATION), and behavioral neuroscience provides the interface between research on molar cognition and behavior and their underlying neural substrate.

See also ANIMAL NAVIGATION; ANIMAL NAVIGATION, NEURAL NETWORKS; AUTONOMY OF PSYCHOLOGY; CAUSAL REASONING; CHESS, PSYCHOLOGY OF; COGNITIVE DEVELOPMENT; COMPARATIVE PSYCHOLOGY; COMPUTATIONAL PSYCHOLINGUISTICS; CONCEPTS; CONDITIONING; EPISTEMOLOGY AND COGNITION; INDUCTION; INFANT COGNITION; INTERSUBJECTIVITY; LANGUAGE AND THOUGHT; MARR, DAVID; MOTIVATION; NEURAL NETWORKS; PERCEPTUAL DEVELOPMENT; PRIMATE COGNITION; PRODUCTION SYSTEMS; PSYCHOLINGUISTICS; PSYCHOPHYSICS; SOCIAL COGNITION; SOCIAL COGNITION IN ANIMALS; THEORY OF MIND

2 Capsule History of Psychology

Until the middle of the nineteenth century the nature of the mind was solely the concern of philosophers. Indeed, there are a number of reasons why some have argued that the scientific investigation of the mind may prove to be an impossible undertaking. One objection is that thoughts cannot be measured; and without measurement, science cannot even begin. A second objection is to question how humans could objectively study their own thought processes, given the fact that science itself depends on human thinking. A final objection is that our mental life is incredibly complex and bound up with the further complexities of human social interactions; perhaps cognition is simply too complex to permit successful scientific investigation.
Despite these reasons for skepticism, scientific psychology emerged as a discipline separate from philosophy in the second half of the nineteenth century. A science depends on systematic empirical methods for collecting observations and on theories that interpret these observations. Beginning around 1850, a number of individuals, often trained in philosophy, physics, physiology, or neurology, began to provide these crucial elements.

The anatomist Ernst Heinrich Weber and the physicist and philosopher Gustav Fechner measured the relations between objective changes in physical stimuli, such as brightness or weight, and subjective changes in the internal sensations the stimuli generate. The crucial finding of Weber and Fechner was that subjective differences were not simply equivalent to objective differences. Rather, it turned out that for many dimensions, the magnitude of change required to make a subjective difference (“just noticeable difference,” or “jnd”) increased in proportion to overall intensity, so that perceived magnitude often follows an approximately logarithmic function of physical intensity, a relation known as the Weber-Fechner Law. Weber and Fechner’s contribution to cognitive psychology was much more general than identifying the law that links their names. They convincingly demonstrated that, contrary to the claim that thought is inherently impossible to measure, it is in fact possible to measure mental concepts, such as the degree of sensation produced by a stimulus. Fechner called this new field of psychological measurement PSYCHOPHYSICS: the interface of psychology and physics, of the mental and the physical.

A further foundational issue concerns the speed of human thought. In the nineteenth century, many believed that thought was either instantaneous or else so fast that it could never be measured. But HERMANN VON HELMHOLTZ, a physicist and physiologist, succeeded in measuring the speed at which signals are conducted through the nervous system.
He first experimented on frogs by applying an electric current to the top of a frog’s leg and measuring the time it took the muscle at the end to twitch in response. Later he used a similar technique with humans, touching various parts of a person’s body and measuring the time taken to press a button in response. The response time increased with the distance of the stimulus (i.e., the point of the touch) from the finger that pressed the button, in proportion to the length of the neural path over which the signal had to travel. Helmholtz’s estimate of the speed of nerve signals was close to modern estimates: roughly 100 meters per second for large nerve fibers. This transmission rate is surprisingly slow, vastly slower than the speed of electricity through a wire. Because our brains are composed of neurons, our thoughts cannot be generated any faster than the speed at which neurons communicate with each other. It follows that the speed of thought is neither instantaneous nor immeasurable.

Helmholtz also pioneered the experimental study of vision, formulating a theory of color vision that remains highly influential today. He argued forcefully against the commonsensical idea that perception is simply a matter of somehow “copying” sensory input into the brain. Rather, he pointed out that even the most basic aspects of perception require major acts of construction by the nervous system. For example, it is possible for two different objects (a large object far away and a small object nearby) to create precisely the same image on the retinas of a viewer’s eyes. Yet normally the viewer will correctly perceive the one object as being larger, but further away, than the other. The brain somehow manages to unconsciously perform some basic geometrical calculations. The brain, Helmholtz argued, must construct this unified view by a process of “unconscious inference,” a process akin to reasoning without awareness.
Helmholtz’s insight was that the “reality” we perceive is not simply a copy of the external world, but rather the product of the constructive activities of the brain.

Another philosopher, HERMANN EBBINGHAUS, who was influenced by Fechner’s ideas about psychophysical measurements, developed experimental methods tailored to the study of human memory. Using himself as a subject, Ebbinghaus studied memory for nonsense syllables (consonant-vowel-consonant combinations such as “zad,” “bim,” and “sif”). He measured how long it took to commit lists of nonsense syllables to memory, the effects of repetition on how well he could remember the syllables later, and the rate of forgetting as a function of the passage of time. Ebbinghaus made several fundamental discoveries about memory, including the typical form of the “forgetting curve”: the gradual, negatively accelerated decline in the proportion of items that can be recalled as a function of time. Like Weber, Fechner, and Helmholtz, Ebbinghaus provided evidence that it is indeed possible to measure mental phenomena by objective experimental procedures.

Many key ideas about possible components of cognition were systematically presented by the American philosopher WILLIAM JAMES in the first great psychology textbook, Principles of Psychology, published in 1890. His monumental work included topics that remain central in psychology, including brain function, perception, attention, voluntary movement, habit, memory, reasoning, the SELF, and hypnosis. James discussed the nature of “will,” or mental effort, which remains one of the basic aspects of attention. He also drew a distinction between different memory systems: primary memory, which roughly corresponds to the current contents of consciousness, and secondary memory, which comprises the vast store of knowledge of which we are not conscious at any single time, yet continually draw upon.
Primary memory is closely related to what we now term active, short-term, or WORKING MEMORY, while secondary memory corresponds to what is usually called long-term memory.

James emphasized the adaptive nature of cognition: the fact that perception, memory, and reasoning operate not simply for their own sake, but to allow us to survive and prosper in our physical and social world. Humans evolved as organisms skilled in tool use and in social organization, and it is possible (albeit a matter of controversy) that much of our cognitive apparatus evolved to serve these basic functions (see EVOLUTIONARY PSYCHOLOGY). Thus, human cognition involves intricate systems for MOTOR CONTROL and MOTOR LEARNING; the capacity to understand that other people have minds, with intentions and goals that may lead them to help or hinder us; and the ability to recognize and remember individual persons and their characteristics. Furthermore, James (1890:8) recognized that the hallmark of an intelligent being is its ability to link ends with means, that is, to select actions that will achieve goals: “The pursuance of future ends and the choice of means for their attainment are thus the mark and criterion of the presence of mentality in a phenomenon.” This view of goal-directed thinking continues to serve as the foundation of modern work on PROBLEM SOLVING, as reflected in the views of theorists such as ALLEN NEWELL and Herbert Simon.

Another pioneer of psychology was Sigmund Freud, the founder of psychoanalysis, whose theoretical ideas about cognition and consciousness anticipated many key aspects of the modern conception of cognition. Freud attacked the idea that the “self” has some special status as a unitary entity that somehow governs our thought and action.
Modern cognitive psychologists also reject (though for different reasons) explanations of intelligent behavior that depend upon postulating a “homunculus,” that is, an internal mental entity endowed with all the intelligence we are trying to explain. Behavior is viewed not as the product of a unitary self or homunculus, but as the joint product of multiple interacting subsystems. Freud argued that the “ego” (the information-processing system that modulates various motivational forces) is not a unitary entity, but rather a complex system that includes attentional bottlenecks, multiple memory stores, and different ways of representing information (e.g., language, imagery, and physiognomic codes, or “body language”). Furthermore, as Freud also emphasized, much of information processing takes place at an unconscious level. We are aware of only a small portion of our overall mental life, a tip of the cognitive iceberg. For example, operating beneath the level of awareness are attentional “gates” that open or close to selectively attend to portions of the information that reaches our senses, memory stores that hold information for very brief periods of time, and inaccessible memories that we carry with us always but might not retrieve for years at a time.

Given the breadth and depth of the contributions of the nineteenth-century pioneers to what would eventually become cognitive science, it is ironic that early in the twentieth century the study of cognition went into a steep decline. Particularly in the United States, psychology in the first half of the century came to be dominated by BEHAVIORISM, an approach characterized by the rejection of theories that depended on “mentalistic” concepts such as goals, intentions, or plans.
The decline of cognitive psychology was in part due to the fact that a great deal of psychological research had moved away from the objective measurement techniques developed by Fechner, Helmholtz, Ebbinghaus, and others, and instead gave primacy to the method of INTROSPECTION, promoted by WILHELM WUNDT, in which trained observers analyzed their own thought processes as they performed various cognitive tasks. Not surprisingly, given what is now known about how expectancies influence the way we think, introspectionists tended to find themselves thinking in more or less the manner to which they were theoretically predisposed. For example, researchers who believed thinking always depended on IMAGERY usually found themselves imaging, whereas those who did not subscribe to such a theory were far more likely to report “imageless thought.” The apparent subjectivity and inconstancy of the introspective method encouraged charges that all cognitive theories (rather than simply the method itself, as might seem more reasonable) were “unscientific.” Cognitive theories were overshadowed by the behaviorist theories of such leading figures as John Watson, Edward Thorndike, Clark Hull, and B. F. Skinner. Although there were major differences among the behaviorists in the degree to which they actually avoided explanations based on assumptions about unobservable mental states (e.g., Hull postulated such states rather freely, whereas Watson was adamant that they were scientifically illicit), none supported the range of cognitive ideas advanced in the nineteenth century.

Cognitive psychology did not simply die out during the era of behaviorism. Working within the behaviorist tradition, Edward Tolman pursued such cognitive issues as how animals represented spatial information internally as COGNITIVE MAPS of their environment. European psychologists were far less captivated with behaviorism than were Americans.
In England, Sir FREDERIC BARTLETT analyzed the systematic distortions that people exhibit when trying to remember stories about unfamiliar events, and introduced the concept of “schema” (see SCHEMATA) as a mental representation that captures the systematic structural relations in categories of experience. In Soviet Russia, the neuropsychologist Aleksandr LURIA provided a detailed portrait of links between cognitive functions and the operation of specific regions of the brain. Another Russian, LEV VYGOTSKY, developed a sociohistorical approach to cognitive development that emphasized the way in which development is constructed through social interaction, cultural practices, and the internalization of cognitive tools. Vygotsky emphasized social interaction through language in the development of children’s concepts. The Swiss psychologist JEAN PIAGET spent decades refining a theory of cognitive development. Piaget’s theory emphasizes milestones in the child’s development including decentration, the ability to perform operations on concrete objects, and finally the ability to perform operations on thoughts and beliefs. Given its emphasis on logical thought, Piaget’s theory is closely related to SCIENTIFIC THINKING AND ITS DEVELOPMENT.

In addition, the great German tradition in psychology, which had produced so many of the nineteenth-century pioneers, gave rise to a new cognitive movement in the early twentieth century: GESTALT PSYCHOLOGY. The German word Gestalt translates roughly as “form,” and the Gestalt psychologists emphasized that the whole form is something different from the mere sum of its parts, due to emergent properties that arise as new relations are created. Gestalt psychology was in some ways an extension of Helmholtz’s constructivist ideas, and the greatest contributions of this intellectual movement were in the area of GESTALT PERCEPTION.
Where the behaviorists insisted that psychology was simply the study of how objective stimuli come to elicit objective responses, the Gestaltists pointed to simple demonstrations casting doubt on the idea that “objective” stimuli (that is, stimuli perceived in a way that can be described strictly in terms of the sensory input) even exist. Figure 1 illustrates a famous Gestalt example of the constructive nature of perception, the ambiguous Necker cube. Although this figure is simply a flat line drawing, we immediately perceive it as a three-dimensional cube. Moreover, if you look carefully, you will see that the figure can actually be seen as either of two different three-dimensional cubes. The same objective stimulus (the two-dimensional line drawing) gives rise to two distinct three-dimensional perceptions.

Although many of the major contributions by key Gestalt figures such as Max Wertheimer were in the area of perception, their central ideas were extended to memory and problem solving as well, through the work of people such as Wolfgang Köhler and Karl Duncker. Indeed, one of the central tenets of Gestalt psychology was that high-level thinking is based on principles similar to those that govern basic perception. As we do in everyday language, Gestalt psychologists spoke of suddenly “seeing” the solution to a problem, often after “looking at it” in a different way and achieving a new “insight.” In all the areas in which they worked, the Gestalt idea of “a whole different from the sum of parts” was based on the fundamental fact that organized configurations are based not simply on individual elements, but also on the relations between those elements.
Just as H2O is not simply two hydrogen atoms and one oxygen atom, but also a particular spatial organization of these elements into a configuration that makes a molecule of water, so too “squareness” is more than four lines: it crucially depends on the way the lines are related to one another to make four right angles. Furthermore, relations can take on a “life of their own,” separable from any particular set of elements. For example, we can take a tune, move it to a different key so that all the notes are changed, and still immediately recognize it as the “same” tune as long as the relations among the notes are preserved. A focus on relations calls attention to the centrality of the BINDING PROBLEM, which involves the issue of how elements are systematically organized to fill relational roles. Modern work on such topics as ANALOGY and SIMILARITY emphasizes the crucial role of relations in cognition.

Modern cognitive psychology emerged in the second half of the twentieth century. The “cognitive revolution” of the 1950s and 1960s involved not only psychology but also the allied disciplines that now contribute to cognitive science. In the 1940s the Canadian psychologist DONALD HEBB began to draw connections between cognitive processes and neural mechanisms, anticipating modern cognitive neuroscience. During World War II, many experimental psychologists (including JAMES GIBSON) were confronted with such pressing military problems as finding ways to select good pilots and train radar operators, and it turned out that the then-dominant stimulus-response theories simply had little to offer in the way of solutions. More detailed process models of human information processing were needed. After the war, DONALD BROADBENT in England developed the first such detailed model of attention.
Even more importantly, Broadbent helped develop and popularize a wide range of experimental tasks in which an observer’s attention is carefully controlled by having him or her perform some task, such as listening to a taped message for a particular word, and then precisely measuring how quickly responses can be made and what can be remembered. In the United States, William K. Estes added to the mathematical tools available for theory building and data analysis, and Saul Sternberg developed a method for decomposing reaction times into component processes using a simple recognition task.

Figure 1.

Meanwhile, the birth of computer science provided further conceptual tools. Strict behaviorists had denounced models of internal mental processes as unscientific. However, the modern digital computer provided a clear example of a device that took inputs, fed them through a complex series of internal procedures, and then produced outputs. As well as providing concrete examples of what an information-processing device could be, computers made possible the beginnings of artificial intelligence: the construction of computer programs designed to perform tasks that require intelligence, such as playing chess, understanding stories, or diagnosing diseases. Herbert Simon (1978 Nobel Laureate in Economics) and Allen Newell were leaders in building close ties between artificial intelligence and the new cognitive psychology. It was also recognized that actual computers represent only a small class of a much larger set of theoretically possible computing devices, which had been described back in the 1930s by the brilliant mathematician ALAN TURING. Indeed, it was now possible to view the brain itself as a biological computer, and to use various real and possible computing devices as models of human cognition.

Another key influence on modern cognitive psychology came from the field of linguistics.
In the late 1950s work by the young linguist Noam Chomsky radically changed conceptions of the nature of human language by demonstrating that language could not be learned or understood by merely associating adjacent words, but rather required computations on abstract structures that existed in the minds of the speaker and listener.

The collective impact of this work in the mid-twentieth century was to provide a seminal idea that became the foundation of cognitive psychology and also cognitive science in general: the COMPUTATIONAL THEORY OF MIND, according to which human cognition is based on mental procedures that operate on abstract mental representations. The nature of the COGNITIVE ARCHITECTURE has been controversial, with competing proposals that include PRODUCTION SYSTEMS and NEURAL NETWORKS. In particular, there has been disagreement as to whether procedures and representations are inherently separable or whether procedures actually embody representations, and whether some mental representations are abstract and amodal, rather than tied to specific perceptual systems. Nonetheless, the basic conception of biological information processing as some form of computation continues to guide psychological theories of the representation and processing of information.
See also ANALOGY; BARTLETT, FREDERIC CHARLES; BEHAVIORISM; BINDING PROBLEM; BROADBENT, DONALD; COGNITIVE ARCHITECTURE; COGNITIVE MAPS; COMPUTATIONAL THEORY OF MIND; EBBINGHAUS, HERMANN; EVOLUTIONARY PSYCHOLOGY; GESTALT PERCEPTION; GESTALT PSYCHOLOGY; GIBSON, JAMES; HEBB, DONALD; HELMHOLTZ, HERMANN VON; IMAGERY; INTROSPECTION; JAMES, WILLIAM; LURIA, ALEKSANDR ROMANOVICH; MOTOR CONTROL; MOTOR LEARNING; NEURAL NETWORKS; NEWELL, ALLEN; PIAGET, JEAN; PROBLEM SOLVING; PRODUCTION SYSTEMS; PSYCHOPHYSICS; SCHEMATA; SCIENTIFIC THINKING AND ITS DEVELOPMENT; SELF; SIMILARITY; TURING, ALAN; VYGOTSKY, LEV; WORKING MEMORY; WUNDT, WILHELM

3 The Science of Information Processing

In broad strokes, an intelligent organism operates in a perception-action cycle (Neisser 1967), taking in sensory information from the environment, performing internal computations on it, and using the results of the computation to guide the selection and execution of goal-directed actions. The initial sensory input is provided by separate sensory systems, including smell, taste, haptic perception, and audition. The most sophisticated sensory system in primates is vision (see MID-LEVEL VISION; HIGH-LEVEL VISION), which includes complex specialized subsystems for DEPTH PERCEPTION, SHAPE PERCEPTION, LIGHTNESS PERCEPTION, and COLOR VISION.

The interpretation of sensory inputs begins with FEATURE DETECTORS that respond selectively to relatively elementary aspects of the stimulus (e.g., lines at specific orientations in the visual field, or phonetic cues in an acoustic speech signal). Some basic properties of the visual system result in systematic misperceptions, or ILLUSIONS. TOP-DOWN PROCESSING IN VISION serves to integrate the local visual input with the broader context in which it occurs, including prior knowledge stored in memory.
Theorists working in the tradition of Gibson emphasize that a great deal of visual information may be provided by higher-order features that become available to a perceiver moving freely in a natural environment, rather than passively viewing a static image (see ECOLOGICAL PSYCHOLOGY). In their natural context, both perception and action are guided by the AFFORDANCES of the environment: properties of objects that enable certain uses (e.g., the elongated shape of a stick may afford striking an object otherwise out of reach).

Across all the sensory systems, psychophysical methods are used to investigate the quantitative functions relating physical inputs received by sensory systems to subjective experience (e.g., the relation between luminance and perceived brightness, or between physical and subjective weight). SIGNAL DETECTION THEORY provides a statistical method for measuring how accurately observers can distinguish a signal from noise under conditions of uncertainty (i.e., with limited viewing time or highly similar alternatives) in a way that separates the signal strength received from possible response bias.

In addition to perceiving sensory information about objects at locations in space, animals perceive and record information about time (see TIME IN THE MIND). Knowledge about both space and time must be integrated to provide the capability for animal and HUMAN NAVIGATION in the environment. Humans and other animals are capable of forming sophisticated representations of spatial relations integrated as COGNITIVE MAPS. Some more central mental representations appear to be closely tied to perceptual systems. Humans use various forms of imagery based on visual, auditory, and other perceptual systems to perform internal mental processes such as MENTAL ROTATION. The close connection between PICTORIAL ART AND VISION also reflects the links between perceptual systems and more abstract cognition.
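The separation of sensitivity from response bias that signal detection theory provides can be illustrated with a short calculation. The following is a minimal sketch in Python, assuming the common equal-variance Gaussian model (an assumption not stated in the text above); the function name sdt_measures and its parameters are hypothetical and chosen only for illustration. Sensitivity (d') is the difference between the z-transformed hit rate and false-alarm rate, and the criterion (c) summarizes response bias.

```python
from statistics import NormalDist


def sdt_measures(hit_rate, false_alarm_rate):
    """Return (d_prime, criterion) under an assumed equal-variance Gaussian model."""
    for rate in (hit_rate, false_alarm_rate):
        if not 0.0 < rate < 1.0:
            # z-scores are undefined at exactly 0 or 1
            raise ValueError("rates must lie strictly between 0 and 1")
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    z_hit = z(hit_rate)
    z_fa = z(false_alarm_rate)
    d_prime = z_hit - z_fa             # sensitivity: separation of signal from noise
    criterion = -0.5 * (z_hit + z_fa)  # response bias: negative means a liberal "yes" tendency
    return d_prime, criterion


# Hypothetical observer: 80% hits, 20% false alarms
d_prime, criterion = sdt_measures(0.80, 0.20)
print(f"d' = {d_prime:.2f}, c = {criterion:.2f}")
```

Under these assumptions, two observers with the same d' but different criteria would be judged equally sensitive even though one says "yes" more readily than the other, which is the distinction between signal strength and response bias drawn above.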
A fundamental property of biological information processing is that it is capacity-limited and therefore necessarily selective. Beginning with the seminal work of Broadbent, a great deal of work in cognitive psychology has focused on the role of attention in guiding information processing. Attention operates selectively to determine what information is received by the senses, as in the case of EYE MOVEMENTS AND VISUAL ATTENTION, and also operates to direct more central information processing, including the operation of memory. The degree to which information requires active attention or memory resources varies, decreasing with the AUTOMATICITY of the required processing.

Modern conceptions of memory maintain some version of William James’s basic distinction between primary and secondary memory. Primary memory is now usually called WORKING MEMORY, which is itself subdivided into multiple stores involving specific forms of representation, especially phonological and visuospatial codes. Working memory also includes a central executive, which provides attentional resources for strategic management of the cognitive processes involved in problem solving and other varieties of deliberative thought. Secondary or long-term memory is also viewed as involving distinct subsystems, particularly EPISODIC VS. SEMANTIC MEMORY. Each of these subsystems appears to be specialized to perform one of the two basic functions of long-term memory. One function is to store individuated representations of “what happened when” in specific contexts (episodic memory); a second function is to extract and store generalized representations of “the usual kind of thing” (semantic memory). Another key distinction, related to different types of memory measures, is between IMPLICIT VS. EXPLICIT MEMORY. In explicit tests (typically recall or recognition tests), the person is aware of the requirement to access memory.
In contrast, implicit tests (such as completing a word stem, or generating instances of a category) make no reference to any particular memory episode. Nonetheless, the influence of prior experiences may be revealed by the priming of particular responses (e.g., if the word “crocus” has recently been studied, the person is more likely to generate “crocus” when asked to list flowers, even if they do not explicitly remember having studied the word). There is evidence that implicit and explicit knowledge are based on separable neural systems. In particular, forms of amnesia caused by damage to the hippocampus and related structures typically impair explicit memory for episodes, but not implicit memory as revealed by priming measures.

A striking part of human cognition is the ability to speak and comprehend language. The psychological study of language, or psycholinguistics, has a close relationship to work in linguistics and on LANGUAGE ACQUISITION. The complex formal properties of language, together with its apparent ease of acquisition by very young children, have made it the focus of debates about the extent and nature of NATIVISM in cognition. COMPUTATIONAL PSYCHOLINGUISTICS is concerned with modeling the complex processes involved in language use. In modern cultures that have achieved LITERACY with the introduction of written forms of language, the process of READING lies at the interface of psycholinguistics, perception, and memory retrieval. The intimate relationship between language and thought, and between language and human concepts, is widely recognized but still poorly understood. The use of METAPHOR in language is related to other symbolic processes in human cognition, particularly ANALOGY and CATEGORIZATION.

One of the most fundamental aspects of biological intelligence is the capacity to adaptively alter behavior.
It has been clear at least from the time of William James that the adaptiveness of human behavior and the ability to achieve EXPERTISE in diverse domains are not generally the direct product of innate predispositions, but rather the result of adaptive problem solving and LEARNING SYSTEMS that operate over the lifespan. Both production systems and neural networks provide computational models of some aspects of learning, although no model has captured anything like the full range of human learning capacities. Humans as well as some other animals are able to learn by IMITATION, for example, translating visual information about the behavior of others into motor routines that allow the observer/imitator to produce comparable behavior. Many animal species are able to acquire expectancies about the environment and the consequences of the individual’s actions on the basis of CONDITIONING, which enables learning of contingencies among events and actions.

Conditioning appears to be a primitive form of causal induction, the process by which humans and other animals learn about the cause-effect structure of the world. Both causal knowledge and similarity relations contribute to the process of categorization, which leads to the development of categories and concepts that serve to organize knowledge. People act as if they assume the external appearances of category members are caused by hidden (and often unknown) internal properties (e.g., the appearance of an individual dog may be attributed to its internal biology), an assumption sometimes termed psychological ESSENTIALISM.

There are important developmental influences that lead to CONCEPTUAL CHANGE over childhood. These developmental aspects of cognition are particularly important in understanding SCIENTIFIC THINKING AND ITS DEVELOPMENT. Without formal schooling, children and adults arrive at systematic beliefs that comprise NAIVE MATHEMATICS and NAIVE PHYSICS.
Some of these beliefs provide the foundations for learning mathematics and physics in formal EDUCATION, but some are misconceptions that can impede learning these topics in school (see also AI AND EDUCATION). Young children are prone to ANIMISM, attributing properties of people and other animals to plants and nonliving things. Rather than being an aberrant form of early thought, animism may be an early manifestation of the use of ANALOGY to make inferences and learn new cognitive structures. Analogy is the process used to find systematic structural correspondences between a familiar, well-understood situation and an unfamiliar, poorly understood one, and then to use the correspondences to draw plausible inferences about the less familiar case. Analogy, along with hypothesis testing and evaluation of competing explanations, plays a role in the discovery of new regularities and theories in science.

In its more complex forms, learning is intimately connected to thinking and reasoning. Humans are not only able to think, but also to think about their own cognitive processes, resulting in METACOGNITION. They can also form higher-level representations, termed METAREPRESENTATION. There are major individual differences in intelligence as assessed by tasks that require abstract thinking. Similarly, people differ in their CREATIVITY in finding solutions to problems. Various neural disorders, such as forms of MENTAL RETARDATION and AUTISM, can impair or radically alter normal thinking abilities. Some aspects of thinking are vulnerable to disruption in later life due to the links between AGING AND COGNITION.

Until the last few decades, the psychology of DEDUCTIVE REASONING was dominated by the view that human thinking is governed by formal rules akin to those used in LOGIC. Although some theorists continue to argue for a role for formal, content-free rules in reasoning, others have focused on the importance of content-specific rules.
For example, people appear to have specialized procedures for reasoning about broad classes of pragmatically important tasks, such as understanding social relations or causal relations among events. Such pragmatic reasoning schemas (Cheng and Holyoak 1985) enable people to derive useful inferences in contexts related to important types of recurring goals. In addition, both deductive and inductive inferences may sometimes be made using various types of MENTAL MODELS, in which specific possible cases are represented and manipulated (see also CASE-BASED REASONING AND ANALOGY).

Much of human inference depends not on deduction, but on inductive PROBABILISTIC REASONING under conditions of UNCERTAINTY. Work by researchers such as AMOS TVERSKY and Daniel Kahneman has shown that everyday inductive reasoning and decision making is often based on simple JUDGMENT HEURISTICS related to ease of memory retrieval (the availability heuristic) and degree of similarity (the representativeness heuristic). Although judgment heuristics are often able to produce fast and accurate responses, they can sometimes lead to errors of prediction (e.g., conflating the subjective ease of remembering instances of a class of events with their objective frequency in the world).

More generally, the impressive power of human information processing has apparent limits. People all too often take actions that will not achieve their intended ends, and pursue short-term goals that defeat their own long-term interests. Some of these mistakes arise from motivational biases, and others from computational limitations that constrain human attention, memory, and reasoning processes.
Although human cognition is fundamentally adaptive, we have no reason to suppose that “all’s for the best in this best of all possible minds.”

See also AFFORDANCES; AGING AND COGNITION; AI AND EDUCATION; ANALOGY; ANIMISM; AUTISM; AUTOMATICITY; CASE-BASED REASONING AND ANALOGY; CATEGORIZATION; COGNITIVE MAPS; COLOR VISION; CONCEPTUAL CHANGE; CONDITIONING; CREATIVITY; DEDUCTIVE REASONING; DEPTH PERCEPTION; ECOLOGICAL PSYCHOLOGY; EDUCATION; EPISODIC VS. SEMANTIC MEMORY; ESSENTIALISM; EXPERTISE; EYE MOVEMENTS AND VISUAL ATTENTION; FEATURE DETECTORS; HIGH-LEVEL VISION; HUMAN NAVIGATION; ILLUSIONS; IMITATION; IMPLICIT VS. EXPLICIT MEMORY; JUDGMENT HEURISTICS; LANGUAGE ACQUISITION; LEARNING SYSTEMS; LIGHTNESS PERCEPTION; LITERACY; LOGIC; MENTAL MODELS; MENTAL RETARDATION; MENTAL ROTATION; METACOGNITION; METAPHOR; METAREPRESENTATION; MID-LEVEL VISION; NAIVE MATHEMATICS; NAIVE PHYSICS; NATIVISM; PICTORIAL ART AND VISION; PROBABILISTIC REASONING; READING; SCIENTIFIC THINKING AND ITS DEVELOPMENT; SHAPE PERCEPTION; SIGNAL DETECTION THEORY; TIME IN THE MIND; TOP-DOWN PROCESSING IN VISION; TVERSKY, AMOS; UNCERTAINTY; WORKING MEMORY

References

Cheng, P. W., and K. J. Holyoak. (1985). Pragmatic reasoning schemas. Cognitive Psychology 17: 391–416.
James, W. (1890). The Principles of Psychology. New York: Dover.
Neisser, U. (1967). Cognitive Psychology. Englewood Cliffs, NJ: Prentice-Hall.

Further Readings

Anderson, J. R. (1995). Cognitive Psychology and Its Implications. 4th ed. San Francisco: W. H. Freeman.
Baddeley, A. D. (1997). Human Memory: Theory and Practice. 2nd ed. Hove, Sussex: Psychology Press.
Evans, J., S. E. Newstead, and R. M. J. Byrne. (1993). Human Reasoning. Mahwah, NJ: Erlbaum.
Gallistel, C. R. (1990). The Organization of Learning. Cambridge, MA: MIT Press.
Gazzaniga, M. S. (1995). The Cognitive Neurosciences. Cambridge, MA: MIT Press.
Gibson, J. J. (1979). The Ecological Approach to Visual Perception. Boston: Houghton-Mifflin.
Gregory, R. L. (1997).
Eye and Brain: The Psychology of Seeing. 5th ed. Princeton, NJ: Princeton University Press.
Holyoak, K. J., and P. Thagard. (1995). Mental Leaps: Analogy in Creative Thought. Cambridge, MA: MIT Press.
James, W. (1890). Principles of Psychology. New York: Dover.
Kahneman, D., P. Slovic, and A. Tversky. (1982). Judgment Under Uncertainty: Heuristics and Biases. New York: Cambridge University Press.
Keil, F. C. (1989). Concepts, Kinds, and Cognitive Development. Cambridge, MA: MIT Press.
Kosslyn, S. M. (1994). Image and Brain: The Resolution of the Imagery Debate. Cambridge, MA: MIT Press.
Newell, A., and H. A. Simon. (1972). Human Problem Solving. Englewood Cliffs, NJ: Prentice-Hall.
Pashler, H. (1997). The Psychology of Attention. Cambridge, MA: MIT Press.
Pinker, S. (1994). The Language Instinct. New York: William Morrow.
Reisberg, D. (1997). Cognition: Exploring the Science of the Mind. New York: Norton.
Rumelhart, D. E., J. L. McClelland, and PDP Research Group. (1986). Parallel Distributed Processing: Explorations in the Microstructure of Cognition. 2 vols. Cambridge, MA: MIT Press.
Smith, E. E., and D. L. Medin. (1981). Categories and Concepts. Cambridge, MA: Harvard University Press.
Sperber, D., D. Premack, and A. J. Premack. (1995). Causal Cognition: A Multidisciplinary Debate. Oxford: Clarendon Press.
Tarpy, R. M. (1997). Contemporary Learning Theory and Research. New York: McGraw Hill.
Tomasello, M., and J. Call. (1997). Primate Cognition. New York: Oxford University Press.

Neurosciences
Thomas D. Albright and Helen J. Neville

1 Cognitive Neuroscience

The term alone suggests a field of study that is pregnant and full of promise. It is a large field of study, uniting concepts and techniques from many disciplines, and its boundaries are rangy and often loosely defined.
At the heart of cognitive neuroscience, however, lies the fundamental question of knowledge and its representation by the brain—a relationship characterized not inappropriately by WILLIAM JAMES (1842–1910) as “the most mysterious thing in the world” (James 1890 vol. 1, 216). Cognitive neuroscience is thus a science of information processing. Viewed as such, one can identify key experimental questions and classical areas of study: How is information acquired (sensation), interpreted to confer meaning (perception and recognition), stored or modified (learning and memory), used to ruminate (thinking and consciousness), to predict the future state of the environment and the consequences of action (decision making), to guide behavior (motor control), and to communicate (language)? These questions are, of course, foundational in cognitive science generally, and it is instructive to consider what distinguishes cognitive neuroscience from cognitive science and psychology, on the one hand, and the larger field of neuroscience, on the other. The former distinction is perhaps the fuzzier, depending heavily as it does upon how one defines cognitive science. A neurobiologist might adopt the progressive (or naive) view that the workings of the brain are the subject matter of both, and the distinction is therefore moot. But this view evidently has not prevailed (witness the fact that neuroscience is but one of the subdivisions of this volume); indeed the field of cognitive science was founded upon and continues to press the distinction between software (the content of cognition) and hardware (the physical stuff, for example, the brain) upon which cognitive processes are implemented. Much has been written on this topic, and one who pokes at the distinction too hard is likely to unshelve as much dusty political discourse as true science.
In any case, for present purposes, we will consider both the biological hardware and the extent to which it constrains the software, and in doing so we will discuss answers to the questions of cognitive science that are rooted in the elements of biological systems.

The relationship between cognitive neuroscience and the umbrella of modern neuroscience is more straightforward and less embattled. While the former is clearly a subdivision of the latter, the questions of cognitive neuroscience lie at the root of much of neuroscience’s turf. Where distinctions are often made, they arise from the fact that cognitive neuroscience is a functional neuroscience—particular structures and signals of the nervous system are of interest inasmuch as they can be used to explain cognitive functions.

There being many levels of explanation in biological systems—ranging from cellular and molecular events to complex behavior—a key challenge of the field of cognitive neuroscience has been to identify the relationships between different levels and the train of causality. In certain limited domains, this challenge has met with spectacular success; in others, it is clear that the relevant concepts have only begun to take shape and the necessary experimental tools are far behind. Using examples drawn from well-developed areas of research, such as vision, memory, and language, we illustrate concepts, experimental approaches, and general principles that have emerged—and, more specifically, how the work has answered many of the information processing questions identified above. Our contemporary view of cognitive neuroscience owes much to the heights attained by our predecessors; to appreciate the state of this field fully, it is useful to begin with a consideration of how we reached this vantage point.

See also JAMES, WILLIAM

2 Origins of Cognitive Neuroscience

Legend has it that the term “cognitive neuroscience” was coined by George A.
Miller—the father of modern cognitive psychology—in the late 1970s over cocktails with Michael Gazzaniga at the Rockefeller University Faculty Club. That engaging tidbit of folklore nevertheless belies the ancient history of this pursuit. Indeed, identification of the biological structures and events that account for our ability to acquire, store, and utilize knowledge of the world was one of the earliest goals of empirical science. The emergence of the interdisciplinary field of cognitive neuroscience that we know today, which lies squarely at the heart of twentieth-century neuroscience, can thus be traced from a common stream in antiquity, with many tributaries converging in time as new concepts and techniques have evolved (Boring 1950).

Localization of Function

The focal point of the earliest debates on the subject—and a topic that has remained a centerpiece of cognitive neuroscience to the present day—is localization of the material source of psychological functions. With Aristotle as a notable exception (he thought the heart more important), scholars of antiquity rightly identified the brain as the seat of intellect. Relatively little effort was made to localize specific mental functions to particular brain regions until the latter part of the eighteenth century, when the anatomist Franz Josef Gall (1758–1828) unleashed the science of phrenology. Although flawed in its premises, and touted by charlatans, phrenology focused attention on the CEREBRAL CORTEX and brought the topic of localization of function to the forefront of an emerging nineteenth-century physiology and psychology of mind (Zola-Morgan 1995). The subsequent HISTORY OF CORTICAL LOCALIZATION of function (Gross 1994a) is filled with colorful figures and weighty confrontations between localizationists and functional holists (antilocalizationists).
Among the longest shadows is that cast by PAUL BROCA (1824–1880), who in 1861 reported that damage to a “speech center” in the left frontal lobe resulted in loss of speech function, and was thus responsible for the first widely cited evidence for localization of function in the cerebral cortex. An important development of a quite different nature came in the form of the Bell-Magendie law, discovered independently in the early nineteenth century by the physiologists Sir Charles Bell (1774–1842) and François Magendie (1783–1855). This law identified the fact that sensory and motor nerve fibers course through different roots (dorsal and ventral, respectively) of the spinal cord. Although far from the heavily contested turf of the cerebral cortex, the concept of nerve specificity paved the way for the publication in 1838 by Johannes Muller (1801–1858) of the law of specific nerve energies, which included among its principles the proposal that nerves carrying different types of sensory information terminate in distinct brain loci, perhaps in the cerebral cortex.

Persuasive though the accumulated evidence seemed at the dawn of the twentieth century, the debate between localizationists and antilocalizationists raged on for another three decades. By this time the chief experimental tool had become the “lesion method,” through which the functions of specific brain regions are inferred from the behavioral or psychological consequences of loss of the tissue in question (either by clinical causes or deliberate experimental intervention). A central player during this period was the psychologist KARL SPENCER LASHLEY (1890–1958)—often inaccurately characterized as professing strong antilocalizationist beliefs, but best known for the concept of equipotentiality and the law of mass action of brain function.
Lashley’s descendants include several generations of flag bearers for the localizationist front—Carlyle Jacobsen, John Fulton, Karl Pribram, Mortimer Mishkin, Lawrence Weiskrantz, and Charles Gross, among others—who established footholds for our present understanding of the cognitive functions of the frontal and temporal lobes.

These later efforts to localize cognitive functions using the lesion method were complemented by studies of the effects of electrical stimulation of the human brain on psychological states. The use of stimulation as a probe for cognitive function followed its more pragmatic application as a functional brain mapping procedure executed in preparation for surgical treatment of intractable epilepsy. The neurosurgeon WILDER PENFIELD (1891–1976) pioneered this approach in the 1930s at the legendary Montreal Neurological Institute and, with colleagues Herbert Jasper and Brenda Milner, subsequently began to identify specific cortical substrates of language, memory, emotion, and perception.

The years of the mid-twentieth century were quarrelsome times for the expanding field of psychology, which up until that time had provided a home for much of the work on localization of brain function. It was from this fractious environment, with inspiration from the many successful experimental applications of the lesion method and a growing link to wartime clinical populations, that the field of neuropsychology emerged—and with it the wagons were drawn up around the first science explicitly devoted to the relationship between brain and cognitive function. Early practitioners included the great Russian neuropsychologist ALEKSANDR ROMANOVICH LURIA (1902–1977) and the American behavioral neurologist NORMAN GESCHWIND (1926–1984), both of whom promoted the localizationist cause with human case studies and focused attention on the role of connections between functionally specific brain regions.
Also among the legendary figures of the early days of neuropsychology was HANS-LUKAS TEUBER (1916–1977). Renowned scientifically for his systematization of clinical neuropsychology, Teuber is perhaps best remembered for having laid the cradle of modern cognitive neuroscience in the 1960s MIT Psychology Department, through his inspired recruitment of an interdisciplinary faculty with a common interest in brain structure and function, and its relationship to complex behavior (Gross 1994b).

See also BROCA, PAUL; CEREBRAL CORTEX; CORTICAL LOCALIZATION, HISTORY OF; GESCHWIND, NORMAN; LASHLEY, KARL SPENCER; LURIA, ALEXANDER ROMANOVICH; PENFIELD, WILDER; TEUBER, HANS-LUKAS

Neuron Doctrine

Although the earliest antecedents of modern cognitive neuroscience focused by necessity on the macroscopic relationship between brain and psychological function, the last 50 years have seen a shift of focus, with major emphasis placed upon local neuronal circuits and the causal link between the activity of individual cells and behavior. The payoff has been astonishing, but one often takes for granted the resolution of much hotly debated turf. The debates in question focused on the elemental units of nervous system structure and function. We accept these matter-of-factly to be specialized cells known as NEURONS, but prior to the development of techniques to visualize cellular processes, their existence was mere conjecture. Thus the two opposing views of the nineteenth century were reticular theory, which held that the tissue of the brain was composed of a vast anastomosing reticulum, and neuron theory, which postulated neurons as differentiated cell types and the fundamental unit of nervous system function. The ideological chasm between these camps ran deep and wide, reinforced by ties to functional holism in the case of reticular theory, and localizationism in the case of neuron theory.
The deadlock broke in 1873 when CAMILLO GOLGI (1843–1926) introduced a method for selective staining of individual neurons using silver nitrate, which permitted their visualization for the first time. (Though this event followed the invention of the microscope by approximately two centuries, it was the Golgi method’s complete staining of a minority of neurons that enabled them to be distinguished from one another.) In consequence, the neuron doctrine was cast, and a grand stage was set for studies of differential cellular morphology, patterns of connectivity between different brain regions, biochemical analysis, and, ultimately, electrophysiological characterization of the behavior of individual neurons, their synaptic interactions, and relationship to cognition.

Undisputedly, the most creative and prolific practitioner of the Golgi technique was the Spanish anatomist SANTIAGO RAMÓN Y CAJAL (1852–1934), who used this new method to characterize the fine structure of the nervous system in exquisite detail. Cajal’s efforts yielded a wealth of data pointing to the existence of discrete neuronal elements. He soon emerged as a leading proponent of the neuron doctrine and subsequently shared the 1906 Nobel Prize in physiology or medicine with Camillo Golgi. (Ironically, Golgi held vociferously to the reticular theory throughout his career.)

Discovery of the existence of independent neurons led naturally to investigations of their means of communication. The fine-scale stereotyped contacts between neurons were evident to Ramón y Cajal, but it was Sir Charles Scott Sherrington (1857–1952) who, at the turn of the century, applied the term “synapses” to label them. The transmission of information across synapses by chemical means was demonstrated experimentally by Otto Loewi (1873–1961) in 1921.
The next several decades saw an explosion of research on the nature of chemical synaptic transmission, including the discovery of countless putative NEUROTRANSMITTERS and their mechanisms of action through receptor activation, as well as a host of revelations regarding the molecular events that are responsible for, and the consequences of, neurotransmitter release. These findings have provided a rich foundation for our present understanding of how neurons compute and store information about the world (see COMPUTING IN SINGLE NEURONS).

The ability to label neurons facilitated two other noteworthy developments bearing on the functional organization of the brain: (1) cytoarchitectonics, which is the use of coherent regional patterns of cellular morphology in the cerebral cortex to identify candidates for functional specificity; and (2) neuroanatomical tract tracing, by which the patterns of connections between and within different brain regions are established. The practice of cytoarchitectonics began at the turn of the century and its utility was espoused most effectively by the anatomists Oscar Vogt (1870–1950), Cecile Vogt (1875–1962), and Korbinian Brodmann (1868–1918). Cytoarchitectonics never fully achieved the functional parcellation that it promised, but clear histological differences across the cerebral cortex, such as those distinguishing primary visual and motor cortices from surrounding tissues, added considerable reinforcement to the localizationist camp.

By contrast, the tracing of neuronal connections between different regions of the brain, which became possible in the late nineteenth century with the development of a variety of specialized histological staining techniques, has been an indispensable source of knowledge regarding the flow of information through the brain and the hierarchy of processing stages.
Recent years have seen the emergence of some remarkable new methods for tracing individual neuronal processes and for identifying the physiological efficacy of specific anatomical connections (Callaway 1998), the value of which is evidenced most beautifully by studies of the CELL TYPES AND CONNECTIONS IN THE VISUAL CORTEX.

The neuron doctrine also paved the way for an understanding of the information represented by neurons via their electrical properties, which has become a cornerstone of cognitive neuroscience in the latter half of the twentieth century. The electrical nature of nervous tissue was well known (yet highly debated) by the beginning of the nineteenth century, following advancement of the theory of “animal electricity” by Luigi Galvani (1737–1798) in 1791. Subsequent work by Emil du Bois-Reymond (1818–1896), Carlo Matteucci (1811–1862), and HERMANN LUDWIG FERDINAND VON HELMHOLTZ (1821–1894) established the spreading nature of electrical potentials in nervous tissue (nerve conduction), the role of the nerve membrane in maintaining and propagating an electrical charge (“wave of negativity”), and the velocity of nervous conduction. It was in the 1920s that Lord Edgar Douglas Adrian (1889–1977), using new cathode ray tube and amplification technology, developed the means to record “action potentials” from single neurons. Through this means, Adrian discovered the “all-or-nothing property” of nerve conduction via action potentials and demonstrated that action potential frequency is the currency of information transfer by neurons. Because of the fundamental importance of these discoveries, Adrian shared the 1932 Nobel Prize in physiology or medicine with Sherrington. Not long afterward, the Finnish physiologist Ragnar Granit developed techniques for recording neuronal activity using electrodes placed on the surface of the skin (Granit discovered the electroretinogram, or ERG, which reflects large-scale neuronal activity in the RETINA).
These techniques became the foundation for non-invasive measurements of brain activity (see ELECTROPHYSIOLOGY, ELECTRIC AND MAGNETIC EVOKED FIELDS), which have played a central role in human cognitive neuroscience over the past 50 years.

With technology for SINGLE-NEURON RECORDING and large-scale electrophysiology safely in hand, the mid-twentieth century saw a rapid proliferation of studies of physiological response properties in the central nervous system. Sensory processing and motor control emerged as natural targets for investigation, and major emphasis was placed on understanding (1) the topographic mapping of the sensory or motor field onto central target zones (such as the retinotopic mapping in primary visual cortex), and (2) the specific sensory or motor events associated with changes in frequency of action potentials. Although some of the earliest and most elegant research was directed at the peripheral auditory system—culminating with Georg von Bekesy’s (1899–1972) physical model of cochlear function and an understanding of its influence on AUDITORY PHYSIOLOGY—it is the visual system that has become the model for physiological investigations of information processing by neurons.

The great era of single-neuron studies of visual processing began in the 1930s with the work of Haldan Keffer Hartline (1903–1983), whose recordings from the eye of the horseshoe crab (Limulus) led to the discovery of neurons that respond when stimulated by light and detect differences in the patterns of illumination (i.e., contrast; Hartline, Wagner, and MacNichol 1952). It was for this revolutionary advance that Hartline became a corecipient of the 1967 Nobel Prize in physiology or medicine (together with Ragnar Granit and George Wald). Single-neuron studies of the mammalian visual system followed in the 1950s, with the work of Steven Kuffler (1913–1980) and Horace Barlow, who recorded from retinal ganglion cells.
This research led to the development of the concept of the center-surround receptive field and highlighted the key role of spatial contrast detection in early vision (Kuffler 1953). Subsequent experiments by Barlow and Jerome Lettvin, among others, led to the discovery of neuronal FEATURE DETECTORS for behaviorally significant sensory inputs. This set the stage for the seminal work of David Hubel and Torsten Wiesel, whose physiological investigations of visual cortex, beginning in the late 1950s, profoundly shaped our understanding of the relationship between neuronal and sensory events (Hubel and Wiesel 1977).

See also AUDITORY PHYSIOLOGY; CAJAL, SANTIAGO RAMÓN Y; COMPUTING IN SINGLE NEURONS; ELECTROPHYSIOLOGY, ELECTRIC AND MAGNETIC EVOKED FIELDS; FEATURE DETECTORS; GOLGI, CAMILLO; HELMHOLTZ, HERMANN LUDWIG FERDINAND VON; NEURON; NEUROTRANSMITTERS; RETINA; SINGLE-NEURON RECORDING; VISUAL CORTEX, CELL TYPES AND CONNECTIONS IN

Sensation, Association, Perception, and Meaning

The rise of neuroscience from its fledgling origins in the nineteenth century was paralleled by the growth of experimental psychology and its embracement of sensation and perception as primary subject matter. The origins of experimental psychology as a scientific discipline coincided, in turn, with the convergence and refinement of views on the nature of the difference between sensation and perception. These views, which began to take their modern shape with the concept of “associationism” in the empiricist philosophy of John Locke (1632–1704), served to focus attention on the extraction of meaning from sensory events and, not surprisingly, lie at the core of much twentieth-century cognitive neuroscience.

The proposition that things perceived cannot reflect directly the material of the external world, but rather depend upon the states of the sense organs and the intermediary nerves, is as old as rational empiricism itself.
Locke’s contribution to this topic was simply that meaning—knowledge of the world, functional relations between sensations, nee perception—is born from an association of “ideas,” of which sensation was the primary source. The concept was developed further by George Berkeley (1685–1753) in his “theory of objects,” according to which a sensation has meaning—that is, a reference to an external material source—only via the context of its relationship to other sensations. This associationism was a principal undercurrent of Scottish and English philosophy for the next two centuries, the concepts refined and the debate further fueled by the writings of James Mill and, most particularly, John Stuart Mill. It was the latter who defined the “laws of association” between elemental sensations, and offered the useful dictum that perception is the belief in the “permanent possibilities of sensation.” By so doing, Mill bridged the gulf between the ephemeral quality of sensations and the permanence of objects and our experience of them: it is the link between present sensations and those known to be possible (from past experience) that allows us to perceive the enduring structural and relational qualities of the external world.

In the mid-nineteenth century the banner of associationism was passed from philosophy of mind to the emerging German school of experimental psychology, which numbered among its masters Gustav Fechner (1801–1887), Helmholtz, WILHELM WUNDT (1832–1920), and the English-American disciple of that tradition Edward Titchener (1867–1927). Fechner’s principal contribution in this domain was the introduction of a systematic scientific methodology to a topic that had before that been solely the province of philosophers and a target of introspection. Fechner's Elements of Psychophysics, published in 1860, founded an “exact science of the functional relationship . . .
between body and mind,” based on the assumption that the relationship between brain and perception could be measured experimentally as the relationship between a stimulus and the sensation it gives rise to. PSYCHOPHYSICS thus provided the new nineteenth-century psychology with tools of a rigorous science and has subsequently become a mainstay of modern cognitive neuroscience. It was during this move toward quantification and systematization that Helmholtz upheld the prevailing associationist view of objects as sensations bound together through experience and memory, and he advanced the concept of unconscious inference to account for the attribution of perceptions to specific environmental causes. Wundt pressed further with the objectification and deconstruction of psychological reality by spelling out the concept—implicit in the manifestoes of his associationist predecessors—of elementism. Although Wundt surely believed that the meaning of sensory events lay in the relationship between them, elementism held that any complex association of sensations—any perception—was reducible to the sensory elements themselves. Titchener echoed the Wundtian view and elaborated upon the critical role of context in the associative extraction of meaning from sensation.

It was largely in response to this doctrine of elementism, its spreading influence, and its corrupt reductionistic account of perceptual experience that GESTALT PSYCHOLOGY was born in the late nineteenth century. In simplest terms, the Gestalt theorists, led by the venerable trio of Max Wertheimer (1880–1943), Wolfgang Kohler (1887–1967), and Kurt Koffka (1886–1941), insisted—and backed up their insistence with innumerable compelling demonstrations—that our phenomenal experience of objects, which includes an appreciation of their meanings and functions, is not generally reducible to a set of elemental sensations and the relationships between them.
Moreover, rather than accepting the received wisdom that perception amounts to an inference about the world drawn from the associations between sensations, the Gestalt theorists held the converse to be true: perception is native experience and efforts to identify the underlying sensory elements are necessarily inferential (Koffka 1935). In spite of other flaws and peculiarities of the broad-ranging Gestalt psychology, this holistic view of perception, its distinction from sensation, and the nature of meaning, has become a central theme of modern cognitive neuroscience.

At the time the early associationist doctrine was being formed, there emerged a physiological counterpart in the form of Johannes Muller’s (1801–1858) law of specific nerve energies, which gave rise in turn to the concept of specific fiber energies, and, ultimately, our twentieth-century receptive fields and feature detectors. Muller’s law followed, intellectually as well as temporally, the Bell-Magendie law of distinct sensory and motor spinal roots, which set a precedent for the concept of specificity of nerve action. Muller’s law was published in his 1838 Handbook of Physiology and consisted of several principles, those most familiar being the specificity of the sensory information (Muller identified five kinds) carried by different nerves and the specificity of the site of termination in the brain (a principle warmly embraced by functional localizationists of the era). For present discussion, the essential principle is that “the immediate objects of the perception of our senses are merely particular states induced in the nerves, and felt as sensations either by the nerves themselves or by the sensorium” (Boring 1950). Muller thus sidestepped the ancient problem of the mind's access to the external world by observing that all it can hope to access is the state of its sensory nerves.
Accordingly, perception of the external world is a consequence of the stable relationship between external stimuli and nerve activation, and—tailing the associationist philosophers—meaning is granted by the associative interactions between nerves carrying different types of information. The concept was elaborated further by Helmholtz and others to address the different submodalities (e.g., color vs. visual distance) and qualities (e.g., red vs. green) of information carried by different fibers, and is a tenet of contemporary sensory neurobiology and cognitive neuroscience. The further implications of associationism for an understanding of the neuronal basis of perception—or, more precisely, of functional knowledge of the world—are profound and, as we shall see, many of the nineteenth-century debates on the topic are being replayed in the courts of modern single-neuron physiology.

See also GESTALT PSYCHOLOGY; PSYCHOPHYSICS; WUNDT, WILHELM

3 Cognitive Neuroscience Today

And so it was from these ancient but rapidly converging lines of inquiry, with the blush still on the cheek of a young cognitive science, that the modern era of cognitive neuroscience began. The field continues to ride a groundswell of optimism borne by new experimental tools and concepts—particularly single-cell electrophysiology, functional brain imaging, molecular genetic manipulations, and neuronal computation—and the access they have offered to neuronal operations underlying cognition. The current state of the field and its promise of riches untapped can be summarized through a survey of the processes involved in the acquisition, storage, and use of information by the nervous system: sensation, perception, decision formation, motor control, memory, language, emotions, and consciousness.

Sensation

We acquire knowledge of the world through our senses. Not surprisingly, sensory processes are among the most thoroughly studied in cognitive neuroscience.
Systematic explorations of these processes originated in two domains. The first consisted of investigations of the physical nature of the sensory stimuli in question, such as the wave nature of light and sound. Sir Isaac Newton’s (1642–1727) Opticks is an exemplar of this approach. The second involved studies of the anatomy of the peripheral sense organs, with attention given to the manner in which anatomical features prepared the physical stimulus for sensory transduction. Von Bekesy’s beautiful studies of the structural features of the cochlea and the relation of those features to the neuronal frequency coding of sound are a classic example (for which he was awarded the 1961 Nobel Prize in physiology or medicine). Our present understanding of the neuronal bases of sensation was further enabled by three major developments: (1) establishment of the neuron doctrine, with attendant anatomical and physiological studies of neurons; (2) systematization of behavioral studies of sensation, made possible through the development of psychophysics; and (3) advancement of sophisticated theories of neuronal function, as embodied by the discipline of COMPUTATIONAL NEUROSCIENCE. For a variety of reasons, vision has emerged as the model for studies of sensory processing, although many fundamental principles of sensory processing are conserved across modalities.

Initial acquisition of information about the world, by all sensory modalities, begins with a process known as transduction, by which forms of physical energy (e.g., photons) alter the electrical state of a sensory neuron. In the case of vision, phototransduction occurs in the RETINA, which is a specialized sheet-like neural network with a regular repeating structure.
In addition to its role in transduction, the retina also functions in the initial detection of spatial and temporal contrast (Enroth-Cugell and Robson 1966; Kaplan and Shapley 1986) and contains specialized neurons that subserve COLOR VISION (see also COLOR, NEUROPHYSIOLOGY OF). The outputs of the retina are carried by a variety of ganglion cell types to several distinct termination sites in the central nervous system. One of the largest projections forms the “geniculostriate” pathway, which is known to be critical for normal visual function in primates. This pathway ascends to the cerebral cortex by way of the lateral geniculate nucleus of the THALAMUS.

The cerebral cortex itself has been a major focus of study during the past forty years of vision research (and sensory research of all types). The entry point for ascending visual information is via primary visual cortex, otherwise known as striate cortex or area V1, which lies on the posterior pole (the occipital lobe) of the cerebral cortex in primates. The pioneering studies of V1 by Hubel and Wiesel (1977) established the form in which visual information is represented by the activity of single neurons and the spatial arrangement of these representations within the cortical mantle (“functional architecture”). With the development of increasingly sophisticated techniques, our understanding of cortical VISUAL ANATOMY AND PHYSIOLOGY, and their relationships to sensory experience, has been refined considerably. Several general principles have emerged:

Receptive Field

This is an operationally defined attribute of a sensory neuron, originally offered by the physiologist Haldan Keffer Hartline, which refers to the portion of the sensory field that, when stimulated, elicits a change in the electrical state of the cell.
More generally, the receptive field is a characterization of the filter properties of a sensory neuron, which are commonly multidimensional and include selectivity for parameters such as spatial position, intensity, and frequency of the physical stimulus. Receptive field characteristics thus contribute to an understanding of the information represented by the brain, and are often cited as evidence for the role of a neuron in specific perceptual and cognitive functions.

Contrast Detection

The elemental sensory operation, that is, one carried out by all receptive fields, is detection of spatial or temporal variation in the incoming signal. It goes without saying that if there are no environmental changes over space and time, then nothing in the input is worthy of detection. Indeed, under such constant conditions sensory neurons quickly adapt. The result is a demonstrable loss of sensation (such as “snow blindness”) that occurs even though there may be energy continually impinging on the receptor surface. On the other hand, contrast along some sensory dimension indicates a change in the environment, which may in turn be a call for action. All sensory modalities have evolved mechanisms for detection of such changes.

Topographic Organization

Representation of spatial patterns of activation within a sensory field is a key feature of visual, auditory, and tactile senses, which serves the behavioral goals of locomotor navigation and object recognition. Such representations are achieved for these modalities, in part, by topographically organized neuronal maps. In the visual system, for example, the retinal projection onto the lateral geniculate nucleus of the thalamus possesses a high degree of spatial order, such that neurons with spatially adjacent receptive fields lie adjacent to one another in the brain. Similar visuotopic maps are seen in primary visual cortex and in several successively higher levels of processing (e.g., Gattass, Sousa, and Covey 1985).
These maps are commonly distorted relative to the sensory field, such that, in the case of vision, the numbers of neurons representing the central portion of the visual field greatly exceed those representing the visual periphery. These variations in “magnification factor” coincide with (and presumably underlie) variations in the observer's resolving power and sensitivity.

Modular and Columnar Organization

The proposal that COLUMNS AND MODULES form the basis for functional organization in the sensory neocortex is a natural extension of the nineteenth-century concept of localization of function. The 1970s and 1980s saw a dramatic rise in the use of electrophysiological and anatomical tools to subdivide sensory cortices, particularly visual cortex, into distinct functional modules. At the present time, evidence indicates that the visual cortex of monkeys is composed of over thirty such regions, including the well-known and heavily studied areas V1, V2, V3, V4, MT, and IT, as well as some rather more obscure and equivocal designations (Felleman and Van Essen 1991). These efforts to reveal order in heterogeneity have been reinforced by the appealing computational view (e.g., Marr 1982) that larger operations (such as seeing) can be subdivided and assigned to dedicated task-specific modules (such as ones devoted to visual motion or color processing, for example).
The latter argument also dovetails nicely with the nineteenth-century concept of elementism, the coincidence of which inspired a fevered effort to identify visual areas that process specific sensory “elements.” Although this view appears to be supported by physiological evidence for specialized response properties in some visual areas, such as a preponderance of motion-sensitive neurons in area MT (Albright 1993) and color-sensitive neurons in area V4 (Schein and Desimone 1990), the truth is that very little is yet known of the unique contributions of most other cortical visual areas.

Modular organization of sensory cortex also occurs at a finer spatial scale, in the form of regional variations in neuronal response properties and anatomical connections, which are commonly referred to as columns, patches, blobs, and stripes. The existence of a column-like anatomical substructure in the cerebral cortex has been known since the early twentieth century, following the work of Ramón y Cajal, Constantin von Economo (1876–1931), and Rafael Lorente de Nó. It was the latter who first suggested that this characteristic structure may have some functional significance (Lorente de Nó 1938). The concept of modular functional organization was later expanded upon by the physiologist Vernon B. Mountcastle (1957), who obtained the first evidence for columnar function through his investigations of the primate somatosensory system, and offered this as a general principle of cortical organization. The best-known examples of modular organization of the sort predicted by Mountcastle are the columnar systems for contour orientation and ocular dominance discovered in primary visual cortex in the 1960s by David Hubel and Torsten Wiesel (1968).
Additional evidence for functional columns and for the veracity of Mountcastle’s dictum has come from studies of higher visual areas, such as area MT (Albright, Desimone, and Gross 1984) and the inferior temporal cortex (Tanaka 1997). Other investigations have demonstrated that modular representations are not limited to strict columnar forms (Born and Tootell 1993; Livingstone and Hubel 1984) and can exist as relatively large cortical zones in which there is a common feature to the neuronal representation of sensory information (such as clusters of cells that exhibit a greater degree of selectivity for color, for example).

The high incidence of columnar structures leads one to wonder why they exist. One line of argument, implicit in Mountcastle’s original hypothesis, is based on the need for adequate “coverage,” that is, nesting the representation of one variable (such as preferred orientation of a visual contour) across changes in another (such as the topographic representation of the visual field), which makes good computational sense and has received considerable empirical support (Hubel and Wiesel 1977). Other arguments include those based on developmental constraints (Swindale 1980; Miller 1994; Goodhill 1997) and computational advantages afforded by representation of sensory features in a regular periodic structure (see COMPUTATIONAL NEUROANATOMY; Schwartz 1980).

Hierarchical Processing

A consistent organizational feature of sensory systems is the presence of multiple hierarchically organized processing stages, through which incoming sensory information is represented in increasingly complex or abstract forms. The existence of multiple stages has been demonstrated by anatomical studies, and the nature of the representation at each stage has commonly been revealed through electrophysiological analysis of sensory response properties.
As we have seen for the visual system, the first stage of processing beyond transduction of the physical stimulus is one in which a simple abstraction of light intensity is rendered, namely a representation of luminance contrast. Likewise, the outcome of processing in primary visual cortex is, in part, a representation of image contours, formed, it is believed, by a convergence of inputs from contrast-detecting neurons at earlier stages (Hubel and Wiesel 1962). At successively higher stages of processing, information is combined to form representations of even greater complexity, such that, for example, at the pinnacle of the pathway for visual pattern processing, a visual area known as inferior temporal (IT) cortex, individual neurons encode complex, behaviorally significant objects, such as faces (see FACE RECOGNITION).

Parallel Processing

In addition to multiple serial processing stages, the visual system is known to be organized in parallel streams. Incoming information of different types is channeled through a variety of VISUAL PROCESSING STREAMS, such that the output of each serves a unique function. This type of channeling occurs on several scales, the grossest of which is manifested as multiple retinal projections (typically six) to different brain regions. As we have noted, it is the geniculostriate projection that serves pattern vision in mammals. The similarly massive retinal projection to the midbrain superior colliculus (the “tectofugal” pathway) is known to play a role in orienting responses, OCULOMOTOR CONTROL, and MULTISENSORY INTEGRATION. Other pathways include a retinal projection to the hypothalamus, which contributes to the entrainment of circadian rhythms by natural light cycles.

Finer scale channeling of visual information is also known to exist, particularly in the case of the geniculostriate pathway (Shapley 1990).
Both anatomical and physiological evidence (Perry, Oehler, and Cowey 1984; Kaplan and Shapley 1986) from early stages of visual processing support the existence of at least three subdivisions of this pathway, known as parvocellular, magnocellular, and the more recently identified koniocellular (Hendry and Yoshioka 1994). Each of these subdivisions is known to convey a unique spectrum of retinal image information and to maintain that information in a largely segregated form at least as far into the system as primary visual cortex (Livingstone and Hubel 1988).

Beyond V1, the ascending anatomical projections fall into two distinct streams, one of which descends ventrally into the temporal lobe, while the other courses dorsally to the parietal lobe. Analyses of the behavioral effects of lesions, as well as electrophysiological studies of neuronal response properties, have led to the hypothesis (Ungerleider and Mishkin 1982) that the ventral stream represents information about form and the properties of visual surfaces (such as their color or TEXTURE), and is thus termed the “what” pathway, while the dorsal stream represents information regarding motion, distance, and the spatial relations between environmental surfaces (the so-called “where” pathway). The precise relationship, if any, between the early-stage channels (magno, parvo, and konio) and these higher cortical streams has been a rich source of debate and controversy over the past decade, and the answers remain far from clear (Livingstone and Hubel 1988; Merigan and Maunsell 1993).
See also COLOR, NEUROPHYSIOLOGY OF; COLOR VISION; COLUMNS AND MODULES; COMPUTATIONAL NEUROANATOMY; COMPUTATIONAL NEUROSCIENCE; FACE RECOGNITION; MULTISENSORY INTEGRATION; OCULOMOTOR CONTROL; RETINA; TEXTURE; THALAMUS; VISUAL ANATOMY AND PHYSIOLOGY; VISUAL PROCESSING STREAMS

Perception

Perception reflects the ability to derive meaning from sensory experience, in the form of information about structure and causality in the perceiver's environment, and of the sort necessary to guide behavior. Operationally, we can distinguish sensation from perception by the nature of the internal representations: the former encode the physical properties of the proximal sensory stimulus (the retinal image, in the case of vision), and the latter reflect the world that likely gave rise to the sensory stimulus (the visual scene). Because the mapping between sensory and perceptual events is never unique (multiple scenes can cause the same retinal image), perception is necessarily an inference about the probable causes of sensation.

As we have seen, the standard approach to understanding the information represented by sensory neurons, which has evolved over the past fifty years, is to measure the correlation between a feature of the neuronal response (typically magnitude) and some physical parameter of a sensory stimulus (such as the wavelength of light or the orientation of a contour). Because the perceptual interpretation of a sensory event is necessarily context-dependent, this approach alone is capable of revealing little, if anything, about the relationship between neuronal events and perceptual state. There are, however, some basic variations on this approach that have led to increased understanding of the neuronal bases of perception.
Experimental Approaches to the Neuronal Bases of Perception

Origins of a Neuron Doctrine for Perceptual Psychology

The first strategy involves evaluation of neuronal responses to visual stimuli that consist of complex objects of behavioral significance. The logic behind this approach is that if neurons are found to be selective for such stimuli, they may be best viewed as representing something of perceptual meaning rather than merely coincidentally selective for the collection of sensory features. The early studies of “bug detectors” in the frog visual system by Lettvin and colleagues (Lettvin, Maturana, MCCULLOCH, and PITTS 1959) exemplify this approach and have led to fully articulated views on the subject, including the concept of the “gnostic unit” advanced by Jerzy Konorski (1967) and the “cardinal cell” hypothesis from Barlow's (1972) classic “Neuron Doctrine for Perceptual Psychology.” Additional evidence in support of this concept came from the work of Charles Gross in the 1960s and 1970s, in the extraordinary form of cortical cells selective for faces and hands (Gross, Bender, and Rocha-Miranda 1969; Desimone et al. 1984). Although the suggestion that perceptual experience may be rooted in the activity of single neurons or small neuronal ensembles has been decried, in part, on the grounds that the number of possible percepts greatly exceeds the number of available neurons, and is often ridiculed as the “grandmother-cell” hypothesis, the evidence supporting neuronal representations for visual patterns of paramount behavioral significance, such as faces, is now considerable (Desimone 1991; Rolls 1992).

Although a step in the right direction, the problem with this general approach is that it relies heavily upon assumptions about how the represented information is used.
If a cell is activated by a face, and only a face, then it seems likely that the cell contributes directly to the perceptually meaningful experience of face recognition rather than simply representing a collection of sensory features (Desimone et al. 1984). To some, that distinction is unsatisfactorily vague, and it is, in any case, impossible to prove that a cell only responds to a face. An alternative approach that has proved quite successful in recent years is one in which an effort is made to directly relate neuronal and perceptual events.

Neuronal Discriminability Predicts Perceptual Discriminability

In the last quarter of the twentieth century, the marriage of single-neuron recording with visual psychophysics has yielded one of the dominant experimental paradigms of cognitive neuroscience, through which it has become possible to explain behavioral performance on a perceptual task in terms of the discriminative capacity of sensory neurons. The earliest effort of this type was a study of tactile discrimination conducted by Vernon Mountcastle in the 1960s (Mountcastle et al. 1967). In this study, thresholds for behavioral discrimination performance were directly compared to neuronal thresholds for the same stimulus set. A later study by Tolhurst, Movshon, and Dean (1983) introduced techniques from SIGNAL DETECTION THEORY that allowed more rigorous quantification of the discriminative capacity of neurons and thus facilitated neuronal-perceptual comparisons. Several other studies over the past ten years have significantly advanced this cause (e.g., Dobkins and Albright 1995), but the most direct approach has been that adopted by William Newsome and colleagues (e.g., Newsome, Britten, and Movshon 1989). In this paradigm, behavioral and neuronal events are measured simultaneously in response to a sensory stimulus, yielding by brute force some of the strongest evidence to date for neural substrates of perceptual discriminability.
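One way to picture how signal detection theory quantifies the discriminative capacity of a neuron is the area under the receiver operating characteristic (ROC) curve computed from spike counts. The following is a minimal sketch only, not the analysis of any study cited above: the function name roc_area, the Poisson firing model, and the mean counts of 20 and 12 spikes are all illustrative assumptions.

```python
import numpy as np

def roc_area(counts_a, counts_b):
    """Area under the ROC curve for two sets of spike counts.

    Equals the probability that a randomly chosen response to stimulus A
    exceeds a randomly chosen response to stimulus B (ties count as half).
    0.5 means the neuron cannot tell the stimuli apart; 1.0 means perfect
    discrimination by an ideal observer of this one neuron.
    """
    a = np.asarray(counts_a, dtype=float)[:, None]  # column vector
    b = np.asarray(counts_b, dtype=float)[None, :]  # row vector
    # Compare every A trial with every B trial via broadcasting.
    return float(np.mean((a > b) + 0.5 * (a == b)))

# Hypothetical spike counts on 200 trials of each stimulus (assumed Poisson).
rng = np.random.default_rng(0)
preferred = rng.poisson(20, size=200)
nonpreferred = rng.poisson(12, size=200)
area = roc_area(preferred, nonpreferred)
print(f"neuronal discriminability (ROC area): {area:.2f}")
```

Repeating this calculation across stimulus strengths would give a neuronal performance curve that can be set beside the observer's behavioral performance on the same stimuli, which is the kind of comparison described above.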
Decoupling Sensation and Perception

A somewhat subtler approach has been forged by exploiting the natural ambiguity between sensory events and perceptual experience (see ILLUSIONS). This ambiguity is manifested in two general forms: (1) single sensory events that elicit multiple distinct percepts, a phenomenon commonly known as “perceptual metastability,” and (2) multiple sensory events (“sensory synonyms”) that elicit the same perceptual state. Both of these situations, which are ubiquitous in normal experience, afford opportunities to experimentally decouple sensation and perception.

The first form of sensory-perceptual ambiguity (perceptual metastability) is a natural consequence of the indeterminate mapping between a sensory signal and the physical events that gave rise to it. A classic and familiar example is the Necker Cube, in which the three-dimensional interpretation (the observer's inference about visual scene structure) periodically reverses despite the fact that the retinal image remains unchanged. Logothetis and colleagues (Logothetis and Schall 1989) have used a form of perceptual metastability known as binocular rivalry to demonstrate the existence of classes of cortical neurons that parallel changes in perceptual state in the face of constant retinal inputs.

The second type of sensory-perceptual ambiguity, in which multiple sensory images give rise to the same percept, is perhaps the more common. Such effects are termed perceptual constancies, and they reflect efforts by sensory systems to reconstruct behaviorally significant attributes of the world in the face of variation along irrelevant sensory dimensions. Size constancy (the invariance of perceived size of an object across different retinal sizes) and brightness or color constancy (the invariance of perceived reflectance or color of a surface in the presence of illumination changes) are classic examples.
These perceptual constancies suggest an underlying neuronal invariance across specific image changes. Several examples of neuronal constancies have been reported, including invariant representations of direction of motion and shape across different cues for form (Albright 1992; Sary et al. 1995).

Contextual Influences on Perception and its Neuronal Bases

One of the most promising new approaches to the neuronal bases of perception is founded on the use of contextual manipulations to influence the perceptual interpretation of an image feature. As we have seen, the contextual dependence of perception is scarcely a new finding, but contextual manipulations have been explicitly avoided in traditional physiological approaches to sensory coding. As a consequence, most existing data do not reveal whether and to what extent the neuronal representation of an image feature is context dependent. Gene Stoner, Thomas Albright, and colleagues have pioneered the use of contextual manipulations in studies of the neuronal basis of the PERCEPTION OF MOTION (e.g., Stoner and Albright 1992, 1993). The results of these studies demonstrate that context can alter neuronal filter properties in a manner that predictably parallels its influence on perception.

Stages of Perceptual Representation

Several lines of evidence suggest that there may be multiple steps along the path to extracting meaning from sensory signals. These steps are best illustrated by examples drawn from studies of visual processing. Sensation itself is commonly identified with “early” or “low-level vision.” Additional steps are as follows.

Mid-Level Vision

This step involves a reconstruction of the spatial relationships between environmental surfaces. It is implicit in the accounts of the perceptual psychologist JAMES JEROME GIBSON (1904–1979), present in the computational approach of DAVID MARR (1945–1980), and encompassed by what has recently come to be known as MID-LEVEL VISION.
Essential features of this processing stage include a dependence upon proximal sensory context to establish surface relationships (see SURFACE PERCEPTION) and a relative lack of dependence upon prior experience. By establishing environmental STRUCTURE FROM VISUAL INFORMATION SOURCES, mid-level vision thus invests sensory events with some measure of meaning. A clear example of this type of visual processing is found in the phenomenon of perceptual TRANSPARENCY (Metelli 1974) and the related topic of LIGHTNESS PERCEPTION. Physiological studies of the response properties of neurons at mid-levels of the cortical hierarchy have yielded results consistent with a mid-level representation (e.g., Stoner and Albright 1992).

High-Level Vision

HIGH-LEVEL VISION is a loosely defined processing stage, but one that includes a broad leap in the assignment of meaning to sensory events, namely identification and classification on the basis of previous experience with the world. It is through this process that recognition of objects occurs (see OBJECT RECOGNITION, HUMAN NEUROPSYCHOLOGY; OBJECT RECOGNITION, ANIMAL STUDIES; and VISUAL OBJECT RECOGNITION, AI), as well as assignment of affect and semantic categorization. This stage thus constitutes a bridge between sensory processing and MEMORY. Physiological and neuropsychological studies of the primate temporal lobe have demonstrated an essential contribution of this region to object recognition (Gross 1973; Gross et al. 1985).
See also GIBSON, JAMES JEROME; HIGH-LEVEL VISION; ILLUSIONS; LIGHTNESS PERCEPTION; MARR, DAVID; MCCULLOCH, WARREN S.; MEMORY; MID-LEVEL VISION; MOTION, PERCEPTION OF; OBJECT RECOGNITION, ANIMAL STUDIES; OBJECT RECOGNITION, HUMAN NEUROPSYCHOLOGY; PITTS, WALTER; SIGNAL DETECTION THEORY; STRUCTURE FROM VISUAL INFORMATION SOURCES; SURFACE PERCEPTION; TRANSPARENCY; VISUAL OBJECT RECOGNITION, AI

Sensory-Perceptual Plasticity

The processes by which information is acquired and interpreted by the brain are modifiable throughout life and on many time scales. Although plasticity of the sort that occurs during brain development and that which underlies changes in the sensitivity of mature sensory systems may arise from similar mechanisms, it is convenient to consider them separately.

Developmental Changes

The development of the mammalian nervous system is a complex, multistaged process that extends from embryogenesis through early postnatal life. This process begins with determination of the fate of precursor cells such that a subset becomes neurons. This is followed by cell division and proliferation, and by differentiation of cells into different types of neurons. The patterned brain then begins to take shape as cells migrate to destinations appropriate for their assigned functions. Finally, neurons begin to extend processes and to make synaptic connections with one another. These connections are sculpted and pruned over a lengthy postnatal period. A central tenet of modern neuroscience is that these final stages of NEURAL DEVELOPMENT correspond to specific stages of COGNITIVE DEVELOPMENT. These stages are known as “critical periods,” and they are characterized by an extraordinary degree of plasticity in the formation of connections and cognitive functions.
Although critical periods for development are known to exist for a wide range of cognitive functions such as sensory processing, motor control, and language, they have been studied most intensively in the context of the mammalian visual system. These studies have included investigations of the timing, necessary conditions for, and mechanisms of (1) PERCEPTUAL DEVELOPMENT (e.g., Teller 1997), (2) formation of appropriate anatomical connections (e.g., Katz and Shatz 1996), and (3) neuronal representations of sensory stimuli (e.g., Hubel, Wiesel, and LeVay 1977). The general view that has emerged is that the newborn brain possesses a considerable degree of order, but that sensory experience is essential during critical periods to maintain that order and to fine-tune it to achieve optimal performance in adulthood. These principles obviously have profound implications for clinical practice and social policy. Efforts to further understand the cellular mechanisms of developmental plasticity, their relevance to other facets of cognitive function, the relative contributions of genes and experience, and routes of clinical intervention, are all among the most important topics for the future of cognitive neuroscience.

Dynamic Control of Sensitivity in the Mature Brain

Mature sensory systems have limited information processing capacities. An exciting area of research in recent years has been that addressing the conditions under which processing capacity is dynamically reallocated, resulting in fluctuations in sensitivity to sensory stimuli. The characteristics of sensitivity changes are many and varied, but all serve to optimize acquisition of information in a world in which environmental features and behavioral goals are constantly in flux. The form of these changes may be broad in scope or highly stimulus-specific and task-dependent.
Changes may be nearly instantaneous, or they may come about gradually through exposure to specific environmental features. Finally, sensitivity changes differ greatly in the degree to which they are influenced by stored information about the environment and the degree to which they are under voluntary control.

Studies of the visual system reveal at least three types of sensitivity changes represented by the phenomena of (1) contrast gain control, (2) attention, and (3) perceptual learning. All can be viewed as recalibration of incoming signals to compensate for changes in the environment, the fidelity of signal detection (such as that associated with normal aging or trauma to the sensory periphery), and behavioral goals. Generally speaking, neuronal gain control is the process by which the sensitivity of a neuron (or neural system) to its inputs is dynamically controlled. In that sense, all of the forms of adult plasticity discussed below are examples of gain control, although they have different dynamics and serve different functions.

Contrast Gain Control

A well-studied example of gain control is the invariance of perceptual sensitivity to the features of the visual world over an enormous range of lighting conditions. Evidence indicates that the limited dynamic range of responsivity of individual neurons in visual cortex is adjusted in an illumination-dependent manner (Shapley and Victor 1979), the consequence of which is a neuronal invariance that can account for the sensory invariance. It has been suggested that this scaling of neuronal sensitivity as a function of lighting conditions may be achieved by response “normalization,” in which the output of a cortical neuron is effectively divided by the pooled activity of a large number of other cells of the same type (Carandini, Heeger, and Movshon 1997).
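The idea of dividing each neuron's output by pooled population activity can be illustrated with a toy calculation. This is a sketch under stated assumptions and is not the model of Carandini, Heeger, and Movshon (1997): the function name normalize_responses, the semi-saturation constant sigma, the squaring exponent n, and the stimulus numbers are all hypothetical choices made for illustration.

```python
import numpy as np

def normalize_responses(drive, sigma=1.0, n=2.0):
    """Toy divisive normalization across a population of model neurons.

    drive : non-negative input drive to each neuron (arbitrary units)
    sigma : assumed semi-saturation constant; keeps the divisor above zero
    n     : assumed exponent applied to the drive before pooling
    """
    drive = np.asarray(drive, dtype=float)
    powered = drive ** n
    pooled = powered.sum()  # pooled activity of the whole population
    return powered / (sigma ** n + pooled)

# The same spatial pattern presented at two overall intensities
# that differ by a factor of 100 (hypothetical numbers).
pattern = np.array([1.0, 2.0, 4.0, 2.0, 1.0])
low = normalize_responses(10 * pattern)
high = normalize_responses(1000 * pattern)

print(np.round(low, 3))
print(np.round(high, 3))
```

In this sketch the raw drive differs a hundredfold between the two conditions, yet the normalized outputs stay within a bounded range and keep the same relative pattern, which is the sense in which a division by pooled activity could keep a neuron within its limited dynamic range.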
Attention

Visual ATTENTION is, by definition, a rapidly occurring change in visual sensitivity that is selective for a specific location in space or specific stimulus features. The stimulus and mnemonic factors that influence attentional allocation have been studied for over a century (James 1890), and the underlying brain structures and events are beginning to be understood (Desimone and Duncan 1995). Much of our understanding comes from analysis of ATTENTION IN THE HUMAN BRAIN, particularly the effects of cortical lesions, which can selectively interfere with attentional allocation (VISUAL NEGLECT), and through electrical and magnetic recording (ERP, MEG) and imaging studies: POSITRON EMISSION TOMOGRAPHY (PET) and functional MAGNETIC RESONANCE IMAGING (fMRI). In addition, studies of ATTENTION IN THE ANIMAL BRAIN have revealed that attentional shifts are correlated with changes in the sensitivity of single neurons to sensory stimuli (Moran and Desimone 1985; Bushnell, Goldberg, and Robinson 1981; see also AUDITORY ATTENTION). Although attentional phenomena differ from contrast gain control in that they can be influenced by feedback (WORKING MEMORY) as well as feedforward (sensory) signals, attentional effects can also be characterized as an expansion of the dynamic range of sensitivity, but in a manner that is selective for the attended stimuli.

Perceptual Learning

Both contrast gain control and visual attention are rapidly occurring and short-lived sensitivity changes. Other experiments have targeted neuronal events that parallel visual sensitivity changes occurring over a longer time scale, such as those associated with the phenomenon of perceptual learning. Perceptual learning refers to improvements in discriminability along any of a variety of sensory dimensions that come with practice.
Although it has long been known that the sensitivity of the visual system is refined in this manner during critical periods of neuronal development, recent experiments have provided tantalizing evidence of improvements in the sensitivity of neurons at early stages of processing, which parallel perceptual learning in adults (Recanzone, Schreiner, and Merzenich 1993; Gilbert 1996).

See also ATTENTION; ATTENTION IN THE ANIMAL BRAIN; ATTENTION IN THE HUMAN BRAIN; AUDITORY ATTENTION; COGNITIVE DEVELOPMENT; MAGNETIC RESONANCE IMAGING; NEURAL DEVELOPMENT; PERCEPTUAL DEVELOPMENT; POSITRON EMISSION TOMOGRAPHY; VISUAL NEGLECT; WORKING MEMORY; WORKING MEMORY, NEURAL BASIS OF

Forming a Decision to Act

The meaning of many sensations can be found solely in their symbolic and experience-dependent mapping onto actions (e.g., green = go, red = stop). These mappings are commonly many-to-one or one-to-many (a whistle and a green light can both be signals to “go”; conversely, a whistle may be either a signal to “go” or a call to attention, depending upon the context). The selection of a particular action from those possible at any point in time is thus a context-dependent transition between sensory processing and motor control. This transition is commonly termed the decision stage, and it has become a focus of recent electrophysiological studies of the cerebral cortex (e.g., Shadlen and Newsome 1996). Because of the nonunique mappings, neurons involved in making such decisions should be distinguishable from those representing sensory events by a tendency to generalize across specific features of the sensory signal. Similarly, the representation of the neuronal decision should be distinguishable from a motor control signal by generalization across specific motor actions.
In addition, the strength of the neuronal decision signal should increase with duration of exposure to the sensory stimulus (integration time), in parallel with increasing decision confidence on the part of the observer. New data in support of some of these predictions suggest that this may be a valuable new paradigm for accessing the neuronal substrates of internal cognitive states, and for bridging studies of sensory or perceptual processing, memory, and motor control.

Motor Control

Incoming sensory information ultimately leads to action, and actions, in turn, are often initiated in order to acquire additional sensory information. Although MOTOR CONTROL systems have often been studied in relative isolation from sensory processes, this sensory-motor loop suggests that they are best viewed as different phases of a processing continuum. This integrated view, which seeks to understand how the nature of sensory representations influences movements, and vice versa, is rapidly gaining acceptance. The oculomotor control system has become the model for the study of motor processes at behavioral and neuronal levels.

Important research topics that have emerged from consideration of the transition from sensory processing to motor control include (1) the process by which representations of space (see SPATIAL PERCEPTION) are transformed from the coordinate system of the sensory field (e.g., retinal space) to a coordinate system for action (e.g., Graziano and Gross 1998) and (2) the processes by which the neuronal links between sensation and action are modifiable (Raymond, Lisberger, and Mauk 1996), as needed to permit MOTOR LEARNING and to compensate for degenerative sensory changes or structural changes in the motor apparatus.
The brain structures involved in motor control include portions of the cerebral cortex, which are thought to contribute to fine voluntary motor control, as well as the BASAL GANGLIA and CEREBELLUM, which play important roles in motor learning; the superior colliculus, which is involved in sensorimotor integration, orienting responses, and oculomotor control; and a variety of brainstem motor nuclei, which convey motor signals to the appropriate effectors.

See also BASAL GANGLIA; CEREBELLUM; MOTOR CONTROL; MOTOR LEARNING; SPATIAL PERCEPTION

Learning and Memory

Studies of the neuronal mechanisms that enable information about the world to be stored and retrieved for later use have a long and rich history (being, as they were, a central part of the agenda of the early functional localizationists) and now lie at the core of our modern cognitive neuroscience. Indeed, memory serves as the linchpin that binds and shapes nearly every aspect of information processing by brains, including perception, decision making, motor control, emotion, and consciousness. Memory also exists in various forms, which have been classified on the basis of their relation to other cognitive functions, the degree to which they are explicitly encoded and available for use in a broad range of contexts, and their longevity. (We have already considered some forms of nonexplicit memory, such as those associated with perceptual and motor learning.) Taxonomies based upon these criteria have been reviewed in detail elsewhere (e.g., Squire, Knowlton, and Musen 1993). The phenomenological and functional differences among different forms of memory suggest the existence of a variety of different brain substrates. Localization of these substrates is a major goal of modern cognitive neuroscience.
Research is also clarifying the mechanisms underlying the oft-noted role of affective or emotional responses in memory consolidation (see MEMORY STORAGE, MODULATION OF; AMYGDALA, PRIMATE), and the loss of memory that occurs with aging (see AGING, MEMORY, AND THE BRAIN).

Three current approaches (broadly defined and overlapping) to memory are among the most promising for the future of cognitive neuroscience: (1) neuropsychological and neurophysiological studies of the neuronal substrates of explicit memory in primates, (2) studies of the relationship between phenomena of synaptic facilitation or depression and behavioral manifestations of learning and memory, and (3) molecular genetic studies that enable highly selective disruption of cellular structures and events thought to be involved in learning and memory.

Brain Substrates of Explicit Memory in Primates

The current approach to this topic has its origins in the early studies of Karl Lashley and colleagues, in which the lesion method was used to infer the contributions of specific brain regions to a variety of cognitive functions, including memory. The field took a giant step forward in the 1950s with the discovery by Brenda Milner and colleagues of the devastating effects of damage to the human temporal lobe, particularly the HIPPOCAMPUS, on human memory formation (see MEMORY, HUMAN NEUROPSYCHOLOGY). Following that discovery, Mortimer Mishkin and colleagues began to use the lesion technique to develop an animal model of amnesia. More recently, using a similar approach, Stuart Zola, Larry Squire, and colleagues have further localized the neuronal substrates of memory consolidation in the primate temporal lobe (see MEMORY, ANIMAL STUDIES).

Electrophysiological studies of the contributions of individual cortical neurons to memory began in the 1970s with the work of Charles Gross and Joaquin Fuster.
The logic behind this approach is that by examining neuronal responses of an animal engaged in a standard memory task (e.g., match-to-sample: determine whether a sample stimulus corresponds to a previously viewed cue stimulus), one can distinguish the components of the response that reflect memory from those that are sensory in nature. Subsequent electrophysiological studies by Robert Desimone and Patricia Goldman-Rakic, among others, have provided some of the strongest evidence for single-cell substrates of working memory in the primate temporal and frontal lobes. These traditional approaches to explicit memory formation in primates are now being complemented by brain imaging studies in humans.

Do Synaptic Changes Mediate Memory Formation?

The phenomenon of LONG-TERM POTENTIATION (LTP), originally discovered in the 1970s (and the related phenomenon of long-term depression), consists of physiologically measurable changes in the strength of synaptic connections between neurons. LTP is commonly produced in the laboratory by coincident activation of pre- and postsynaptic neurons, in a manner consistent with the predictions of DONALD O. HEBB (1904–1985), and it is often dependent upon activation of the postsynaptic NMDA glutamate receptor. Because a change in synaptic efficacy could, in principle, underlie behavioral manifestations of learning and memory, and because LTP is commonly seen in brain structures that have been implicated in memory formation (such as the hippocampus, cerebellum, and cerebral cortex) by other evidence, it is considered a likely mechanism for memory formation. Attempts to test that hypothesis have led to one of the most exciting new approaches to memory.
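The dependence of synaptic strengthening on coincident pre- and postsynaptic activity can be illustrated with the simplest textbook form of a Hebbian rule. The sketch below is an abstraction under stated assumptions, not a biophysical model of NMDA-dependent LTP: the function name hebbian_update, the learning rate, and the binary activity values are hypothetical choices made for illustration.

```python
def hebbian_update(weight, pre, post, rate=0.1):
    """Return the synaptic weight after one pairing of pre- and postsynaptic activity.

    The change is proportional to the product pre * post, so the weight grows
    only when both neurons are active together (an assumed, simplified rule).
    """
    return weight + rate * pre * post

# Hypothetical sequence of (presynaptic, postsynaptic) activity, 1 = active, 0 = silent.
pairings = [(1, 1), (1, 0), (0, 1), (1, 1)]

w = 0.5  # assumed starting synaptic strength
for pre, post in pairings:
    w = hebbian_update(w, pre, post)
    print(pre, post, round(w, 2))
```

Only the two coincident pairings change the weight in this sketch; activity in one neuron alone leaves the connection unchanged. A rule of this form only ever increases weights, so it says nothing about long-term depression.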
From Genes to Behavior: A Molecular Genetic Approach to Memory

The knowledge that the NMDA receptor is responsible for many forms of LTP, in conjunction with the hypothesis that LTP underlies memory formation, led to the prediction that memory formation should be disrupted by elimination of NMDA receptors. The latter can be accomplished in mice by engineering genetic mutations that selectively knock out the NMDA receptor, although this technique has been problematic because it has been difficult to constrain the effects to specific brain regions and over specific periods of time. Matthew Wilson and Susumu Tonegawa have recently overcome these obstacles by production of a knockout in which NMDA receptors are disrupted only in a subregion of the hippocampus (the CA1 layer), and only after the brain has matured. In accordance with the NMDA-mediated synaptic plasticity hypothesis, these animals were deficient on both behavioral and physiological assays of memory formation (Tonegawa et al. 1996). Further developments along these lines will surely involve the ability to selectively disrupt action potential generation in specific cell populations, as well as genetic manipulations in other animals (such as monkeys).

See also AGING, MEMORY, AND THE BRAIN; AMYGDALA, PRIMATE; HEBB, DONALD O.; HIPPOCAMPUS; LONG-TERM POTENTIATION; MEMORY, ANIMAL STUDIES; MEMORY, HUMAN NEUROPSYCHOLOGY; MEMORY STORAGE, MODULATION OF

Language

One of the first cognitive functions to be characterized from a biological perspective was language. Nineteenth-century physicians, including Broca, observed the effects of damage to different brain regions and described the asymmetrical roles of the left and right hemispheres in language production and comprehension (see HEMISPHERIC SPECIALIZATION; APHASIA; LANGUAGE, NEURAL BASIS OF).
Investigators since then have discovered that different aspects of language, including the PHONOLOGY, SYNTAX, and LEXICON, each rely on different and specific neural structures (see PHONOLOGY, NEURAL BASIS OF; GRAMMAR, NEURAL BASIS OF; LEXICON, NEURAL BASIS OF). Modern neuroimaging techniques, including ERPs, PET, and fMRI, have confirmed the role of the classically defined language areas and point to the contribution of several other areas as well. Such studies have also identified “modality neutral” areas that are active when language is processed through any modality: auditory, written, and even sign language (see SIGN LANGUAGE AND THE BRAIN). Studies describing the effects of lesions on language can identify neural tissue that is necessary and sufficient for processing. An important additional perspective can be obtained from neuroimaging studies of healthy neural tissue, which can reveal all the activity associated with language production and comprehension. Taken together, the currently available evidence reveals a strong bias for areas within the left hemisphere to mediate language if learned early in childhood, independently of its form or modality. However, the nature of the language learned and the age of acquisition have effects on the configuration of the language systems of the brain (see BILINGUALISM AND THE BRAIN).

Developmental disorders of language (see LANGUAGE IMPAIRMENT, DEVELOPMENTAL; DYSLEXIA) can occur in isolation or in association with other disorders and can result from deficits within any of the several different skills that are central to the perception and modulation of language.
See also APHASIA; BILINGUALISM AND THE BRAIN; DYSLEXIA; GRAMMAR, NEURAL BASIS OF; HEMISPHERIC SPECIALIZATION; LANGUAGE, NEURAL BASIS OF; LANGUAGE IMPAIRMENT, DEVELOPMENTAL; LEXICON; LEXICON, NEURAL BASIS OF; PHONOLOGY; PHONOLOGY, NEURAL BASIS OF; SIGN LANGUAGE AND THE BRAIN; SYNTAX

Consciousness

Rediscovery of the phenomena of perception and memory without awareness has renewed research and debate on issues concerning the neural basis of CONSCIOUSNESS (see CONSCIOUSNESS, NEUROBIOLOGY OF). Some patients with cortical lesions that have rendered them blind can nonetheless indicate (by nonverbal methods) accurate perception of stimuli presented to the blind portion of the visual field (see BLINDSIGHT). Similarly, some patients who report no memory for specific training events nonetheless demonstrate normal learning of those skills.

Systematic study of visual consciousness employing several neuroimaging tools within human and nonhuman primates is being conducted to determine whether consciousness emerges as a property of a large collection of interacting neurons or whether it arises as a function of unique neuronal characteristics possessed by some neurons or by an activity pattern temporarily occurring within a subset of neurons (see BINDING BY NEURAL SYNCHRONY).

Powerful insights into systems and cellular and molecular events critical in cognition and awareness, judgment and action have come from human and animal studies of SLEEP and DREAMING. Distinct neuromodulatory effects of cholinergic and aminergic systems permit the panoply of conscious cognitive processing, evaluation, and planning during waking states and decouple cognitive, emotional, and mnemonic functions during sleep. Detailed knowledge of the neurobiology of sleep and dreaming presents an important opportunity for future studies of cognition and consciousness.
See also BINDING BY NEURAL SYNCHRONY; BLINDSIGHT; CONSCIOUSNESS; CONSCIOUSNESS, NEUROBIOLOGY OF; DREAMING; SLEEP

Emotions

Closely related to questions about consciousness are issues of EMOTIONS and feelings that have, until very recently, been ignored in cognitive science. Emotions sit at the interface between incoming events and preparation to respond, however, and recent studies have placed the study of emotion more centrally in the field. Animal models have provided detailed anatomical and physiological descriptions of fear responses (Armony and LeDoux 1997) and highlight the role of the amygdala and LIMBIC SYSTEM as well as different inputs to this system (see EMOTION AND THE ANIMAL BRAIN). Studies of human patients suggest specific roles for different neural systems in the perception of potentially emotional stimuli (Adolphs et al. 1994; Hamann et al. 1996), in their appraisal, and in organizing appropriate responses to them (see EMOTION AND THE HUMAN BRAIN; PAIN). An important area for future research is to characterize the neurochemistry of emotions.

The multiple physiological responses to real or imagined threats (i.e., STRESS) have been elucidated in both animal and human studies. Several of the systems most affected by stress play central roles in emotional and cognitive functions (see NEUROENDOCRINOLOGY). Early pre- and postnatal experiences play a significant role in shaping the activity of these systems and in their rate of aging. The profound role of the stress-related hormones on memory-related brain structures, including the hippocampus, and their role in regulating neural damage following strokes and seizures and in aging, make them a central object for future research in cognitive neuroscience (see AGING AND COGNITION).
See also AGING AND COGNITION; EMOTION AND THE ANIMAL BRAIN; EMOTION AND THE HUMAN BRAIN; EMOTIONS; LIMBIC SYSTEM; NEUROENDOCRINOLOGY; PAIN; STRESS

4 Cognitive Neuroscience: A Promise for the Future

A glance at the neuroscience entries for this volume reveals that we are amassing detailed knowledge of the highly specialized neural systems that mediate different and specific cognitive functions. Many questions remain unanswered, however, and the applications of new experimental techniques have often raised more questions than they have answered. But such are the expansion pains of a thriving science.

Among the major research goals of the next century will be to elucidate how these highly differentiated cognitive systems arise in ontogeny, the degree to which they are maturationally constrained, and the nature and the timing of the role of input from the environment in NEURAL DEVELOPMENT. This is an area where research has just begun. It is evident that there exist strong genetic constraints on the overall patterning of different domains within the developing nervous system. Moreover, the same class of genes specifies the rough segmentation of the nervous systems of both vertebrates and invertebrates. However, the information required to specify the fine differentiation and connectivity within the cortex exceeds that available in the genome. Instead, a process of selective stabilization of transiently redundant connections permits individual differences in activity and experience to organize developing cortical systems. Some brain circuits display redundant connectivity and pruning under experience only during a limited time period in development (“critical period”). These time periods are different for different species and for different functional brain systems within a species.
Other brain circuits retain the ability to change under external stimulation throughout life, and this capability, which now appears more ubiquitous and long-lasting than initially imagined, is surely a substrate for adult learning, recovery of function after brain damage, and PHANTOM LIMB phenomena (see also AUDITORY PLASTICITY; NEURAL PLASTICITY). A major challenge for future generations of cognitive neuroscientists will be to characterize and account for the markedly different extents and timecourses of biological constraints and experience-dependent modifiability of the developing human brain.

Though the pursuit may be ancient, consider these the halcyon days of cognitive neuroscience. As we cross the threshold of the millennium, look closely as the last veil begins to fall. And bear in mind that if cognitive neuroscience fulfills its grand promise, later editions of this volume may contain a section on history, into which all of the nonneuro cognitive science discussion will be swept.

See also AUDITORY PLASTICITY; NEURAL DEVELOPMENT; NEURAL PLASTICITY; PHANTOM LIMB

References

Adolphs, R., D. Tranel, H. Damasio, and A. Damasio. (1994). Impaired recognition of emotion in facial expressions following bilateral damage to the human amygdala. Nature 372: 669–672.
Albright, T. D. (1992). Form-cue invariant motion processing in primate visual cortex. Science 255: 1141–1143.
Albright, T. D. (1993). Cortical processing of visual motion. In J. Wallman and F. A. Miles, Eds., Visual Motion and its Use in the Stabilization of Gaze. Amsterdam: Elsevier.
Albright, T. D., R. Desimone, and C. G. Gross. (1984). Columnar organization of directionally selective cells in visual area MT of the macaque. Journal of Neurophysiology 51: 16–31.
Armony, J. L., and J. E. LeDoux. (1997). How the brain processes emotional information. Annals of the New York Academy of Sciences 821: 259–270.
Barlow, H. B. (1972).
Single units and sensation: A neuron doctrine for perceptual psychology? Perception 1: 371–394.
Boring, E. G. (1950). A History of Experimental Psychology, 2nd ed. R. M. Elliott, Ed. New Jersey: Prentice-Hall.
Born, R. T., and R. B. Tootell. (1993). Segregation of global and local motion processing in primate middle temporal visual area. Nature 357: 497–499.
Bushnell, M. C., M. E. Goldberg, and D. L. Robinson. (1981). Behavioral enhancement of visual responses in monkey cerebral cortex. 1. Modulation in posterior parietal cortex related to selective visual attention. Journal of Neurophysiology 46(4): 755–772.
Callaway, E. M. (1998). Local circuits in primary visual cortex of the macaque monkey. Annual Review of Neuroscience 21: 47–74.
Carandini, M., D. J. Heeger, and J. A. Movshon. (1997). Linearity and normalization in simple cells of the macaque primary visual cortex. Journal of Neuroscience 17(21): 8621–8644.
Desimone, R. (1991). Face-selective cells in the temporal cortex of monkeys. Journal of Cognitive Neuroscience 3: 1–7.
Desimone, R., T. D. Albright, C. G. Gross, and C. J. Bruce. (1984). Stimulus selective properties of inferior temporal neurons in the macaque. Journal of Neuroscience 8: 2051–2062.
Desimone, R., and J. Duncan. (1995). Neural mechanisms of selective visual attention. Annual Review of Neuroscience 18: 193–222.
Dobkins, K. R., and T. D. Albright. (1995). Behavioral and neural effects of chromatic isoluminance in the primate visual motion system. Visual Neuroscience 12: 321–332.
Enroth-Cugell, C., and J. G. Robson. (1966). The contrast sensitivity of retinal ganglion cells of the cat. Journal of Physiology (London) 187: 517–552.
Felleman, D. J., and D. C. Van Essen. (1991). Distributed hierarchical processing in the primate cerebral cortex. Cerebral Cortex 1: 1–47.
Gattass, R., A. P. B. Sousa, and E. Cowey. (1985). Cortical visual areas of the macaque: Possible substrates for pattern recognition mechanisms. In C.
Chagas, R. Gattass, and C. Gross, Eds., Pattern Recognition Mechanisms. Vatican City: Pontifica Academia Scientiarum, pp. 1–17.
Gilbert, C. D. (1996). Learning and receptive field plasticity. Proceedings of the National Academy of Sciences USA 93: 10546–10547.
Goodhill, G. J. (1997). Stimulating issues in cortical map development. Trends in Neurosciences 20: 375–376.
Graziano, M. S., and C. G. Gross. (1998). Spatial maps for the control of movement. Current Opinion in Neurobiology 8: 195–201.
Gross, C. G. (1973). Visual functions of inferotemporal cortex. In H. Autrum, R. Jung, W. Lowenstein, D. McKay, and H.-L. Teuber, Eds., Handbook of Sensory Physiology, vol. 7, 3B. Berlin: Springer.
Gross, C. G. (1994a). How inferior temporal cortex became a visual area. Cerebral Cortex 5: 455–469.
Gross, C. G. (1994b). Hans-Lukas Teuber: A tribute. Cerebral Cortex 4: 451–454.
Gross, C. G. (1998). Brain, Vision, Memory: Tales in the History of Neuroscience. Cambridge, MA: MIT Press.
Gross, C. G., D. B. Bender, and C. E. Rocha-Miranda. (1969). Visual receptive fields of neurons in inferotemporal cortex of the monkey. Science 166: 1303–1306.
Gross, C. G., R. Desimone, T. D. Albright, and E. L. Schwartz. (1985). Inferior temporal cortex and pattern recognition. In C. Chagas, Ed., Study Group on Pattern Recognition Mechanisms. Vatican City: Pontifica Academia Scientiarum, pp. 179–200.
Hamann, S. B., L. Stefanacci, L. R. Squire, R. Adolphs, D. Tranel, H. Damasio, and A. Damasio. (1996). Recognizing facial emotion [letter]. Nature 379(6565): 497.
Hartline, H. K., H. G. Wagner, and E. F. MacNichol, Jr. (1952). The peripheral origin of nervous activity in the visual system. Cold Spring Harbor Symposium on Quantitative Biology 17: 125–141.
Hendry, S. H., and T. Yoshioka. (1994). A neurochemically distinct third channel in the macaque dorsal lateral geniculate nucleus. Science 264(5158): 575–577.
Hubel, D. H., and T. N. Wiesel. (1962).
Receptive fields, binocular interaction and functional architecture in the cat’s visual cortex. Journal of Physiology 160: 106–154.
Hubel, D. H., and T. N. Wiesel. (1968). Receptive fields and functional architecture of monkey striate cortex. Journal of Physiology 195: 215–243.
Hubel, D. H., and T. N. Wiesel. (1977). Ferrier lecture. Functional architecture of macaque monkey visual cortex. Proceedings of the Royal Society of London, Series B, Biological Sciences 198(1130): 1–59.
Hubel, D. H., T. N. Wiesel, and S. LeVay. (1977). Plasticity of ocular dominance columns in monkey striate cortex. Philosophical Transactions of the Royal Society of London, Series B, Biological Sciences 278: 377–409.
James, W. (1890). The Principles of Psychology, vol. 1. New York: Dover.
Kaplan, E., and R. M. Shapley. (1986). The primate retina contains two types of ganglion cells, with high and low contrast sensitivity. Proceedings of the National Academy of Sciences of the USA 83(8): 2755–2757.
Katz, L. C., and C. J. Shatz. (1996). Synaptic activity and the construction of cortical circuits. Science 274: 1133–1138.
Koffka, K. (1935). Principles of Gestalt Psychology. New York: Harcourt, Brace.
Konorski, J. (1967). Integrative Activity of the Brain. Chicago: University of Chicago Press.
Kuffler, S. W. (1953). Discharge patterns and functional organization of the mammalian retina. Journal of Neurophysiology 16: 37–68.
Lettvin, J. Y., H. R. Maturana, W. S. McCulloch, and W. H. Pitts. (1959). What the frog’s eye tells the frog’s brain. Proceedings of the Institute of Radio Engineers 47: 1940–1951.
Livingstone, M. S., and D. H. Hubel. (1984). Anatomy and physiology of a color system in the primate visual cortex. Journal of Neuroscience 4: 309–356.
Livingstone, M. S., and D. H. Hubel. (1988). Segregation of form, color, movement, and depth: Anatomy, physiology, and perception. Science 240: 740–749.
Logothetis, N. K., and J. D. Schall. (1989).
Neuronal correlates of subjective visual perception. Science 245: 761–763.
Lorente de Nó, R. (1938). Cerebral cortex: Architecture, intracortical connections, motor projections. In J. F. Fulton, Ed., Physiology of the Nervous System. New York: Oxford University Press, pp. 291–339.
Marr, D. (1982). Vision: A Computational Investigation into the Human Representation and Processing of Visual Information. San Francisco: W. H. Freeman.
Merigan, W. H., and J. H. Maunsell. (1993). How parallel are the primate visual pathways? Annual Review of Neuroscience 16: 369–402.
Metelli, F. (1974). The perception of transparency. Scientific American 230(4): 90–98.
Miller, K. D. (1994). A model for the development of simple cell receptive fields and the ordered arrangement of orientation columns through activity-dependent competition between ON- and OFF-center inputs. Journal of Neuroscience 14: 409–441.
Moran, J., and R. Desimone. (1985). Selective attention gates visual processing in the extrastriate cortex. Science 229(4715): 782–784.
Mountcastle, V. B. (1957). Modality and topographic properties of single neurons of cat's somatic sensory cortex. Journal of Neurophysiology 20: 408–434.
Mountcastle, V. B., W. H. Talbot, I. Darian-Smith, and H. H. Kornhuber. (1967). Neural basis of the sense of flutter-vibration. Science 155(762): 597–600.
Newsome, W. T., K. H. Britten, and J. A. Movshon. (1989). Neuronal correlates of a perceptual decision. Nature 341: 52–54.
Perry, V. H., R. Oehler, and A. Cowey. (1984). Retinal ganglion cells that project to the dorsal lateral geniculate nucleus in the macaque monkey. Neuroscience 12(4): 1101–1123.
Raymond, J. L., S. G. Lisberger, and M. D. Mauk. (1996). The cerebellum: A neuronal learning machine? Science 272: 1126–1131.
Recanzone, G. H., C. E. Schreiner, and M. M. Merzenich. (1993). Plasticity in the frequency representation of primary auditory cortex following discrimination training in adult owl monkeys.
Journal of Neuroscience 13: 87–103.
Rolls, E. T. (1992). Neurophysiological mechanisms underlying face processing within and beyond the temporal cortical visual areas. In V. Bruce, A. Cowey, W. W. Ellis, and D. I. Perrett, Eds., Processing the Facial Image. Oxford: Clarendon Press, pp. 11–21.
Sary, G., R. Vogels, G. Kovacs, and G. A. Orban. (1995). Responses of monkey inferior temporal neurons to luminance-, motion-, and texture-defined gratings. Journal of Neurophysiology 73: 1341–1354.
Schein, S. J., and R. Desimone. (1990). Spectral properties of V4 neurons in the macaque. Journal of Neuroscience 10: 3369–3389.
Schwartz, E. L. (1980). Computational anatomy and functional architecture of striate cortex: A spatial mapping approach to perceptual coding. Vision Research 20: 645–669.
Shadlen, M. N., and W. T. Newsome. (1996). Motion perception: Seeing and deciding. Proceedings of the National Academy of Sciences 93: 628–633.
Shapley, R. (1990). Visual sensitivity and parallel retinocortical channels. Annual Review of Psychology 41: 635–658.
Shapley, R. M., and J. D. Victor. (1979). Nonlinear spatial summation and the contrast gain control of cat retinal ganglion cells. Journal of Physiology (London) 290: 141–161.
Squire, L. R., B. Knowlton, and G. Musen. (1993). The structure and organization of memory. Annual Review of Psychology 44: 453–495.
Stoner, G. R., and T. D. Albright. (1992). Neural correlates of perceptual motion coherence. Nature 358: 412–414.
Stoner, G. R., and T. D. Albright. (1993). Image segmentation cues in motion processing: Implications for modularity in vision. Journal of Cognitive Neuroscience 5: 129–149.
Swindale, N. V. (1980). A model for the formation of ocular dominance stripes. Proceedings of the Royal Society of London Series B, Biological Sciences 208(1171): 243–264.
Tanaka, K. (1997). Columnar organization in the inferotemporal cortex. Cerebral Cortex 12: 469–498.
Teller, D. Y. (1997). First glances: The vision of infants.
The Friedenwald lecture. Investigative Ophthalmology and Visual Science 38: 2183–2203.
Tolhurst, D. J., J. A. Movshon, and A. F. Dean. (1983). The statistical reliability of signals in single neurons in cat and monkey visual cortex. Vision Research 23(8): 775–785.
Tonegawa, S., J. Z. Tsien, T. J. McHugh, P. Huerta, K. I. Blum, and M. A. Wilson. (1996). Hippocampal CA1-region-restricted knockout of NMDAR1 gene disrupts synaptic plasticity, place fields, and spatial learning. Cold Spring Harbor Symposium on Quantitative Biology 61: 225–238.
Ungerleider, L. G., and M. Mishkin. (1982). Two cortical visual systems. In D. J. Ingle, M. A. Goodale, and R. J. W. Mansfield, Eds., Analysis of Visual Behavior. Cambridge, MA: MIT Press, pp. 549–586.
Zola-Morgan, S. (1995). Localization of brain function: The legacy of Franz Joseph Gall. Annual Review of Neuroscience 18: 359–383.

Further Readings

Churchland, P. S., and T. J. Sejnowski. (1992). The Computational Brain. Cambridge, MA: MIT Press.
Cohen, N. J., and H. Eichenbaum. (1993). Memory, Amnesia, and the Hippocampal System. Cambridge, MA: MIT Press.
Dowling, J. E. (1987). The Retina: An Approachable Part of the Brain. Cambridge, MA: Belknap Press of Harvard University Press.
Finger, S. (1994). Origins of Neuroscience: A History of Explorations into Brain Function. New York: Oxford University Press.
Gazzaniga, M. S. (1995). The Cognitive Neurosciences. Cambridge, MA: MIT Press.
Gibson, J. J. (1966). The Senses Considered as Perceptual Systems. Boston: Houghton Mifflin.
Heilman, K. M., and E. Valenstein. (1985). Clinical Neuropsychology, 2nd ed. New York: Oxford University Press.
Helmholtz, H. von. (1924). Physiological Optics. English translation by J. P. C. Southall for the Optical Society of America from the 3rd German ed., Handbuch der Physiologischen Optik (1909). Hamburg: Voss.
Kanizsa, G. (1979). Organization in Vision. New York: Praeger.
Kosslyn, S. M. (1994). Image and Brain.
Cambridge, MA: MIT Press.
LeDoux, J. E. (1996). The Emotional Brain: The Mysterious Underpinnings of Emotional Life. New York: Simon & Schuster.
Milner, A. D., and M. A. Goodale. (1995). The Visual Brain in Action. New York: Oxford University Press.
Penfield, W. (1975). The Mystery of the Mind. New Jersey: Princeton University Press.
Posner, M. I. (1989). Foundations of Cognitive Science. Cambridge, MA: MIT Press.
Squire, L. R. (1987). Memory and Brain. New York: Oxford University Press.
Weiskrantz, L. (1997). Consciousness Lost and Found. New York: Oxford University Press.

Computational Intelligence

Michael I. Jordan and Stuart Russell

There are two complementary views of artificial intelligence (AI): one as an engineering discipline concerned with the creation of intelligent machines, the other as an empirical science concerned with the computational modeling of human intelligence. When the field was young, these two views were seldom distinguished. Since then, a substantial divide has opened up, with the former view dominating modern AI and the latter view characterizing much of modern cognitive science. For this reason, we have adopted the more neutral term “computational intelligence” as the title of this article: both communities are attacking the problem of understanding intelligence in computational terms.

It is our belief that the differences between the engineering models and the cognitively inspired models are small compared to the vast gulf in competence between these models and human levels of intelligence. For humans are, to a first approximation, intelligent; they can perceive, act, learn, reason, and communicate successfully despite the enormous difficulty of these tasks. Indeed, we expect that as further progress is made in trying to emulate this success, the engineering and cognitive models will become more similar.
Already, the traditionally antagonistic “connectionist” and “symbolic” camps are finding common ground, particularly in their understanding of reasoning under uncertainty and learning. This sort of cross-fertilization was a central aspect of the early vision of cognitive science as an interdisciplinary enterprise.

1 Machines and Cognition

The conceptual precursors of AI can be traced back many centuries. LOGIC, the formal theory of deductive reasoning, was studied in ancient Greece, as were ALGORITHMS for mathematical computations. In the late seventeenth century, Wilhelm Leibniz actually constructed simple “conceptual calculators,” but their representational and combinatorial powers were far too limited. In the nineteenth century, Charles Babbage designed (but did not build) a device capable of universal computation, and his collaborator Ada Lovelace speculated that the machine might one day be programmed to play chess or compose music. Fundamental work by ALAN TURING in the 1930s formalized the notion of universal computation; the famous CHURCH-TURING THESIS proposed that all sufficiently powerful computing devices were essentially identical in the sense that any one device could emulate the operations of any other. From here it was a small step to the bold hypothesis that human cognition was a form of COMPUTATION in exactly this sense, and could therefore be emulated by computers.

By this time, neurophysiology had already established that the brain consisted largely of a vast interconnected network of NEURONS that used some form of electrical signalling mechanism. The first mathematical model relating computation and the brain appeared in a seminal paper entitled “A logical calculus of the ideas immanent in nervous activity,” by WARREN MCCULLOCH and WALTER PITTS (1943).
The paper proposed an abstract model of neurons as linear threshold units—logical “gates” that output a signal if the weighted sum of their inputs exceeds a threshold value (see COMPUTING IN SINGLE NEURONS). It was shown that a network of such gates could represent any logical function, and, with suitable delay components to implement memory, would be capable of universal computation. Together with HEBB’s model of learning in networks of neurons, this work can be seen as a precursor of modern NEURAL NETWORKS and connectionist cognitive modeling. Its stress on the representation of logical concepts by neurons also provided impetus to the “logicist” view of AI.

The emergence of AI proper as a recognizable field required the availability of usable computers; this resulted from the wartime efforts led by Turing in Britain and by JOHN VON NEUMANN in the United States. It also required a banner to be raised; this was done with relish by Turing’s (1950) paper “Computing Machinery and Intelligence,” wherein an operational definition for intelligence was proposed (the Turing test) and many future developments were sketched out.

One should not underestimate the level of controversy surrounding AI’s initial phase. The popular press was only too ready to ascribe intelligence to the new “electronic super-brains,” but many academics refused to contemplate the idea of intelligent computers. In his 1950 paper, Turing went to great lengths to catalogue and refute many of their objections. Ironically, one objection already voiced by Kurt Gödel, and repeated up to the present day in various forms, rested on the ideas of incompleteness and undecidability in formal systems to which Turing himself had contributed (see GÖDEL’S THEOREMS and FORMAL SYSTEMS, PROPERTIES OF). Other objectors denied the possibility of CONSCIOUSNESS in computers, and with it the possibility of intelligence.
Turing explicitly sought to separate the two, focusing on the objective question of intelligent behavior while admitting that consciousness might remain a mystery—as indeed it has.

The next step in the emergence of AI was the formation of a research community; this was achieved at the 1956 Dartmouth meeting convened by John McCarthy. Perhaps the most advanced work presented at this meeting was that of ALLEN NEWELL and Herb Simon, whose program of research in symbolic cognitive modeling was one of the principal influences on cognitive psychology and information-processing psychology. Newell and Simon’s IPL languages were the first symbolic programming languages and among the first high-level languages of any kind. McCarthy’s LISP language, developed slightly later, soon became the standard programming language of the AI community and in many ways remains unsurpassed even today.

Contemporaneous developments in other fields also led to a dramatic increase in the precision and complexity of the models that could be proposed and analyzed. In linguistics, for example, work by Chomsky (1957) on formal grammars opened up new avenues for the mathematical modeling of mental structures. NORBERT WIENER developed the field of cybernetics (see CONTROL THEORY and MOTOR CONTROL) to provide mathematical tools for the analysis and synthesis of physical control systems. The theory of optimal control in particular has many parallels with the theory of rational agents (see below), but within this tradition no model of internal representation was ever developed.

As might be expected from so young a field with so broad a mandate that draws on so many traditions, the history of AI has been marked by substantial changes in fashion and opinion. Its early days might be described as the “Look, Ma, no hands!” era, when the emphasis was on showing a doubting world that computers could play chess, learn, see, and do all the other things thought to be impossible.
A wide variety of methods was tried, ranging from general-purpose symbolic problem solvers to simple neural networks. By the late 1960s, a number of practical and theoretical setbacks had convinced most AI researchers that there would be no simple “magic bullet.” The general-purpose methods that had initially seemed so promising came to be called weak methods because their reliance on extensive combinatorial search and first-principles knowledge could not overcome the complexity barriers that were, by that time, seen as unavoidable. The 1970s saw the rise of an alternative approach based on the application of large amounts of domain-specific knowledge, expressed in forms that were close enough to the explicit solution as to require little additional computation. Ed Feigenbaum’s gnomic dictum, “Knowledge is power,” was the watchword of the boom in industrial and commercial application of expert systems in the early 1980s.

When the first generation of expert system technology turned out to be too fragile for widespread use, a so-called AI Winter set in—government funding of AI and public perception of its promise both withered in the late 1980s. At the same time, a revival of interest in neural network approaches led to the same kind of optimism as had characterized “traditional” AI in the early 1980s. Since that time, substantial progress has been made in a number of areas within AI, leading to renewed commercial interest in fields such as data mining (applied machine learning) and a new wave of expert system technology based on probabilistic inference. The 1990s may in fact come to be seen as the decade of probability. Besides expert systems, the so-called Bayesian approach (named after the Reverend Thomas Bayes, eighteenth-century author of the fundamental rule for probabilistic reasoning) has led to new methods in planning, natural language understanding, and learning.
Indeed, it seems likely that work on the latter topic will lead to a reconciliation of symbolic and connectionist views of intelligence.

See also ALGORITHM; CHURCH-TURING THESIS; COMPUTATION; COMPUTING IN SINGLE NEURONS; CONTROL THEORY; GÖDEL’S THEOREMS; FORMAL SYSTEMS, PROPERTIES OF; HEBB, DONALD O.; LOGIC; MCCULLOCH, WARREN; MOTOR CONTROL; NEURAL NETWORKS; NEURON; NEWELL, ALLEN; PITTS, WALTER; TURING, ALAN; VON NEUMANN, JOHN; WIENER, NORBERT

2 Artificial Intelligence: What’s the Problem?

The consensus apparent in modern textbooks (Russell and Norvig 1995; Poole, Mackworth, and Goebel 1997; Nilsson 1998) is that AI is about the design of intelligent agents. An agent is an entity that can be understood as perceiving and acting on its environment. An agent is rational to the extent that its actions can be expected to achieve its goals, given the information available from its perceptual processes. Whereas the Turing test defined only an informal notion of intelligence as emulation of humans, the theory of RATIONAL AGENCY (see also RATIONAL CHOICE THEORY) provides a first pass at a formal specification for intelligent agents, with the possibility of a constructive theory to satisfy this specification. Although the last section of this introduction argues that this specification needs a radical rethinking, the idea of RATIONAL DECISION MAKING has nonetheless been the foundation for most of the current research trends in AI.

The focus on AI as the design of intelligent agents is a fairly recent preoccupation. Until the mid-1980s, most research in “core AI” (that is, AI excluding the areas of robotics and computer vision) concentrated on isolated reasoning tasks, the inputs of which were provided by humans and the outputs of which were interpreted by humans. Mathematical theorem-proving systems, English question-answering systems, and medical expert systems all had this flavor—none of them took actions in any meaningful sense.
The so-called situated movement in AI (see SITUATEDNESS/EMBEDDEDNESS) stressed the point that reasoning is not an end in itself, but serves the purpose of enabling the selection of actions that will affect the reasoner’s environment in desirable ways. Thus, reasoning always occurs in a specific context for specific goals. By removing context and taking responsibility for action selection, AI researchers were in danger of defining a subtask that, although useful, actually had no role in the design of a complete intelligent system. For example, some early medical expert systems were constructed in such a way as to accept as input a complete list of symptoms and to output the most likely diagnosis. This might seem like a useful tool, but it ignores several key aspects of medicine: the crucial role of hypothesis-directed gathering of information, the very complex task of interpreting sensory data to obtain suggestive and uncertain indicators of symptoms, and the overriding goal of curing the patient, which may involve treatments aimed at less likely but potentially dangerous conditions rather than more likely but harmless ones. A second example occurred in robotics. Much research was done on motion planning under the assumption that the locations and shapes of all objects in the environment were known exactly; yet no feasible vision system can, or should, be designed to obtain this information.

When one thinks about building intelligent agents, it quickly becomes obvious that the task environment in which the agent will operate is a primary determiner of the appropriate design. For example, if all relevant aspects of the environment are immediately available to the agent’s perceptual apparatus—as, for example, when playing backgammon—then the environment is said to be fully observable and the agent need maintain no internal model of the world at all.
Backgammon is also discrete as opposed to continuous—that is, there is a finite set of distinct backgammon board states, whereas tennis, say, requires real-valued variables and changes continuously over time. Backgammon is stochastic as opposed to deterministic, because it includes dice rolls and unpredictable opponents; hence an agent may need to make contingency plans for many possible outcomes. Backgammon, unlike tennis, is also static rather than dynamic, in that nothing much happens while the agent is deciding what move to make. Finally, the “physical laws” of the backgammon universe—what the legal moves are and what effect they have—are known rather than unknown. These distinctions alone (and there are many more) define thirty-two substantially different kinds of task environment. This variety of tasks, rather than any true conceptual differences, may be responsible for the variety of computational approaches to intelligence that, on the surface, seem so philosophically incompatible.

See also RATIONAL AGENCY; RATIONAL CHOICE THEORY; RATIONAL DECISION MAKING; SITUATEDNESS/EMBEDDEDNESS

3 Architectures of Cognition

Any computational theory of intelligence must propose, at least implicitly, an INTELLIGENT AGENT ARCHITECTURE. Such an architecture defines the underlying organization of the cognitive processes comprising intelligence, and forms the computational substrate upon which domain-specific capabilities are built. For example, an architecture may provide a generic capability for learning the “physical laws” of the environment, for combining inputs from multiple sensors, or for deliberating about actions by envisioning and evaluating their effects.
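The figure of thirty-two task environments given at the end of the preceding section follows if the five distinctions drawn there (observability, discreteness, determinism, dynamism, and knowledge of the laws) are read as independent binary choices. A minimal Python sketch under that reading; the dictionary keys and the label "partially observable" are illustrative names, not terms fixed by the text:

```python
from itertools import product

# Five binary distinctions from the text. In each pair the first value is the
# one the text assigns to backgammon; key names are assumptions for illustration.
DIMENSIONS = {
    "observability": ("fully observable", "partially observable"),
    "state space": ("discrete", "continuous"),
    "outcomes": ("stochastic", "deterministic"),
    "timing": ("static", "dynamic"),
    "laws": ("known", "unknown"),
}


def task_environments():
    """Return every combination of the five binary properties as a dict."""
    names = list(DIMENSIONS)
    return [dict(zip(names, combo)) for combo in product(*DIMENSIONS.values())]


ENVIRONMENTS = task_environments()

# Backgammon as characterized in the text: the first option on every dimension.
BACKGAMMON = {name: values[0] for name, values in DIMENSIONS.items()}
```

Five two-way choices give 2^5 combinations, and backgammon occupies exactly one of them.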
There is, as yet, no satisfactory theory that defines the range of possible architectures for intelligent systems, or identifies the optimal architecture for a given task environment, or provides a reasonable specification of what is required for an architecture to support “general-purpose” intelligence, either in machines or humans. Some researchers see the observed variety of intelligent behaviors as a consequence of the operation of a unified, general-purpose problem-solving architecture (Newell 1990). Others propose a functional division of the architecture with modules for perception, learning, reasoning, communication, locomotion, and so on (see MODULARITY OF MIND). Evidence from neuroscience (for example, lesion studies) is often interpreted as showing that the brain is divided into areas, each of which performs some function in this sense; yet the functional descriptions (e.g., “language” or “face recognition”) are often subjective and informal and the nature of the connections among the components remains obscure. In the absence of deeper theory, such generalizations from scanty evidence must remain highly suspect. That is, the basic organizational principles of intelligence are still up for grabs.

Proposed architectures vary along a number of dimensions. Perhaps the most commonly cited distinction is between “symbolic” and “connectionist” approaches. These approaches are often thought to be based on fundamentally irreconcilable philosophical foundations. We will argue that, to a large extent, they are complementary; where comparable, they form a continuum.

Roughly speaking, a symbol is an object, part of the internal state of an agent, that has two properties: it can be compared to other symbols to test for equality, and it can be combined with other symbols to form symbol structures.
The symbolic approach to AI, in its purest form, is embodied in the physical symbol system (PSS) hypothesis (Newell and Simon 1972), which proposes that algorithmic manipulation of symbol structures is necessary and sufficient for general intelligence (see also COMPUTATIONAL THEORY OF MIND). The PSS hypothesis, if taken to its extreme, is identical to the view that cognition can be understood as COMPUTATION. Symbol systems can emulate any Turing machine; in particular, they can carry out finite-precision numerical operations and thereby implement neural networks. Most AI researchers interpret the PSS hypothesis more narrowly, ruling out primitive numerical quantities that are manipulated as magnitudes rather than simply tested for (in)equality. The Soar architecture (Newell 1990), which uses PROBLEM SOLVING as its underlying formalism, is the best-developed instantiation of the pure symbolic approach to cognition (see COGNITIVE MODELING, SYMBOLIC).

The symbolic tradition also encompasses approaches to AI that are based on logic. The symbols in the logical languages are used to represent objects and relations among objects, and symbol structures called sentences are used to represent facts that the agent knows. Sentences are manipulated according to certain rules to generate new sentences that follow logically from the original sentences. The details of logical agent design are given in the section on knowledge-based systems; what is relevant here is the use of symbol structures as direct representations of the world. For example, if the agent sees John sitting on the fence, it might construct an internal representation from symbols that represent John, the fence, and the sitting-on relation. If Mary is on the fence instead, the symbol structure would be the same except for the use of a symbol for Mary instead of John. This kind of compositionality of representations is characteristic of symbolic approaches.
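The John and Mary example can be made concrete with a small sketch. It assumes, purely for illustration, that symbols are Python strings (which support the equality test mentioned earlier) and that a symbol structure is a tuple of symbols; the helper names `fact` and `substitute` are hypothetical:

```python
def fact(relation, *args):
    """Build a symbol structure: a relation symbol followed by argument symbols."""
    return (relation, *args)


def substitute(structure, old, new):
    """Return a copy of the structure with each occurrence of `old` replaced by `new`."""
    return tuple(new if symbol == old else symbol for symbol in structure)


# "John is sitting on the fence" as a structure built from three symbols.
john_on_fence = fact("SittingOn", "John", "Fence")

# The structure for Mary differs only in the symbol that fills the first argument.
mary_on_fence = substitute(john_on_fence, "John", "Mary")
```

The two structures share the relation symbol and the symbol for the fence, which is the sense in which the representation is compositional.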
A more restricted kind of compositionality can occur even in much simpler systems. For example, in the network of logical gates proposed by McCulloch and Pitts, we might have a neuron J that is “on” whenever the agent sees John on the fence; and another neuron M that is “on” when Mary is on the fence. Then the proposition “either John or Mary is on the fence” can be represented by a neuron that is connected to J and M with the appropriate connection strengths. We call this kind of representation propositional, because the fundamental elements are propositions rather than symbols denoting objects and relations. In the words of McCulloch and Pitts (1943), the state of a neuron was conceived of as “factually equivalent to a proposition which proposed its adequate stimulus.” We will also extend the standard sense of “propositional” to cover neural networks composed of neurons with continuous real-valued activations, rather than the 1/0 activations in the original McCulloch-Pitts threshold neurons.

It is clear that, in this sense, the raw sensory data available to an agent are propositional. For example, the elements of visual perception are “pixels” whose propositional content is, for example, “this area of my retina is receiving bright red light.” This observation leads to the first difficulty for the symbolic approach: how to move from sensory data to symbolic representations. This so-called symbol grounding problem has been deemed insoluble by some philosophers (see CONCEPTS), thereby dooming the symbolic approach to oblivion. On the other hand, existence proofs of its solubility abound. For example, Shakey, the first substantial robotics project in AI, used symbolic (logical) reasoning for its deliberations, but interacted with the world quite happily (albeit slowly) through video cameras and wheels (see Raphael 1976).
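The propositional encoding with neurons J and M can be sketched as a linear threshold unit of the kind attributed above to McCulloch and Pitts: the unit outputs 1 when the weighted sum of its inputs exceeds a threshold. The particular weights and thresholds below are assumptions chosen to realize OR, AND, and NOT; many other choices would work equally well:

```python
def threshold_unit(weights, threshold):
    """Return a unit that outputs 1 when the weighted input sum exceeds the threshold."""
    def unit(*inputs):
        total = sum(w * x for w, x in zip(weights, inputs))
        return 1 if total > threshold else 0
    return unit


# Inputs are the 1/0 states of neuron J (John on the fence) and neuron M (Mary on the fence).
john_or_mary = threshold_unit(weights=(1, 1), threshold=0.5)   # fires if either input is on
john_and_mary = threshold_unit(weights=(1, 1), threshold=1.5)  # fires only if both are on
not_john = threshold_unit(weights=(-1,), threshold=-0.5)       # fires only if J is off
```

Calling `john_or_mary(1, 0)` evaluates the proposition "either John or Mary is on the fence" for the case in which only J is active. Because OR, AND, and NOT can all be built this way, networks of such units can represent any logical function of their inputs, as the text notes.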
A related problem for purely symbolic approaches is that sensory information about the physical world is usually thought of as numerical—light intensities, forces, strains, frequencies, and so on. Thus, there must at least be a layer of nonsymbolic computation between the real world and the realm of pure symbols. Neither the theory nor the practice of symbolic AI argues against the existence of such a layer, but its existence does open up the possibility that some substantial part of cognition occurs therein without ever reaching the symbolic level.

A deeper problem for the narrow PSS hypothesis is UNCERTAINTY—the unavoidable fact that unreliable and partial sensory information, combined with unreliable and partial theories of how the world works, must leave an agent with some doubt as to the truth of virtually all propositions of interest. For example, the stock market may soon recover this week’s losses, or it may not. Whether to buy, sell, or hold depends on one’s assessment of the prospects. Similarly, a person spotted across a crowded, smoky night club may or may not be an old friend. Whether to wave in greeting depends on how certain one is (and on one’s sensitivity to embarrassment due to waving at complete strangers). Although many decisions under uncertainty can be made without reference to numerical degrees of belief (Wellman 1990), one has a lingering sense that degrees of belief in propositions may be a fundamental component of our mental representations. Accounts of such phenomena based on probability theory are now widely accepted within AI as an augmentation of the purely symbolic view; in particular, probabilistic models are a natural generalization of the logical approach. Recent work has also shown that some connectionist representations (e.g., Boltzmann machines) are essentially identical to probabilistic network models developed in AI (see NEURAL NETWORKS).
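The night-club example can be given a numerical reading. The sketch below is one possible formalization, not one proposed in the text: it assumes that the degree of belief is updated by Bayes' rule from a single noisy observation, and that the decision to wave compares expected gain against expected embarrassment. All probabilities and utilities are invented for illustration:

```python
def posterior(prior, likelihood_if_true, likelihood_if_false):
    """Bayes' rule for a binary hypothesis given one piece of evidence."""
    evidence = prior * likelihood_if_true + (1 - prior) * likelihood_if_false
    return prior * likelihood_if_true / evidence


def should_wave(p_friend, gain_if_friend, embarrassment_if_stranger):
    """Wave when the expected utility of waving exceeds that of not waving (taken as 0)."""
    expected = p_friend * gain_if_friend - (1 - p_friend) * embarrassment_if_stranger
    return expected > 0


# Hypothetical numbers: a 10% prior that the person is the friend, and a glimpse
# that is four times as likely if it is the friend as if it is a stranger.
p = posterior(prior=0.1, likelihood_if_true=0.8, likelihood_if_false=0.2)

easily_embarrassed = should_wave(p, gain_if_friend=1.0, embarrassment_if_stranger=1.0)
thick_skinned = should_wave(p, gain_if_friend=1.0, embarrassment_if_stranger=0.2)
```

With these numbers the same degree of belief leads to different actions for the two agents, which mirrors the remark that the decision depends both on how certain one is and on one's sensitivity to embarrassment.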
The three issues raised in the preceding paragraphs—sensorimotor connections to the external world, handling real-valued inputs and outputs, and robust handling of noisy and uncertain information—are primary motivations for the connectionist approach to cognition. (The existence of networks of neurons in the brain is obviously another.) Neural network models show promise for many low-level tasks such as visual pattern recognition and speech recognition. The most obvious drawback of the connectionist approach is the difficulty of envisaging a means to model higher levels of cognition (see BINDING PROBLEM and COGNITIVE MODELING, CONNECTIONIST), particularly when compared to the ability of symbol systems to generate an unbounded variety of structures from a finite set of symbols (see COMPOSITIONALITY). Some solutions have been proposed (see, for example, BINDING BY NEURAL SYNCHRONY); these solutions provide a plausible neural implementation of symbolic models of cognition, rather than an alternative.

Another problem for connectionist and other propositional approaches is the modeling of temporally extended behavior. Unless the external environment is completely observable by the agent’s sensors, such behavior requires the agent to maintain some internal state information that reflects properties of the external world that are not directly observable. In the symbolic or logical approach, sentences such as “My car is parked at the corner of Columbus and Union” can be stored in “working memory” or in a “temporal knowledge base” and updated as appropriate. In connectionist models, internal states require the use of RECURRENT NETWORKS, which are as yet poorly understood.

In summary, the symbolic and connectionist approaches seem not antithetical but complementary—connectionist models may handle low-level cognition and may (or rather must, in some form) provide a substrate for higher-level symbolic processes.
Probabilistic approaches to representation and reasoning may unify the symbolic and connectionist traditions. It seems that the more relevant distinction is between propositional and more expressive forms of representation.

Related to the symbolic-connectionist debate is the distinction between deliberative and reactive models of cognition. Most AI researchers view intelligent behavior as resulting, at least in part, from deliberation over possible courses of action based on the agent’s knowledge of the world and of the expected results of its actions. This seems self-evident to the average person in the street, but it has always been a controversial hypothesis—according to BEHAVIORISM, it is meaningless. With the development of KNOWLEDGE-BASED SYSTEMS, starting from the famous “Advice Taker” paper by McCarthy (1958), the deliberative model could be put to the test. The core of a knowledge-based agent is the knowledge base and its associated reasoning procedures; the rest of the design follows straightforwardly. First, we need some way of acquiring the necessary knowledge. This could be from experience through MACHINE LEARNING methods, from humans and books through NATURAL LANGUAGE PROCESSING, by direct programming, or through perceptual processes such as MACHINE VISION. Given knowledge of its environment and of its objectives, an agent can reason that certain actions will achieve those objectives and should be executed. At this point, if we are dealing with a physical environment, robotics takes over, handling the mechanical and geometric aspects of motion and manipulation. The following sections deal with each of these areas in turn.

It should be noted, however, that the story in the preceding paragraph is a gross idealization. It is, in fact, close to the view caricatured as good old-fashioned AI (GOFAI) by John Haugeland (1985) and Hubert Dreyfus (1992).
In the five decades since Turing’s paper, AI researchers have discovered that attaining real competence is not so simple—the principal barrier being COMPUTATIONAL COMPLEXITY. The idea of reactive systems (see also AUTOMATA) is to implement direct mappings from perception to action that avoid the expensive intermediate steps of representation and reasoning. This observation was made within the first month of the Shakey project (Raphael 1976) and given new life in the field of BEHAVIOR-BASED ROBOTICS (Brooks 1991). Direct mappings of this kind can be learned from experience or can be compiled from the results of deliberation within a knowledge-based architecture (see EXPLANATION-BASED LEARNING). Most current models propose a hybrid agent design incorporating a variety of decision-making mechanisms, perhaps with capabilities for METAREASONING to control and integrate these mechanisms. Some have even proposed that intelligent systems should be constructed from large numbers of separate agents, each with percepts, actions, and goals of its own (Minsky 1986)—much as a nation’s economy is made up of lots of separate humans. The theory of MULTIAGENT SYSTEMS explains how, in some cases, the goals of the whole agent can be achieved even when each sub-agent pursues its own ends.
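The contrast between deliberation and a compiled reactive mapping can be illustrated on a toy problem. The sketch below assumes a hypothetical five-cell corridor with a goal in the last cell; the deliberative agent consults a model of the effects of its actions, while the reactive agent only looks up a table that was filled in by running the deliberation once per percept. It is a deliberately simplified stand-in for the compilation idea and not an implementation of explanation-based learning:

```python
ACTIONS = {"left": -1, "stay": 0, "right": +1}
GOAL, SIZE = 4, 5  # hypothetical corridor with cells 0..4 and the goal at cell 4


def result(position, action):
    """World model: the cell reached by taking an action, clipped to the corridor."""
    return min(SIZE - 1, max(0, position + ACTIONS[action]))


def deliberate(position):
    """Deliberative choice: simulate every action and pick one ending closest to the goal."""
    return min(ACTIONS, key=lambda a: abs(GOAL - result(position, a)))


# Compile deliberation into a reactive table mapping each percept to an action.
REACTIVE_POLICY = {position: deliberate(position) for position in range(SIZE)}


def reactive_agent(percept):
    """Reactive choice: a single table lookup, with no model consulted at run time."""
    return REACTIVE_POLICY[percept]
```

The table gives the same answers as the deliberation it was compiled from, at the cost of storing one entry per percept, which is feasible only because this environment is tiny and fully observable.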
See also AUTOMATA; BEHAVIORISM; BEHAVIOR-BASED ROBOTICS; BINDING BY NEURAL SYNCHRONY; BINDING PROBLEM; COGNITIVE MODELING, CONNECTIONIST; COGNITIVE MODELING, SYMBOLIC; COMPOSITIONALITY; COMPUTATION; COMPUTATIONAL COMPLEXITY; COMPUTATIONAL THEORY OF MIND; CONCEPTS; EXPLANATION-BASED LEARNING; INTELLIGENT AGENT ARCHITECTURE; KNOWLEDGE-BASED SYSTEMS; MACHINE LEARNING; MACHINE VISION; METAREASONING; MODULARITY OF MIND; MULTIAGENT SYSTEMS; NATURAL LANGUAGE PROCESSING; NEURAL NETWORKS; PROBLEM SOLVING; RECURRENT NETWORKS; UNCERTAINTY

4 Knowledge-Based Systems

The procedural-declarative controversy, which raged in AI through most of the 1970s, was about which way to build AI systems (see, for example, Boden 1977). The procedural view held that systems could be constructed by encoding expertise in domain-specific algorithms—for example, a procedure for diagnosing migraines by asking specific sequences of questions. The declarative view, on the other hand, held that systems should be knowledge-based, that is, composed from domain-specific knowledge—for example, the symptoms typically associated with various ailments—combined with a general-purpose reasoning system. The procedural view stressed efficiency, whereas the declarative view stressed the fact that the overall internal representation can be decomposed into separate sentences, each of which has an identifiable meaning. Advocates of knowledge-based systems often cited the following advantages:

Ease of construction: knowledge-based systems can be constructed simply by encoding domain knowledge extracted from an expert; the system builder need not construct and encode a solution to the problems in the domain.
Flexibility: the same knowledge can be used to answer a variety of questions and as a component in a variety of systems; the same reasoning mechanism can be used for all domains.
Modularity: each piece of knowledge can be identified, encoded, and debugged independently of the other pieces.
Learnability: various learning methods exist that can be used to extract the required knowledge from data, whereas it is very hard to construct programs by automatic means.
Explainability: a knowledge-based system can explain its decisions by reference to the explicit knowledge it contains.

With arguments such as these, the declarative view prevailed and led to the boom in expert systems in the late 1970s and early 1980s. Unfortunately for the field, the early knowledge-based systems were seldom equal to the challenges of the real world, and since then there has been a great deal of research to remedy these failings. The area of KNOWLEDGE REPRESENTATION deals with methods for encoding knowledge in a form that can be processed by a computer to derive consequences. Formal logic is used in various forms to represent definite knowledge. To handle areas where definite knowledge is not available (for example, medical diagnosis), methods have been developed for representation and reasoning under uncertainty, including the extension of logic to so-called NONMONOTONIC LOGICS. All knowledge representation systems need some process for KNOWLEDGE ACQUISITION, and much has been done to automate this process through better interface tools, machine learning methods, and, most recently, extraction from natural language texts. Finally, substantial progress has been made on the question of the computational complexity of reasoning.

See also KNOWLEDGE ACQUISITION; KNOWLEDGE REPRESENTATION; NONMONOTONIC LOGICS

5 Logical Representation and Reasoning

Logical reasoning is appropriate when the available knowledge is definite. McCarthy’s (1958) “Advice Taker” paper proposed first-order logic (FOL) as a formal language for the representation of commonsense knowledge in AI systems.
FOL has sufficient expressive power for most purposes, including the representation of objects, relations among objects, and universally quantified statements about sets of objects. Thanks to work by a long line of philosophers and mathematicians, who were also interested in a formal language for representing general (as well as mathematical) knowledge, FOL came with a well-defined syntax and semantics, as well as the powerful guarantee of completeness: there exists a computational procedure such that, if the answer to a question is entailed by the available knowledge, then the procedure will find that answer (see GÖDEL’S THEOREMS). More expressive languages than FOL generally do not allow completeness—roughly put, there exist theorems in these languages that cannot be proved.

The first complete logical reasoning system for FOL, the resolution method, was devised by Robinson (1965). An intense period of activity followed in which LOGICAL REASONING SYSTEMS were applied to mathematics, automatic programming, planning, and general-knowledge question answering. Theorem-proving systems for full FOL have proved new theorems in mathematics and have found widespread application in areas such as program verification, which spun off from mainstream AI in the early 1970s.

Despite these early successes, AI researchers soon realized that the computational complexity of general-purpose reasoning with full FOL is prohibitive; such systems could not scale up to handle large knowledge bases. A great deal of attention has therefore been given to more restricted languages. Database systems, which have long been distinct from AI, are essentially logical question-answering systems the knowledge bases of which are restricted to very simple sentences about specific objects. Propositional languages avoid objects altogether, representing the world by the discrete values of a fixed set of propositional variables and by logical combinations thereof.
(Most neural network models fall into this category also.) Propositional rea- soning methods based on CONSTRAINT SATISFACTION and GREEDY LOCAL SEARCH have been very successful in real-world applications, but the restricted expressive power of propositional languages severely limits their scope. Much closer to the expressive power of FOL are the languages used in LOGIC PROGRAMMING. Although still allowing most kinds of knowledge to be expressed very naturally, logic program- ming systems such as Prolog provide much more efficient reasoning and can work with extremely large knowledge bases. Reasoning systems must have content with which to reason. Researchers in knowl- edge representation study methods for codifying and reasoning with particular kinds of knowledge. For example, McCarthy (1963) proposed the SITUATION CALCULUS as a way to represent states of the world and the effects of actions within first-order logic. Early versions of the situation calculus suffered from the infamous FRAME PROB- LEM—the apparent need to specify sentences in the knowledge base for all the nonef- fects of actions. Some philosophers see the frame problem as evidence of the impossibility of the formal, knowledge-based approach to AI, but simple technical advances have resolved the original issues. Situation calculus is perhaps the simplest form of TEMPORAL REASONING; other for- malisms have been developed that provide substantially more general frameworks for handling time and extended events. Reasoning about knowledge itself is important par- ticularly when dealing with other agents, and is usually handled by MODAL LOGIC, an extension of FOL. Other topics studied include reasoning about ownership and transac- tions, reasoning about substances (as distinct from objects), and reasoning about physi- cal representations of information. 
A general ontology (literally, a description of existence) ties all these areas together into a unified taxonomic hierarchy of categories. FRAME-BASED SYSTEMS are often used to represent such hierarchies, and use specialized reasoning methods based on inheritance of properties in the hierarchy.

See also CONSTRAINT SATISFACTION; FRAME PROBLEM; FRAME-BASED SYSTEMS; GÖDEL’S THEOREMS; GREEDY LOCAL SEARCH; LOGIC PROGRAMMING; LOGICAL REASONING SYSTEMS; MODAL LOGIC; SITUATION CALCULUS; TEMPORAL REASONING

6 Logical Decision Making

An agent’s job is to make decisions, that is, to commit to particular actions. The connection between logical reasoning and decision making is simple: the agent must conclude, based on its knowledge, that a certain action is best. In philosophy, this is known as practical reasoning. There are many routes to such conclusions. The simplest leads to a reactive system using condition-action rules of the form “If P then do A.” Somewhat more complex reasoning is required when the agent has explicitly represented goals. A goal “G” is a description of a desired state of affairs; for example, one might have the goal “On vacation in the Seychelles.” The practical syllogism, first expounded by Aristotle, says that if G is a goal, and A achieves G, then A should be done. Obviously, this rule is open to many objections: it does not specify which of many eligible As should be done, nor does it account for possibly disastrous side effects of A. Nonetheless, it underlies most forms of decision making in the logical context.

Often, there will be no single action A that achieves the goal G, but a solution may exist in the form of a sequence of actions. Finding such a sequence is called PROBLEM SOLVING, where the word “problem” refers to a task defined by a set of actions, an initial state, a goal, and a set of reachable states.
Much of the early cognitive modeling work of Newell and Simon (1972) focused on problem solving, which was seen as a quintessentially intelligent activity. A great deal of research has been done on efficient algorithms for problem solving in the areas of HEURISTIC SEARCH and GAME-PLAYING SYSTEMS. The “cognitive structure” of such systems is very simple, and problem-solving competence is often achieved by means of searching through huge numbers of possibilities. For example, the Deep Blue chess program, which defeated human world champion Garry Kasparov, often examined over a billion positions prior to each move. Human competence is not thought to involve such computations (see CHESS, PSYCHOLOGY OF).

Most problem-solving algorithms treat the states of the world as atomic; that is, the internal structure of the state representation is not accessible to the algorithm as it considers the possible sequences of actions. This fails to take advantage of two very important sources of power for intelligent systems: the ability to decompose complex problems into subproblems and the ability to identify relevant actions from explicit goal descriptions. For example, an intelligent system should be able to decompose the goal “have groceries and a clean car” into the subgoals “have groceries” and “have a clean car.” Furthermore, it should immediately consider buying groceries and washing the car. Most search algorithms, on the other hand, may consider a variety of action sequences (sitting down, standing up, going to sleep, and so on) before happening on some actions that are relevant.

In principle, a logical reasoning system using McCarthy’s situation calculus can generate the kinds of reasoning behaviors necessary for decomposing complex goals and selecting relevant actions. For reasons of computational efficiency, however, special-purpose PLANNING systems have been developed, originating with the STRIPS planner used by Shakey the Robot (Fikes and Nilsson 1971).
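The contrast between atomic states and structured states can be made concrete with a small sketch. The Python below is an illustration only, not the STRIPS algorithm itself: it borrows the STRIPS-style description of an action by preconditions, an add list, and a delete list, and pairs it with a plain breadth-first forward search. The action names, the facts, and the helper `relevant_actions` are all hypothetical, invented for the groceries-and-clean-car example.

```python
from collections import deque
from typing import NamedTuple


class Action(NamedTuple):
    name: str
    pre: frozenset      # facts that must hold before the action
    add: frozenset      # facts the action makes true
    delete: frozenset   # facts the action makes false


# Hypothetical domain for the "groceries and clean car" example.
ACTIONS = [
    Action("go_to_store", frozenset({"at_home"}), frozenset({"at_store"}), frozenset({"at_home"})),
    Action("buy_groceries", frozenset({"at_store"}), frozenset({"have_groceries"}), frozenset()),
    Action("go_home", frozenset({"at_store"}), frozenset({"at_home"}), frozenset({"at_store"})),
    Action("wash_car", frozenset({"at_home"}), frozenset({"clean_car"}), frozenset()),
]

GOAL = frozenset({"have_groceries", "clean_car"})


def relevant_actions(state, goal):
    """Names of actions whose add list mentions a goal fact not yet true."""
    unmet = goal - state
    return sorted(a.name for a in ACTIONS if a.add & unmet)


def plan(initial, goal):
    """Breadth-first forward search; returns a shortest list of action names."""
    start = frozenset(initial)
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        state, steps = frontier.popleft()
        if goal <= state:
            return steps
        for a in ACTIONS:
            if a.pre <= state:
                nxt = (state - a.delete) | a.add
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, steps + [a.name]))
    return None  # goal unreachable
```

Because the goal and the action effects are explicit sets of facts, `relevant_actions` can pick out buying groceries and washing the car directly from the goal description, which an algorithm working on opaque, atomic states could not do.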
Modern planners have been applied to logistical problems that are, in some cases, too complex for humans to handle effectively.

See also CHESS, PSYCHOLOGY OF; GAME-PLAYING SYSTEMS; HEURISTIC SEARCH; PLANNING; PROBLEM-SOLVING

7 Representation and Reasoning under Uncertainty

In many areas to which one might wish to apply knowledge-based systems, the available knowledge is far from definite. For example, a person who experiences recurrent headaches may suffer from migraines or a brain tumor. A logical reasoning system can represent this sort of disjunctive information, but cannot represent or reason with the belief that migraine is a more likely explanation. Such reasoning is obviously essential for diagnosis, and has turned out to be central for expert systems in almost all areas. The theory of probability (see PROBABILITY, FOUNDATIONS OF) is now widely accepted as the basic calculus for reasoning under uncertainty (but see FUZZY LOGIC for a complementary view). Questions remain as to whether it is a good model for human reasoning (see TVERSKY and PROBABILISTIC REASONING), but within AI many of the computational and representational problems that deterred early researchers have been resolved. The adoption of a probabilistic approach has also created rich connections with statistics and control theory.

Standard probability theory views the world as composed of a set of interrelated random variables whose values are initially unknown. Knowledge comes in the form of prior probability distributions over the possible assignments of values to subsets of the random variables. Then, when evidence is obtained about the values of some of the variables, inference algorithms can infer posterior probabilities for the remaining unknown variables.
Early attempts to use probabilistic reasoning in AI came up against complexity barriers very soon, because the number of probabilities that make up the prior probability distribution can grow exponentially in the number of variables considered.

Starting in the early 1980s, researchers in AI, decision analysis, and statistics developed what are now known as BAYESIAN NETWORKS (Pearl 1988). These networks give structure to probabilistic knowledge bases by expressing conditional independence relationships among the variables. For example, given the actual temperature, the temperature measurements of two thermometers are independent. In this way, Bayesian networks capture our intuitive notions of the causal structure of the domain of application. In most cases, the number of probabilities that must be specified in a Bayesian network grows only linearly with the number of variables. Such systems can therefore handle quite large problems, and applications are very widespread. Moreover, methods exist for learning Bayesian networks from raw data (see BAYESIAN LEARNING), making them a natural bridge between the symbolic and neural-network approaches to AI.

In earlier sections, we have stressed the importance of the distinction between propositional and first-order languages. So far, probability theory has been limited to essentially propositional representations; this prevents its application to the more complex forms of cognition addressed by first-order methods. The attempt to unify probability theory and first-order logic, two of the most fundamental developments in the history of mathematics and philosophy, is among the more important topics in current AI research.
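The thermometer example can be written out as a very small Bayesian network. The sketch below is illustrative only: the binary variables, the probability values, and the function names are assumptions made for the example, and inference is done by brute-force enumeration rather than by any of the specialized algorithms used in practice. It shows how the joint distribution factors through the conditional independence of the two readings given the temperature, and how the number of parameters compares with that of an unstructured joint distribution.

```python
# Hypothetical numbers, chosen only for illustration.
P_TEMP = {"hot": 0.3, "cold": 0.7}             # prior P(T)
P_HIGH_GIVEN_TEMP = {"hot": 0.9, "cold": 0.2}  # P(reading = "high" | T), same for each thermometer


def p_reading(reading, temp):
    """P(M = reading | T = temp) for a single thermometer."""
    p_high = P_HIGH_GIVEN_TEMP[temp]
    return p_high if reading == "high" else 1.0 - p_high


def joint(temp, m1, m2):
    """Network factorization: P(T, M1, M2) = P(T) * P(M1 | T) * P(M2 | T)."""
    return P_TEMP[temp] * p_reading(m1, temp) * p_reading(m2, temp)


def posterior_temp(m1, m2):
    """P(T | M1 = m1, M2 = m2), obtained by enumeration and normalization."""
    unnormalized = {t: joint(t, m1, m2) for t in P_TEMP}
    total = sum(unnormalized.values())
    return {t: p / total for t, p in unnormalized.items()}


def parameter_counts(n_thermometers):
    """Independent parameters: structured network vs. full joint table (binary variables)."""
    network = 1 + 2 * n_thermometers            # one for P(T), two per thermometer
    full_joint = 2 ** (n_thermometers + 1) - 1  # every joint assignment but one
    return network, full_joint
```

With these assumed numbers, two "high" readings raise the probability of "hot" from the prior 0.3 to 0.243 / 0.271. For twenty thermometers, `parameter_counts(20)` gives 41 parameters for the network against 2,097,151 for the full joint table, which is the linear-versus-exponential contrast described above.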
See also BAYESIAN LEARNING; BAYESIAN NETWORKS; FUZZY LOGIC; PROBABILISTIC REASONING; PROBABILITY, FOUNDATIONS OF; TVERSKY, AMOS

8 Decision Making under Uncertainty

Just as logical reasoning is connected to action through goals, probabilistic reasoning is connected to action through utilities, which describe an agent’s preferences for some states over others. It is a fundamental result of UTILITY THEORY (see also RATIONAL CHOICE THEORY) that an agent whose preferences obey certain rationality constraints, such as transitivity, can be modeled as possessing a utility function that assigns a numerical value to each possible state. Furthermore, RATIONAL DECISION MAKING consists of selecting an action to maximize the expected utility of outcome states. An agent that makes rational decisions will, on average, do better than an agent that does not, at least as far as satisfying its own preferences is concerned.

In addition to their fundamental contributions to utility theory, von Neumann and Morgenstern (1944) also developed GAME THEORY to handle the case where the environment contains other agents, which must be modeled as independent utility maximizers. In some game-theoretic situations, it can be shown that optimal behavior must be randomized. Additional complexities arise when dealing with so-called sequential decision problems, which are analogous to planning problems in the logical case. DYNAMIC PROGRAMMING algorithms, developed in the field of operations research, can generate optimal behavior for such problems. (See also the discussion of REINFORCEMENT LEARNING in section 9, Learning.)

In a sense, the theory of rational decision making provides a zeroth-order theory of intelligence, because it provides an operational definition of what an agent ought to do in any situation.
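That operational definition, choosing the action with the highest expected utility, fits in a few lines. The following Python is a minimal sketch for a single one-shot decision; the umbrella scenario, the outcome names, the probabilities, and the utility values are all hypothetical and serve only to make the arithmetic visible.

```python
def expected_utility(action, outcome_probs, utility):
    """Sum over outcomes of P(outcome | action) * U(outcome)."""
    return sum(p * utility[outcome] for outcome, p in outcome_probs[action].items())


def best_action(outcome_probs, utility):
    """The action that maximizes expected utility."""
    return max(outcome_probs, key=lambda a: expected_utility(a, outcome_probs, utility))


# Hypothetical decision problem: rain is assumed to occur with probability 0.3.
OUTCOME_PROBS = {
    "take_umbrella": {"rain_with_umbrella": 0.3, "sun_with_umbrella": 0.7},
    "leave_umbrella": {"rain_without_umbrella": 0.3, "sun_without_umbrella": 0.7},
}
UTILITY = {
    "rain_with_umbrella": 70,
    "sun_with_umbrella": 80,
    "rain_without_umbrella": 0,
    "sun_without_umbrella": 100,
}
```

Under these assumed numbers, taking the umbrella has expected utility 0.3 × 70 + 0.7 × 80 = 77 and leaving it has 0.3 × 0 + 0.7 × 100 = 70, so `best_action` selects `take_umbrella`. The sketch enumerates every action and outcome explicitly, which is only possible because the example is tiny.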
Virtually every problem an agent faces, including such problems as how to gather information and how to update its beliefs given that information, can be formulated within the theory and, in principle, solved. What the theory ignores is the question of complexity, which we discuss in the final section of this introduction.

See also DYNAMIC PROGRAMMING; GAME THEORY; RATIONAL CHOICE THEORY; RATIONAL DECISION MAKING; REINFORCEMENT LEARNING; UTILITY THEORY

9 Learning

LEARNING has been a central aspect of AI from its earliest days. It is immediately apparent that learning is a vital characteristic of any intelligent system that has to deal with changing environments. Learning may also be the only way in which complex and competent systems can be constructed, a proposal stated clearly by Turing (1950), who devoted a quarter of his paper to the topic. Perhaps the first major public success for AI was Arthur Samuel’s (1959) checker-playing system, which learned to play checkers to a level far superior to its creator’s abilities and attracted substantial television coverage. State-of-the-art systems in almost all areas of AI now use learning so that the system designer does not have to anticipate and provide knowledge to handle every possible contingency. In some cases, for example speech recognition, humans are simply incapable of providing the necessary knowledge accurately.

The discipline of machine learning has become perhaps the largest subfield of AI as well as a meeting point between AI and various other engineering disciplines concerned with the design of autonomous, robust systems. An enormous variety of learning systems has been studied in the AI literature, but once superficial differences are stripped away, there seem to be a few core principles at work.
To reveal these principles it helps to classify a given learning system along a number of dimensions: (1) the type of feedback available, (2) the component of the agent to be improved, (3) how that component is represented, and (4) the role of prior knowledge. It is also important to be aware that there is a tradeoff between learning and inference, and different systems rely more on one than on the other.

The type of feedback available is perhaps the most useful categorizer of learning algorithms. Broadly speaking, learning algorithms fall into the categories of supervised learning, unsupervised learning, and reinforcement learning. Supervised learning algorithms (see, e.g., DECISION TREES and SUPERVISED LEARNING IN MULTILAYER NEURAL NETWORKS) require that a target output is available for every input, an assumption that is natural in some situations (e.g., categorization problems with labeled data, imitation problems, and prediction problems, in which the present can be used as a target for a prediction based on the past). UNSUPERVISED LEARNING algorithms simply find structure in an ensemble of data, whether or not this structure is useful for a particular classification or prediction (examples include clustering algorithms, dimensionality-reducing algorithms, and algorithms that find independent components). REINFORCEMENT LEARNING algorithms require an evaluation signal that gives some measure of progress without necessarily providing an example of correct behavior. Reinforcement learning research has had a particular focus on temporal learning problems, in which the evaluation arrives after a sequence of responses.

The different components of an agent generally have different kinds of representational and inferential requirements. Sensory and motor systems must interface with the physical world and therefore generally require continuous representations and smooth input-output behavior.
In such situations, neural networks have provided a useful class of architectures, as have probabilistic systems such as HIDDEN MARKOV MODELS and Bayesian networks. The latter models also are generally characterized by a clear propositional semantics, and as such have been exploited for elementary cognitive processing. Decision trees are also propositional systems that are appropriate for simple cognitive tasks. There are variants of decision trees that utilize continuous representations, and these have close links with neural networks, as well as variants of decision trees that utilize relational machinery, making a connection with INDUCTIVE LOGIC PROGRAMMING. The latter class of architecture provides the full power of first-order logic and the capability of learning complex symbolic theories.

Prior knowledge is an important component of essentially all modern learning architectures, particularly so in architectures that involve expressive representations. Indeed, the spirit of inductive logic programming is to use the power of logical inference to bootstrap background knowledge and to interpret new data in the light of that knowledge. This approach is carried to what is perhaps its (logical) extreme in the case of EXPLANATION-BASED LEARNING (EBL), in which the system uses its current theory to explain a new observation, and extracts from that explanation a useful rule for future use. EBL can be viewed as a form of generalized caching, also called speedup learning. CASE-BASED REASONING AND ANALOGY provides an alternate route to the same end through the solution of problems by reference to previous experience instead of first principles.

Underlying all research on learning is a version of the general problem of INDUCTION; in particular, on what basis can we expect that a system that performs well on past “training” data should also perform well on future “test” data?
The theory of learning (see COMPUTATIONAL LEARNING THEORY and STATISTICAL LEARNING THEORY) attacks this problem by assuming that the data provided to a learner is obtained from a fixed but unknown probability distribution. The theory yields a notion of sample complexity, which quantifies the amount of data that a learner must see in order to expect, with high probability, to perform (nearly) as well in the future as in the past. The theory also provides support for the intuitive notion of Ockham’s razor: the idea that if a simple hypothesis performs as well as a complex hypothesis, one should prefer the simple hypothesis (see PARSIMONY AND SIMPLICITY).

General ideas from probability theory in the form of Bayesian learning, as well as related ideas from INFORMATION THEORY in the form of the MINIMUM DESCRIPTION LENGTH approach, provide a link between learning theory and learning practice. In particular, Bayesian learning, which views learning as the updating of probabilistic beliefs in hypotheses given evidence, naturally embodies a form of Ockham’s razor. Bayesian methods have been applied to neural networks, Bayesian networks, decision trees, and many other learning architectures.

We have seen that learning has strong relationships to knowledge representation and to the study of uncertainty. There are also important connections between learning and search. In particular, most learning algorithms involve some form of search through the hypothesis space to find hypotheses that are consistent (or nearly so) with the data and with prior expectations. Standard heuristic search algorithms are often invoked, either explicitly or implicitly, to perform this search. EVOLUTIONARY COMPUTATION also treats learning as a search process, in which the “hypothesis” is an entire agent, and learning takes place by “mutation” and “natural selection” of agents that perform well (see also ARTIFICIAL LIFE).
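The view of learning as search by mutation and selection can be illustrated with a deliberately simple sketch. In the Python below, an "agent" is just a list of bits and its performance is the number of bits that agree with a hidden target; the encoding, the fitness measure, the mutation rate, and the population size are all assumptions made for illustration and are not drawn from any particular evolutionary algorithm in the literature.

```python
import random


def fitness(agent, target):
    """Number of positions at which the agent agrees with the target."""
    return sum(a == t for a, t in zip(agent, target))


def mutate(agent, rng, rate=0.1):
    """Flip each bit independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in agent]


def evolve(target, pop_size=20, generations=50, seed=0):
    """Keep the better half of the population and refill it with mutated copies."""
    rng = random.Random(seed)
    n = len(target)
    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    best_per_generation = []
    for _ in range(generations):
        population.sort(key=lambda a: fitness(a, target), reverse=True)
        best_per_generation.append(fitness(population[0], target))
        survivors = population[: pop_size // 2]          # "natural selection"
        children = [mutate(rng.choice(survivors), rng)   # "mutation"
                    for _ in range(pop_size - len(survivors))]
        population = survivors + children
    best = max(population, key=lambda a: fitness(a, target))
    return best, best_per_generation
```

A call such as `evolve([1] * 16)` returns the best agent found and the best fitness recorded at each generation. Because the survivors are carried over unchanged, that recorded fitness never decreases, although nothing in the sketch guarantees that the target is reached within a given number of generations.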
There are also interesting links between learning and planning; in particular, it is possible to view reinforcement learning as a form of “on-line” planning.

Finally, it is worth noting that learning has been a particularly successful branch of AI research in terms of its applications to real-world problems in specific fields; see for example the articles on PATTERN RECOGNITION AND FEEDFORWARD NETWORKS, STATISTICAL TECHNIQUES IN NATURAL LANGUAGE PROCESSING, VISION AND LEARNING, and ROBOTICS AND LEARNING.

See also ARTIFICIAL LIFE; CASE-BASED REASONING AND ANALOGY; COMPUTATIONAL LEARNING THEORY; DECISION TREES; EVOLUTIONARY COMPUTATION; HIDDEN MARKOV MODELS; INDUCTION; INDUCTIVE LOGIC PROGRAMMING; INFORMATION THEORY; MINIMUM DESCRIPTION LENGTH; PARSIMONY AND SIMPLICITY; PATTERN RECOGNITION AND FEEDFORWARD NETWORKS; ROBOTICS AND LEARNING; STATISTICAL LEARNING THEORY; STATISTICAL TECHNIQUES IN NATURAL LANGUAGE PROCESSING; SUPERVISED LEARNING IN MULTILAYER NEURAL NETWORKS; UNSUPERVISED LEARNING; VISION AND LEARNING

10 Language

NATURAL LANGUAGE PROCESSING, or NLP (the ability to perceive, understand, and generate language), is an essential part of HUMAN-COMPUTER INTERACTION as well as the most obvious task to be solved in passing the Turing test. As with logical reasoning, AI researchers have benefited from a pre-existing intellectual tradition. The field of linguistics (see also LINGUISTICS, PHILOSOPHICAL ISSUES) has produced formal notions of SYNTAX and SEMANTICS, the view of utterances as speech acts, and very careful philosophical analyses of the meanings of various constructs in natural language. The field of COMPUTATIONAL LINGUISTICS has grown up since the 1960s as a fertile union of ideas from AI, cognitive science, and linguistics.

As soon as programs were written to process natural language, it became obvious that the problem was much harder than had been anticipated.
In the United States substantial effort was devoted to Russian-English translation from 1957 onward, but in 1966 a government report concluded that “there has been no machine translation of general scientific text, and none is in immediate prospect.” Successful MACHINE TRANSLATION appeared to require an understanding of the content of the text; the barriers included massive ambiguity (both syntactic and semantic), a huge variety of word senses, and the vast numbers of idiosyncratic ways of using words to convey meanings. Overcoming these barriers seems to require the use of large amounts of commonsense knowledge and the ability to reason with it; in other words, solving a large fraction of the AI problem. For this reason, Robert Wilensky has described natural language processing as an “AI-complete” problem (see also MODULARITY AND LANGUAGE).

Research in NLP has uncovered a great deal of new information about language. There is a better appreciation of the actual syntax of natural language, as opposed to the vastly oversimplified models that held sway before computational investigation was possible. Several new families of FORMAL GRAMMARS have been proposed as a result. In the area of semantics, dozens of interesting phenomena have surfaced; for example, the surprising range of semantic relationships in noun-noun pairs such as “alligator shoes” and “baby shoes.” In the area of DISCOURSE understanding, researchers have found that grammaticality is sometimes thrown out of the window, leading some to propose that grammar itself is not a useful construct for NLP.

One consequence of the richness of natural language is that it is very difficult to build by hand a system capable of handling anything close to the full range of phenomena. Most systems constructed prior to the 1990s functioned only in predefined and highly circumscribed domains.
Stimulated in part by the availability of large online text corpora, the use of STATISTICAL TECHNIQUES IN NATURAL LANGUAGE PROCESSING has created something of a revolution. Instead of building complex grammars by hand, these techniques train very large but very simple probabilistic grammars and semantic models from millions of words of text. These techniques have reached the point where they can be usefully applied to extract information from general newspaper articles.

Few researchers expect simple probability models to yield human-level understanding. On the other hand, the view of language entailed by this approach (that the text is a form of evidence from which higher-level facts can be inferred by a process of probabilistic inference) may prove crucial for further progress in NLP. A probabilistic framework allows the smooth integration of the multiple “cues” required for NLP, such as syntax, semantics, discourse conventions, and prior expectations.

In contrast to the general problem of natural language understanding, the problem of SPEECH RECOGNITION IN MACHINES may be feasible without recourse to general knowledge and reasoning capabilities. The statistical approach was taken much earlier in the speech field, beginning in the mid-1970s. Together with improvements in the signal processing methods used to extract acoustic features, this has led to steady improvements in performance, to the point where commercial systems can handle dictated speech with over 95 percent accuracy. The combination of speech recognition and SPEECH SYNTHESIS (see also NATURAL LANGUAGE GENERATION) promises to make interaction with computers much more natural for humans. Unfortunately, accuracy rates for natural dialogue seldom exceed 75 percent; possibly, speech systems will have to rely on knowledge-based expectations and real understanding to make further progress.
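The idea of training a very simple probabilistic model directly from text, rather than writing a grammar by hand, can be seen in miniature in a bigram model. The Python below is a sketch under strong simplifying assumptions: the three-sentence corpus is invented, probabilities are plain relative frequencies with no smoothing, and the sentence-boundary markers `<s>` and `</s>` are a common convention rather than anything specified in the text above.

```python
from collections import Counter

# A toy corpus; real systems are trained on millions of words.
CORPUS = ["the dog barks", "the dog sleeps", "the cat sleeps"]


def train_bigram(sentences):
    """Count word pairs and the contexts (first words) they occur in."""
    bigrams, contexts = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, word in zip(tokens, tokens[1:]):
            bigrams[(prev, word)] += 1
            contexts[prev] += 1
    return bigrams, contexts


def prob(word, prev, model):
    """Relative-frequency estimate of P(word | prev); 0.0 for unseen contexts."""
    bigrams, contexts = model
    if contexts[prev] == 0:
        return 0.0
    return bigrams[(prev, word)] / contexts[prev]


def sentence_prob(sentence, model):
    """Probability of a whole sentence as a product of bigram probabilities."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    p = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        p *= prob(word, prev, model)
    return p
```

Trained on this corpus, the model assigns P(dog | the) = 2/3 and gives the sentence "the cat sleeps" probability 1/3, while "the cat barks" receives probability zero because the pair "cat barks" never occurs. That last result shows why practical systems add smoothing, which this sketch omits.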
See also COMPUTATIONAL LINGUISTICS; DISCOURSE; FORMAL GRAMMARS; HUMAN-COMPUTER INTERACTION; LINGUISTICS, PHILOSOPHICAL ISSUES; MACHINE TRANSLATION; MODULARITY AND LANGUAGE; NATURAL LANGUAGE GENERATION; NATURAL LANGUAGE PROCESSING; SEMANTICS; SPEECH RECOGNITION IN MACHINES; SPEECH SYNTHESIS; STATISTICAL TECHNIQUES IN NATURAL LANGUAGE PROCESSING; SYNTAX

11 Vision

The study of vision presents a number of advantages: visual processing systems are present across a wide variety of species, they are reasonably accessible experimentally (psychophysically, neuropsychologically, and neurophysiologically), and a wide variety of artificial imaging systems are available that are sufficiently similar to their natural counterparts so as to make research in machine vision highly relevant to research in natural vision. An integrated view of the problem has emerged, linking research in COMPUTATIONAL VISION, which is concerned with the development of explicit theories of human and animal vision, with MACHINE VISION, which is concerned with the development of an engineering science of vision.

Computational approaches to vision, including the influential theoretical framework of MARR, generally involve a succession of processes that begin with localized numeric operations on images (so-called early vision) and proceed toward the high-level abstractions thought to be involved in OBJECT RECOGNITION. The current view is that the interpretation of complex scenes involves inference in both the bottom-up and top-down directions (see also TOP-DOWN PROCESSING IN VISION).

High-level object recognition is not the only purpose of vision. Representations at intermediate levels can also be an end unto themselves, directly subserving control processes of orienting, locomotion, reaching, and grasping. Visual analysis at all levels can be viewed as a process of recovering aspects of the visual scene from its projection onto a 2-D image.
Visual properties such as shape and TEXTURE behave in lawful ways under the geometry of perspective projection, and understanding this geometry has been a focus of research. Related geometrical issues have been studied in STEREO AND MOTION PERCEPTION, where the issue of finding correspondences between multiple images also arises. In all of these cases, localized spatial and temporal cues are generally highly ambiguous with respect to the aspects of the scene from which they arise, and algorithms that recover such aspects generally involve some form of spatial or temporal integration.

It is also important to prevent integrative processes from wrongly smoothing across discontinuities that correspond to visually meaningful boundaries. Thus, visual processing also requires segmentation. Various algorithms have been studied for the segmentation of image data. Again, an understanding of projective geometry has been a guide for the development of such algorithms. Integration and segmentation are also required at higher levels of visual processing, where more abstract principles (such as those studied by GESTALT PSYCHOLOGY; see GESTALT PERCEPTION) are needed to group visual elements.

Finally, in many cases the goal of visual processing is to detect or recognize objects in the visual scene. A number of difficult issues arise in VISUAL OBJECT RECOGNITION, including the issue of what kinds of features should be used (2-D or 3-D, edge-based or filter-based), how to deal with missing features (e.g., due to occlusion or shadows), how to represent flexible objects (such as humans), and how to deal with variations in pose and lighting. Methods based on learning (cf. VISION AND LEARNING) have played an increasingly important role in addressing some of these issues.
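The ambiguity of local image cues follows directly from the geometry of perspective projection. The sketch below uses the idealized pinhole camera model, in which a scene point (X, Y, Z) is mapped to the image point (fX/Z, fY/Z); the function name and the choice of a unit focal length are assumptions made for the example, and real cameras require additional calibration parameters that are ignored here.

```python
def project(point, focal_length=1.0):
    """Pinhole perspective projection of a 3-D point onto the image plane."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    return (focal_length * x / z, focal_length * y / z)


# Two different scene points on the same ray through the camera center.
near_point = (1.0, 2.0, 4.0)
far_point = (2.0, 4.0, 8.0)
```

Both `project(near_point)` and `project(far_point)` give the image point (0.25, 0.5). A single image measurement therefore cannot distinguish a small nearby object from a larger distant one, which is one reason recovering the scene requires integrating evidence across space, time, or multiple views.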
See also COMPUTATIONAL VISION; GESTALT PERCEPTION; GESTALT PSYCHOLOGY; MARR, DAVID; OBJECT RECOGNITION, ANIMAL STUDIES; OBJECT RECOGNITION, HUMAN NEUROPSYCHOLOGY; STEREO AND MOTION PERCEPTION; TEXTURE; TOP-DOWN PROCESSING IN VISION; VISION AND LEARNING; VISUAL OBJECT RECOGNITION, AI

12 Robotics

Robotics is the control of physical effectors to achieve physical tasks such as navigation and assembly of complex objects. Effectors include grippers and arms to perform MANIPULATION AND GRASPING and wheels and legs for MOBILE ROBOTS and WALKING AND RUNNING MACHINES. The need to interact directly with a physical environment, which is generally only partially known and partially controllable, brings certain issues to the fore in robotics that are often skirted in other areas in AI. One important set of issues arises from the fact that environments are generally dynamical systems, characterizable by a large (perhaps infinite) collection of real-valued state variables, whose values are not generally directly observable by the robot (i.e., they are “hidden”). The presence of the robot control algorithm itself as a feedback loop in the environment introduces additional dynamics. The robot designer must be concerned with the issue of stability in such a situation. Achieving stability not only prevents disasters but also simplifies the dynamics, providing a degree of predictability that is essential for the success of planning algorithms.

Stability is a key issue in manipulation and grasping, where the robot must impart a distributed pattern of forces and torques to an object so as to maintain a desired position and orientation in the presence of external disturbances (such as gravity). Research has tended to focus on static stability (ignoring the dynamics of the grasped object).
Static stability is also of concern in the design of walking and running robots, although rather more pertinent is the problem of dynamic stability, in which a moving robot is stabilized by taking advantage of its inertial dynamics.

Another important set of issues in robotics has to do with uncertainty. Robots are generally equipped with a limited set of sensors, and these sensors are generally noisy and inherently ambiguous. To a certain extent the issue is the same as that treated in the preceding discussion of vision, and the solutions, involving algorithms for integration and smoothing, are often essentially the same. In robotics, however, the sensory analysis is generally used to subserve a control law, and the exigencies of feedback control introduce new problems (cf. CONTROL THEORY). Processing time must be held to a minimum and the system must focus on obtaining only that information needed for control. These objectives can be difficult to meet, and recent research in robotics has focused on minimizing the need for feedback, designing sequences of control actions that are guaranteed to bring objects into desired positions and orientations regardless of the initial conditions.

Uncertainty is due not only to noisy sensors and hidden states, but also to ignorance about the structure of the environment. Many robot systems actively model the environment, using system identification techniques from control theory, as well as more general supervised and unsupervised methods from machine learning. Specialized representations are often used to represent obstacles (“configuration space”) and location in space (graphs and grids). Probabilistic approaches are often used to explicitly represent and manipulate uncertainty within these formalisms.
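One common way to represent uncertainty over a grid of locations is to keep a probability for each cell and revise it as the robot senses and moves. The Python below is a minimal sketch of such a discrete Bayes filter on a one-dimensional circular corridor; the corridor layout, the sensor accuracy `p_correct`, and the motion reliability `p_move` are hypothetical values chosen for illustration.

```python
# Hypothetical corridor: what a perfect sensor would report in each cell.
WORLD = ["door", "wall", "wall", "door", "wall"]


def measurement_update(belief, world, observation, p_correct=0.8):
    """Bayes rule: weight each cell by the likelihood of the observation, then normalize."""
    weighted = [
        b * (p_correct if cell == observation else 1.0 - p_correct)
        for b, cell in zip(belief, world)
    ]
    total = sum(weighted)
    return [w / total for w in weighted]


def motion_update(belief, p_move=0.9):
    """Robot tries to move one cell to the right (circular world); it stays put otherwise."""
    n = len(belief)
    return [
        p_move * belief[(i - 1) % n] + (1.0 - p_move) * belief[i]
        for i in range(n)
    ]


uniform = [1.0 / len(WORLD)] * len(WORLD)
after_sensing = measurement_update(uniform, WORLD, "door")
after_moving = motion_update(after_sensing)
```

Starting from a uniform belief, observing a door concentrates probability 4/11 on each of the two door cells and leaves 1/11 on each wall cell. The motion step then shifts most of that mass one cell to the right while blurring it slightly, reflecting the fact that sensing reduces uncertainty and acting increases it.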
In classical robotic control methodology, the system attempts to recover as much of the state of the environment as possible, operates on the internal representation of the state using general planning and reasoning algorithms, and chooses a sequence of control actions to implement the selected plan. The sheer complexity of designing this kind of architecture has led researchers to investigate simpler architectures that make do with minimal internal state. BEHAVIOR-BASED ROBOTICS approaches the problem via an interacting set of elemental processes called “behaviors,” each of which is a simplified control law relating sensations and actions. REINFORCEMENT LEARNING has provided algorithms that utilize simplified evaluation signals to guide a search for improved laws; over time these algorithms approach the optimal plans that are derived (with more computational effort) from explicit planning algorithms (see ROBOTICS AND LEARNING).

See also BEHAVIOR-BASED ROBOTICS; CONTROL THEORY; MANIPULATION AND GRASPING; MOBILE ROBOTS; REINFORCEMENT LEARNING; ROBOTICS AND LEARNING; WALKING AND RUNNING MACHINES

13 Complexity, Rationality, and Intelligence

We have observed at several points in this introduction that COMPUTATIONAL COMPLEXITY is a major problem for intelligent agents. To the extent that they can be analyzed, most of the problems of perceiving, learning, reasoning, and decision making are believed to have a worst-case complexity that is at least exponential in the size of the problem description. Exponential complexity means that, for example, a problem of size 100 would take 10 billion years to solve on the fastest available computers. Given that humans face much larger problems than this all the time (we receive as input several billion bytes of information every second), one wonders how we manage at all.
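The arithmetic behind such estimates is easy to reproduce. The text does not say how many steps a problem of size 100 requires or how fast the computer is, so the sketch below makes two explicit assumptions: the algorithm takes 2^n elementary steps on a problem of size n, and the machine performs 10^12 steps per second. Both figures, and the function name, are illustrative choices rather than values taken from the source.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_to_solve(n, steps_per_second, base=2):
    """Years needed for base**n elementary steps at the given speed."""
    return base ** n / steps_per_second / SECONDS_PER_YEAR


# Assumed machine speed: 10**12 steps per second.
small = years_to_solve(30, 1e12)    # a size-30 problem: far less than a second
large = years_to_solve(100, 1e12)   # a size-100 problem: tens of billions of years
```

Under these assumptions a problem of size 100 needs on the order of tens of billions of years, the same order of magnitude as the figure quoted above. Each unit increase in problem size doubles the running time, so a machine a thousand times faster extends the feasible problem size by only about ten.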
Of course, there are a number of mitigating factors: an intelligent agent must deal largely with the typical case, not the worst case, and accumulated experience with similar problems can greatly reduce the difficulty of new problems. The fact remains, however, that humans cannot even come close to achieving perfectly rational behavior; most of us do fairly poorly even on problems such as chess, which is an infinitesimal subset of the real world. What, then, is the right thing for an agent to do, if it cannot possibly compute the right thing to do?

In practical applications of AI, one possibility is to restrict the allowable set of problems to those that are efficiently soluble. For example, deductive database systems use restricted subsets of logic that allow for polynomial-time inference. Such research has given us a much deeper understanding of the sources of complexity in reasoning, but does not seem directly applicable to the problem of general intelligence. Somehow, we must face up to the inevitable compromises that must be made in the quality of decisions that an intelligent agent can make. Descriptive theories of such compromises (for example, Herbert Simon’s work on satisficing) appeared soon after the development of formal theories of rationality. Normative theories of BOUNDED RATIONALITY address the question at the end of the preceding paragraph by examining what is achievable with fixed computational resources. One promising approach is to devote some of those resources to METAREASONING (see also METACOGNITION), that is, reasoning about what reasoning to do. The technique of EXPLANATION-BASED LEARNING (a formalization of the common psychological concept of chunking or knowledge compilation) helps an agent cope with complexity by caching efficient solutions to common problems.
Reinforcement learning methods enable an agent to learn effective (if not perfect) behaviors in complex environments without the need for extended problem-solving computations.

What is interesting about all these aspects of intelligence is that without the need for effective use of limited computational resources, they make no sense. That is, computational complexity may be responsible for many, perhaps most, of the aspects of cognition that make intelligence an interesting subject of study. In contrast, the cognitive structure of an infinitely powerful computational device could be very straightforward indeed.

See also BOUNDED RATIONALITY; EXPLANATION-BASED LEARNING; METACOGNITION; METAREASONING

14 Additional Sources

Early AI work is covered in Feigenbaum and Feldman’s (1963) Computers and Thought, Minsky’s (1968) Semantic Information Processing, and the Machine Intelligence series edited by Donald Michie. A large number of influential papers are collected in Readings in Artificial Intelligence (Webber and Nilsson 1981). Early papers on neural networks are collected in Neurocomputing (Anderson and Rosenfeld 1988). The Encyclopedia of AI (Shapiro 1992) contains survey articles on almost every topic in AI. The four-volume Handbook of Artificial Intelligence (Barr and Feigenbaum 1981) contains descriptions of almost every major AI system published before 1981. Standard texts on AI include Artificial Intelligence: A Modern Approach (Russell and Norvig 1995) and Artificial Intelligence: A New Synthesis (Nilsson 1998). Historical surveys include Kurzweil (1990) and Crevier (1993).

The most recent work appears in the proceedings of the major AI conferences: the biennial International Joint Conference on AI (IJCAI); the annual National Conference on AI, more often known as AAAI after its sponsoring organization; and the European Conference on AI (ECAI).
The major journals for general AI are Artificial Intelligence, Computational Intelligence, the IEEE Transactions on Pattern Analysis and Machine Intelligence, and the electronic Journal of Artificial Intelligence Research. There are also many journals devoted to specific areas, some of which are listed in the relevant articles. The main professional societies for AI are the American Association for Artificial Intelligence (AAAI), the ACM Special Interest Group in Artificial Intelligence (SIGART), and the Society for Artificial Intelligence and Simulation of Behaviour (AISB). AAAI’s AI Magazine and the SIGART Bulletin contain many topical and tutorial articles as well as announcements of conferences and workshops.

References

Anderson, J. A., and E. Rosenfeld, Eds. (1988). Neurocomputing: Foundations of Research. Cambridge, MA: MIT Press.
Barr, A., P. R. Cohen, and E. A. Feigenbaum, Eds. (1989). The Handbook of Artificial Intelligence, vol. 4. Reading, MA: Addison-Wesley.
Barr, A., and E. A. Feigenbaum, Eds. (1981). The Handbook of Artificial Intelligence, vol. 1. Stanford and Los Altos, CA: HeurisTech Press and Kaufmann.
Barr, A., and E. A. Feigenbaum, Eds. (1982). The Handbook of Artificial Intelligence, vol. 2. Stanford and Los Altos, CA: HeurisTech Press and Kaufmann.
Boden, M. A. (1977). Artificial Intelligence and Natural Man. New York: Basic Books.
Brooks, R. A. (1991). Intelligence without representation. Artificial Intelligence 47(1–3): 139–159.
Chomsky, N. (1957). Syntactic Structures. The Hague: Mouton.
Cohen, P. R., and E. A. Feigenbaum, Eds. (1982). The Handbook of Artificial Intelligence, vol. 3. Stanford and Los Altos, CA: HeurisTech Press and Kaufmann.
Crevier, D. (1993). AI: The Tumultuous History of the Search for Artificial Intelligence. New York: Basic Books.
Dreyfus, H. L. (1992). What Computers Still Can’t Do: A Critique of Artificial Reason. Cambridge, MA: MIT Press.
Feigenbaum, E. A., and J. Feldman, Eds. (1963). Computers and Thought. New York: McGraw-Hill.
Fikes, R. E., and N. J. Nilsson. (1971). STRIPS: A new approach to the application of theorem proving to problem solving. Artificial Intelligence 2(3–4): 189–208.
Haugeland, J., Ed. (1985). Artificial Intelligence: The Very Idea. Cambridge, MA: MIT Press.
Kurzweil, R. (1990). The Age of Intelligent Machines. Cambridge, MA: MIT Press.
McCarthy, J. (1958). Programs with common sense. Proceedings of the Symposium on Mechanisation of Thought Processes, vol. 1. London: Her Majesty’s Stationery Office, pp. 77–84.
McCarthy, J. (1963). Situations, actions, and causal laws. Memo 2. Stanford, CA: Stanford University Artificial Intelligence Project.
McCulloch, W. S., and W. Pitts. (1943). A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics 5: 115–137.
Minsky, M. L., Ed. (1968). Semantic Information Processing. Cambridge, MA: MIT Press.
Minsky, M. L. (1986). The Society of Mind. New York: Simon & Schuster.
Newell, A. (1990). Unified Theories of Cognition. Cambridge, MA: Harvard University Press.
Newell, A., and H. A. Simon. (1972). Human Problem Solving. Englewood Cliffs, NJ: Prentice-Hall.
Nilsson, N. J. (1998). Artificial Intelligence: A New Synthesis. San Mateo, CA: Morgan Kaufmann.
Pearl, J. (1988). Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. San Mateo, CA: Morgan Kaufmann.
Poole, D., A. Mackworth, and R. Goebel. (1997). Computational Intelligence: A Logical Approach. Oxford: Oxford University Press.
Raphael, B. (1976). The Thinking Computer: Mind Inside Matter. New York: W. H. Freeman.
Robinson, J. A. (1965). A machine-oriented logic based on the resolution principle. Journal of the Association for Computing Machinery 12: 23–41.
Russell, S. J., and P. Norvig. (1995). Artificial Intelligence: A Modern Approach. Englewood Cliffs, NJ: Prentice-Hall.
Samuel, A. L. (1959). Some studies in machine learning using the game of checkers. IBM Journal of Research and Development 3(3): 210–229.
Shapiro, S. C., Ed. (1992). Encyclopedia of Artificial Intelligence. 2nd ed. New York: Wiley.
Turing, A. M. (1950). Computing machinery and intelligence. Mind 59: 433–460.
Von Neumann, J., and O. Morgenstern. (1944). Theory of Games and Economic Behavior. 1st ed. Princeton, NJ: Princeton University Press.
Webber, B. L., and N. J. Nilsson, Eds. (1981). Readings in Artificial Intelligence. San Mateo, CA: Morgan Kaufmann.
Wellman, M. P. (1990). Fundamental concepts of qualitative probabilistic networks. Artificial Intelligence 44(3): 257–303.

Linguistics and Language
Gennaro Chierchia

1 Language and Cognition

Why is the study of language central to cognition? The answer lies in the key properties of language as they manifest themselves in the way speakers use it. The best way to get a sense of the centrality of language in understanding cognitive phenomena is through some examples. In the rest of this introduction I illustrate some features of language that display surprising regularities. Among the many ways in which an efficient communication code could be designed, natural languages seem to choose quite peculiar ones. The question is why. We consider some of the answers that modern linguistics gives to this question, which lead us into a scenic (if necessarily brief) tour of its main problematics. In particular, section 2 is devoted to language structure and its main articulations. Section 3 is devoted to language use, its interplay with language structure, and the various disciplines that deal with these matters. We then close, in section 4, with a few short remarks on the place of linguistics within cognitive science.

Languages are made of words. How many words do we know? This is something that can be estimated quite accurately (see Pinker 1994: 149 ff.). To set a base line, consider that Shakespeare in his works uses roughly 15,000 different words.
One would think that the vocabulary of, say, a high school student is considerably poorer. Instead, it turns out that a high school senior reliably understands roughly 45,000 words out of a lexicon of 88,500 unrelated words. It might be worth mentioning how one arrives at this estimate. One randomly samples the target corpus of words and performs simple comprehension tests on the sample. The results are then statistically projected to the whole corpus. Now, the size of the vocabulary of a high school senior entails that from when the child starts learning words at a few months of age until the age of eighteen, he or she must be learning roughly a word every hour and a half when awake. We are talking here of learning arbitrary associations of sound patterns with meanings. Compare this with the effort it takes to learn even a short poem by heart, or the names of a handful of basketball players. The contrast is striking. We get to understand 45,000 words with incomparably less effort, to the point of not even being aware of it. This makes no sense without the assumption that our mind must be especially equipped with something, a cognitive device of some sort, that makes us so successful at the task of learning words. This cognitive device must be quite specialized for such a task, as we are not as good at learning poems or the names of basketball players (cf. WORD MEANING, ACQUISITION OF).

The world of sounds that make up words is similarly complex. We all find the sounds of our native language easy to distinguish. For example, to a native English speaker the i-sounds in “leave” and “live” are clearly different. And unless that person is in especially unfavorable conditions, he or she will not take one for the other. To a native English speaker, the difficulty that an Italian learning English (as an adult) encounters in mastering such distinctions looks a bit mysterious.
Italians take revenge when English speakers try to learn the contrast between words like “fato” ‘fate’ vs. “fatto” ‘fact.’ The only difference between them is that the t-sound in fatto sounds to the Italian speaker slightly longer or tenser, a contrast that is difficult for a speaker of English to master. These observations are quite commonplace. The important point, however, is that a child exposed to the speech sounds of any language picks them up effortlessly. The clicks of Zulu (sounds similar to the “tsk-tsk” of disapproval) or the implosive sounds of Sindhi, spoken in India and Pakistan (sounds produced by sucking in air, rather than ejecting it; see ARTICULATION) are not harder for the child to acquire than the occlusives of English. Adults, in contrast, often fail to learn to produce sounds not in their native repertoire. Figuring out the banking laws or the foods of a different culture is generally much easier. One would like to understand why.

Behind its everyday, almost quaint appearance, language seems to host many remarkable regularities of the sort just illustrated. Here is yet another example taken from a different domain, that of pronouns and ANAPHORA. Consider the following sentence:

(1) John promised Bill to wash him.

Any native speaker of English will agree that the pronoun “him” in (1) can refer to “Bill” (the object; see GRAMMATICAL RELATIONS), but there is no way it can refer to “John” (the subject). If we want a pronoun that refers to “John” in a sentence like (1), we have to use a reflexive:

(2) John promised Bill to wash himself.

The reflexive “himself” in (2) refers to “John.” It cannot refer to “Bill.” Compare now (1) with (3):

(3) John persuaded Bill to wash him.

Here “him” can refer to the subject, but not to the object. If we want a pronoun to refer to “Bill” we have to use

(4) John persuaded Bill to wash himself.

The reflexive “himself” in (4) must refer to the object. It cannot refer to the subject.
By comparing (1) and (2) with (3) and (4), we see that the way pronouns work with verbs like “promise” appears to be the opposite of verbs like “persuade.” Yet the structure of these sentences appears to be identical. There must be a form of specialized, unconscious knowledge we have that makes us say “Yes, ‘him’ can refer to the subject in (3) but not in (1).” It is a very peculiar intuition that we have grown to have.

What is common to these different aspects of language is the fact that our linguistic behavior reveals striking and complex regularities. This is true throughout the languages of the world. In fact the TYPOLOGY of the world's languages reveals significant universal tendencies. For example, the patterns of word order are quite limited. The most common basic orders of the major sentence constituents are subject-verb-object (abbreviated as SVO) and SOV. Patterns in which the object precedes the subject are quite rare. Another language universal one might mention is that all languages have ways of using clauses to modify nouns (as in “the boy that you just met,” where the relative clause “that you just met” modifies the noun “boy”). Now structural properties of this sort are not only common to all known spoken languages but in fact can be found even in SIGN LANGUAGES, that is, visual-gestural languages typically in use in populations with impaired verbal abilities (e.g., the deaf). It seems plausible to maintain that universal tendencies in language are grounded in the way we are; this must be so, for speaking is a cognitive capacity, that capacity in virtue of which we say that we “know” our native language. We exercise such capacity in using language. A term often used in this connection is “linguistic competence.” The way we put such competence to use in interacting with our environment and with each other is called “performance.”

The necessity to hypothesize a linguistic competence can be seen also from another point of view.
Language is a dynamic phenomenon, dynamic in many senses. It changes across time and space (cf. LANGUAGE VARIATION AND CHANGE). It varies along social and gender dimensions (cf. LANGUAGE AND GENDER; LANGUAGE AND CULTURE). It also varies in sometimes seemingly idiosyncratic ways from speaker to speaker. Another important aspect of the dynamic character of language is the fact that a speaker can produce and understand an indefinite number of sentences, while having finite cognitive resources (memory, attention span, etc.). How is this possible? We must assume that this happens by analogy with the way we, say, add two numbers we have never added before. We can do it because we have mastered a combinatorial device, an ALGORITHM. But the algorithm for adding we have learned through explicit training. The one for speaking appears to grow spontaneously in the child. Such an algorithm is constitutive of our linguistic competence.

The fact that linguistic competence does not develop through explicit training can be construed as an argument in favor of viewing it as a part of our genetic endowment (cf. INNATENESS OF LANGUAGE). This becomes all the more plausible if one considers how specialized the knowledge of a language is and how quickly it develops in the child. In a way, the child should be in a situation analogous to that of somebody who is trying to crack the mysteries of an unknown communication code. Such a code could have in principle very different features from those of a human language. It might lack a distinction between subjects and objects. Or it might lack the one between nouns and verbs. Many languages of practical use (e.g., many programming languages) are designed just that way. The range of possible communication systems is huge and highly differentiated. This is part of the reason why cracking a secret code is very hard, as hard as learning an unfamiliar language as an adult.
Yet the child does it without effort and without formal training. This seems hard to make sense of without assuming that, in some way, the child knows what to look for and knows what properties of natural speech he or she should attend to in order to figure out its grammar. This argument, based on the observation that language learning constitutes a specialized skill acquired quickly through minimal input, is known as the POVERTY OF THE STIMULUS ARGUMENT. It suggests that linguistic competence is a relatively autonomous computational device that is part of the biological endowment of humans and guides them through the acquisition of language. This is one of the planks of what has come to be known as GENERATIVE GRAMMAR, a research program started in the late 1950s by Noam Chomsky, which has proven to be quite successful and influential.

It might be useful to contrast this view with another one that a priori might be regarded as equally plausible (see CONNECTIONIST APPROACHES TO LANGUAGE). Humans seem to be endowed with a powerful all-purpose computational device that is very good at extracting regularities from the environment. Given that, one might hypothesize that language is learned the way we learn any kind of algorithm: through trial and error. All that language learning amounts to is simply applying our high-level computational apparatus to linguistic input. According to this view, the child acquires language much as she learns, say, to do division, the main difference being in the nature of the input. Learning division is of course riddled with all sorts of mistakes that the child goes through (typical ones involve keeping track of remainders, misprocessing partial results, etc.). Consider, in this connection, the pattern of pronominalization in sentences (1) through (4). If we learn languages the way we learn division, the child ought to make mistakes in figuring out what can act as the antecedent of a reflexive and what cannot.
In recent years there has been extensive empirical investigation of the behavior of pronominal elements in child language (see BINDING THEORY; SYNTAX, ACQUISITION OF; SEMANTICS, ACQUISITION OF). And this was not what was found. The evidence goes in the opposite direction. As soon as reflexives and nonreflexive pronouns make their appearance in the child’s speech, they appear to be used in an adult-like manner (cf. Crain and McKee 1985; Chien and Wexler 1990; Grodzinsky and Reinhart 1993).

Many of the ideas we find in generative grammar have antecedents throughout the history of thought (cf. LINGUISTICS, PHILOSOPHICAL ISSUES). One finds important debates on the “conventional” versus “natural” origins of language already among the presocratic philosophers. And many ancient grammarians came up with quite sophisticated analyses of key phenomena. For example, the Indian grammarian Panini (fourth to third century B.C.) proposed an analysis of argument structure in terms of THEMATIC ROLES (like agent, patient, etc.), quite close in spirit to current proposals. The scientific study of language received a great impetus in the nineteenth century, when the historical links among the languages of the Indo-European family, at least in their general setup, were unraveled. A further fundamental development in our century was the structuralist approach, that is, the attempt to characterize in explicit terms language structure as it manifests itself in sound patterns and in distributional patterns. The structuralist movement started out in Europe, thanks to F. DE SAUSSURE and the Prague School (which included among its protagonists N. Trubeckoj and R. JAKOBSON) and developed, then, in somewhat different forms, in the United States through the work of L. BLOOMFIELD, E. SAPIR, Z. Harris (who was Chomsky’s teacher), and others.
Structuralism, besides leaving us with an accurate description of many important linguistic phenomena, constituted the breeding ground for a host of concepts (like “morpheme,” “phoneme,” etc.) that have been taken up and developed further within the generative tradition. It is against this general background that recent developments should be assessed.

See also ALGORITHM; ANAPHORA; ARTICULATION; BINDING THEORY; BLOOMFIELD, LEONARD; CONNECTIONIST APPROACHES TO LANGUAGE; GENERATIVE GRAMMAR; GRAMMATICAL RELATIONS; INNATENESS OF LANGUAGE; JAKOBSON, ROMAN; LANGUAGE AND CULTURE; LANGUAGE AND GENDER; LANGUAGE VARIATION AND CHANGE; LINGUISTICS, PHILOSOPHICAL ISSUES; POVERTY OF THE STIMULUS ARGUMENTS; SAPIR, EDWARD; SAUSSURE, FERDINAND DE; SEMANTICS, ACQUISITION OF; SIGN LANGUAGES; SYNTAX, ACQUISITION OF; THEMATIC ROLES; TYPOLOGY; WORD MEANING, ACQUISITION OF

2 Language Structure

Our linguistic competence is made up of several components (or “modules,” see MODULARITY AND LANGUAGE) that reflect the various facets of language, going from speech sounds to meaning. In this section we will review the main ones in a necessarily highly abbreviated form. Language can be thought of as a LEXICON and a combinatorial apparatus. The lexicon is constituted by the inventory of words (or morphemes) through which sentences and phrases are built up. The combinatorial apparatus is the set of rules and principles that enable us to put words together in well-formed strings, and to pronounce and interpret such strings. What we will see, as we go through the main branches of linguistics, is how the combinatorial machinery operates throughout the various components of grammar. Meanwhile, here is a rough road map of major modules that deal with language structure.
MORPHOLOGY   Word structure. Example: [[quick ADJ] ly AF] ADV
PHONOLOGY    Grammar of speech sounds. Example: bug + s ⇒ /bugz/
SYNTAX       Structure of phrases. Example: John saw Eve; *saw Eve John
SEMANTICS    Interpretation of linguistic signs. Example: John saw Bill ≈ Bill was seen by John
PHONETICS    Articulatory and acoustic properties of speech. Example: /kwIkli/

See also LEXICON; MODULARITY AND LANGUAGE

Words and Sounds

We already saw that the number of words we know is quite remarkable. But what do we mean by a “word”? Consider the verb “walk” and its past tense “walked.” Are these two different words? And how about “walk” versus “walker”? We can clearly detect some inner regular components to words like “walked,” namely the stem “walk” (which is identical to the infinitival form) and the ending “-ed,” which signals “past.” These components are called “morphemes”; they constitute the smallest elements with an identifiable meaning we can recognize in a word. The internal structure of words is the object of the branch of linguistics known as MORPHOLOGY. Just as sentences are formed by putting words together, so words themselves are formed by putting together morphemes. Within the word, that is, as well as between words, we see a combinatorial machinery at work. English has a fairly simple morphological structure. Languages like Chinese have even greater morphological simplicity, while languages like Turkish or Japanese have a very rich morphological structure. POLYSYNTHETIC LANGUAGES are perhaps the most extreme cases of morphological complexity. The following, for example, is a single word of Mohawk, a polysynthetic North American Indian language (Baker 1996: 22):

(5) ni-mic-tomi-maka
    first person-second person-money-give
    ‘I’ll give you the money.’

Another aspect of morphology is compounding, which enables one to form complex words by “glomming” them together. This strategy is quite productive in English, for example, blackboard, blackboard design, blackboard design school, and so on.
Compounds can be distinguished from phrases on the basis of a variety of converging criteria. For example, the main stress on compounds like “blackboard” is on “black,” while in the phrase “black board” it is on “board” (cf. STRESS, LINGUISTIC; METER AND POETRY). Moreover, syntax treats compounds as units that cannot be separated by syntactic rules. Through morphological derivation and compounding the structure of the lexicon becomes quite rich.

So what is a word? At one level, it is what is stored in our mental lexicon and has to be memorized as such (a listeme). This is the sense in which we know 45,000 (unrelated) words. At another, it is what enters as a unit into syntactic processes. In this second sense (but not in the first) “walk” and “walked” count as two words. Words are formed by composing together smaller meaningful units (the morphemes) through specific rules and principles.

Morphemes are, in turn, constituted by sound units. Actually, speech forms a continuum not immediately analyzable into discrete units. When exposed to an unfamiliar language, we cannot tell where, for example, the word boundaries are, and we have difficulty in identifying the sounds that are not in our native inventory. Yet speakers classify their speech sound stream into units, the phonemes. PHONETICS studies speech sounds from an acoustic and articulatory point of view. Among other things, it provides an alphabet to notate all of the sounds of the world’s languages. PHONOLOGY studies how the range of speech sounds is exploited by the grammars of different languages and the universal laws of the grammar of sounds. For example, we know from phonetics that back vowels (produced by lifting the rear of the tongue towards the palate) can be rounded (as in “hot”) or unrounded (as in “but”) and that this is so also for front vowels (produced by lifting the tongue toward the front of the vocal tract).
The i-sound in “feet” is a high, front, unrounded vowel; the sound of the corresponding German word “füsse” is also pronounced raising the tongue towards the front, but is rounded. If a language has rounded front vowels it also has rounded back vowels. To illustrate, Italian has back rounded vowels, but lacks altogether unrounded back vowels. English has both rounded and unrounded back vowels. Both English and Italian lack front rounded vowels. German and French, in contrast, have them. But there is no language that has in its sound inventory front rounded vowels without also having back rounded ones. This is the form that constraints on possible systems of phonemes often take.

As noted in section 1, the types of sounds one finds in the world’s languages appear to be very varied. Some languages may have relatively small sound inventories constituted by a dozen phonemes (as, for example, Polynesian); others have quite large ones with about 140 units (Khoisan). And there are of course intermediate cases. One of the most important linguistic discoveries of this century has been that all of the wide variety of phonemes we observe can be described in terms of a small universal set of DISTINCTIVE FEATURES (i.e., properties like “front,” “rounded,” “voiced,” etc.). For example, /p/ and /b/ (bilabial stops) have the same feature composition except for the fact that the former is voiceless (produced without vibration of the vocal cords) while the latter is voiced. By the same token, the phoneme /k/, as in “bake,” and the final sound of the German word “Bach” are alike, except in one feature. In the former the air flux is completely interrupted (the sound is a stop) by lifting the back of the tongue up to the rear of the palate, while in the latter a small passage is left which results in a turbulent continuous sound (a fricative, notated in the phonetic alphabet as /x/). So all phonemes can be analyzed as feature structures.
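The idea that phonemes are bundles of features can be sketched in a few lines of code. The example below is only an illustration: the three feature names and their values are a deliberately simplified, hypothetical inventory (real feature systems are richer), and the helper function is not drawn from any linguistic toolkit.

```python
# Simplified, hypothetical feature bundles for four phonemes; real feature
# systems use many more features than the three shown here.
PHONEMES = {
    "p": {"place": "bilabial", "continuant": False, "voiced": False},
    "b": {"place": "bilabial", "continuant": False, "voiced": True},
    "k": {"place": "velar", "continuant": False, "voiced": False},
    "x": {"place": "velar", "continuant": True, "voiced": False},
}

def differing_features(a, b):
    """Return the set of feature names on which phonemes a and b differ."""
    fa, fb = PHONEMES[a], PHONEMES[b]
    return {name for name in fa if fa[name] != fb[name]}

print(differing_features("p", "b"))  # {'voiced'}
print(differing_features("k", "x"))  # {'continuant'}
```

In this toy encoding, /p/ and /b/ differ only in voicing, and /k/ and /x/ differ only in whether the airflow is interrupted (stop) or continuous (fricative), mirroring the two comparisons made above.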
There is also evidence that features are not just a convenient way to classify phonemes but are actually part of the implicit knowledge that speakers have of their language. One famous experiment that provides evidence of this kind has to do with English plurals. In simplified terms, plurals are formed by adding a voiced alveolar fricative /z/ after a voiced sound (e.g., fad[z]) and its voiceless counterpart /s/ after a voiceless one (e.g., fat[s]). This is a form of assimilation, a very common phonological process (see PHONOLOGICAL RULES AND PROCESSES). If a monolingual English speaker is asked to form the plural of a word ending in a phoneme that is not part of his or her native inventory and has never been encountered before, that speaker will follow the rule just described; for example, the plural of the word “Bach” will be [baxs] not [baxz]. This means that in forming the plural speakers are actually accessing the featural makeup of the phonemes and analyzing phonemes into voiced versus voiceless sets. They have not just memorized after which sounds /s/ goes and after which /z/ goes (see Akmajian et al. 1990: chapter 3 and references therein).

Thus we see that even within sound units we find smaller elements, the distinctive features, combined according to certain principles. Features, organized in phonemes, are manipulated by rule systems. Phonemes are in turn structured into larger prosodic constituents (see PROSODY AND INTONATION), which constitute the domains over which stress and TONE are determined. On the whole we see that the world of speech sounds is extremely rich in structure and its study has reached a level of remarkable theoretical sophistication (for recent important developments, see OPTIMALITY THEORY).
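The simplified plural rule described above refers only to the voicing of the final sound, which is why it extends to a sound like /x/ that English speakers have never pluralized before. A minimal sketch follows; the small set of voiceless phonemes is an illustrative sample, every other sound is simply treated as voiced, and sibilant-final stems (as in “buses”) are left out, in keeping with the simplification in the text.

```python
# Simplified voicing-assimilation rule for English plurals, as described in
# the text. VOICELESS is a small illustrative sample; any phoneme not listed
# is treated as voiced, and sibilant-final stems are deliberately ignored.
VOICELESS = {"p", "t", "k", "f", "x"}

def plural_suffix(final_phoneme):
    """Pick /s/ after a voiceless final phoneme, /z/ after a voiced one."""
    return "s" if final_phoneme in VOICELESS else "z"

def pluralize(phonemes):
    """Append the plural suffix to a word given as a list of phonemes."""
    return phonemes + [plural_suffix(phonemes[-1])]

print(pluralize(["f", "a", "d"]))  # ['f', 'a', 'd', 'z']
print(pluralize(["f", "a", "t"]))  # ['f', 'a', 't', 's']
print(pluralize(["b", "a", "x"]))  # ['b', 'a', 'x', 's']
```

Because the rule consults a feature (voicing) rather than a memorized list of English sounds, adding /x/ to the voiceless class is all that is needed to obtain [baxs].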
See also DISTINCTIVE FEATURES; METER AND POETRY; MORPHOLOGY; OPTIMALITY THEORY; PHONETICS; PHONOLOGICAL RULES AND PROCESSES; PHONOLOGY; POLYSYNTHETIC LANGUAGES; PROSODY AND INTONATION; STRESS, LINGUISTIC; TONE

Phrases

The area where we perhaps most clearly see the power of the combinatorial machinery that operates in language is SYNTAX, the study of how words are composed into phrases. In constructing sentences, we don't merely put words into certain sequences, we actually build up a structure. Here is a simple illustration. English is an SVO language, whose basic word order in simple sentences is the one in (6a).

(6) a. Kim saw Lee
    b. *saw Lee Kim
    b'. Ha visto Lee Kim (Italian)
    c. *Kim Lee saw
    c'. Kim-ga Lee-o mita (Japanese)

Alternative orders, such as those in (6b–c), are ungrammatical in English. They are grammatical in other languages; thus (6b'), the word-by-word Italian translation of (6b), is grammatical in Italian; and so is (6c'), the Japanese translation of (6c). A priori, the words in (6a) could be put together in a number of different ways, which can be represented by the following tree structures, given here in bracketed form:

(7) a. [S [N Kim] [V saw] [N Lee]]
    b. [S [X [N Kim] [V saw]] [N Lee]]
    c. [S [N Kim] [X [V saw] [N Lee]]]
    where: S = sentence, N = noun

The structure in (7a) simply says that “Kim,” “Lee,” and “saw” are put together all at once and that one cannot recognize any subunit within the clause. Structure (7b) says that there is a subunit within the clause constituted by the subject plus the verb; (7c) that the phrasing actually puts together the verb plus the object. The right analysis for English turns out to be (7c), where the verb and the object form a unit, a constituent called the verb phrase (VP), whose “center,” or, in technical terms, whose “head” is the verb. Interestingly, such an analysis turns out to be right also for Japanese and Italian, and, it seems, universally. In all languages, the verb and the object form a unit.
There are various ways of seeing that it must be so. A simple one is the following: languages have proforms, that is, elements that lack an inherent meaning and get their semantic value from a linguistic antecedent (or, in some cases, the extralinguistic context). Personal pronouns like “he” or “him” are a typical example:

(8) A tall boy came in. Paul greeted him warmly.

Here the antecedent of “him” is most naturally construed as “a tall boy.” “Him” is a noun phrase (NP), that is, it has the same behavior as things like “Kim” or “a tall boy,” which can act as its antecedent. Now English, like many other languages, also has proforms that clearly stand for V+object sequences:

(9) Kim saw Lee. Mary swears that Paul did too.

“Did” in (9) is understood as “saw Lee.” This means that the antecedent of “did” in (9) is the verb+object sequence of the previous sentence. This makes sense if we assume that such sequences form a unit, the VP (just like “a tall boy” forms an NP).

Notice that English does not have a proform that stands for the subject plus a transitive verb. There is no construction of the following sort:

(10) Kim saw Lee. Mary swears that PROed John too.
     [meaning: “Mary swears that Kim saw John too”]

The hypothetical element “PROed” would be an overt morpheme standing for a subject+transitive verb sequence. From a logical point of view, a verb+subject proform doesn't look any more complex than a verb+object proform. From a practical point of view, such a proform could be as useful and as effective for communication as the proform “did.” Yet there is nothing like “PROed,” and not just in English. In no known language does such a proform appear. This makes sense if we assume that proforms must be constituents of some kind and that verb + object (in whatever order they come) forms a constituent. If, instead, the structure of the clause were (7a) there would be no reason to expect such asymmetry.
And if the structure were (7b), we would expect proforms such as “PROed” to be attested.

A particularly interesting case is constituted by VSO languages, such as Irish, Breton, and many African languages. Here is an Irish example (Chung and McCloskey 1987: 218):

(11) Ni olan se bainne ariamh
     Neg drink-PRES. he milk ever
     He never drinks milk.

In this type of language the V surfaces next to the subject, separated from the object. If simple linear adjacency is what counts, one might well expect to find in some language of this form a verbal proform that stands for the verb plus the subject. Yet no VSO language has such a proform. This peculiar insistence on banning a potentially useful item even where one would expect it to be readily available can be understood if we assume that VSO structures are obtained by moving the verbal head out of a canonical VP, as indicated in the following structure (given in labeled-bracket form, with “t” marking the original position of the verb):

(12) [S [V olan] [S [NP se] [VP [V t] [NP bainne]]]]

The process through which (11) is derived is called HEAD MOVEMENT and is analogous to what one observes in English alternations of the following kind:

(13) a. Kim has seen Lee.
     b. Has Kim seen Lee?

In English, yes-no questions are formed by fronting the auxiliary. The process that applies in English to questions applies in Irish more generally, and is what yields the main difference in basic word order between these languages (see Chung and McCloskey 1987 for evidence and references).

Summing up, there is evidence that in sentences like (6a) the verb and the object are tied together by an invisible knot. This abstract structure in constituents manifests itself in a number of phenomena, of which we have discussed one: the existence of VP proforms, in contrast with the absence of subject+verb proforms. The latter appears to be a universal property of languages and constitutes evidence in favor of the universality of the VP.
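As a rough illustration of the head movement analysis of (11) and (12), the sketch below fronts the verb of a subject-VP clause and leaves a trace behind. It is a deliberately simplified toy: the negation and the adverb of (11) are omitted, and the tuple encoding and the names `move_head`, `leaves`, and `pronounce` are assumptions made for this example only.

```python
def leaves(tree):
    """Return the terminal elements of a nested-tuple tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    out = []
    for child in tree[1:]:
        out.extend(leaves(child))
    return out


def pronounce(tree):
    """Spell out a tree as a word string, skipping unpronounced traces ('t')."""
    return " ".join(word for word in leaves(tree) if word != "t")


def move_head(clause):
    """Front the verbal head of the VP, leaving a trace 't' in its base position.

    Expects a clause of the shape (S, subject, (VP, verb, object)).
    """
    label, subject, vp = clause
    vp_label, verb, obj = vp
    trace = (verb[0], "t")
    return (label, verb, (label, subject, (vp_label, trace, obj)))


# Underlying structure assumed for the core of (11): subject + [VP verb object].
BASE = ("S", ("NP", "se"), ("VP", ("V", "olan"), ("NP", "bainne")))

print(pronounce(BASE))             # se olan bainne
print(pronounce(move_head(BASE)))  # olan se bainne
```

The point of the sketch is that the derived structure still contains a VP node grouping the (trace of the) verb with the object, while no node groups the fronted verb with the subject alone.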
Along the way, we have also seen how languages can vary and what mechanisms can be responsible for such variations (cf. X-BAR THEORY). Generally speaking, words are put together into larger phrases by a computational device that builds up structures on the basis of relatively simple principles (like: “put a head next to its complement” or “move a head to the front of the clause”). Aspects of this computational device are universal and are responsible for the general architecture that all languages share; others can vary (in a limited way) and are responsible for the final form of particular languages.

There is converging evidence that confirms the psychological reality of constituent structure, that is, the idea that speakers unconsciously assign a structure in constituents to sequences of words. A famous case that shows this is a series of experiments known as the “click” experiments (cf. Fodor, Bever, and Garrett 1974). In these experiments, subjects were presented with a sentence through headphones. At some stage during this process a click sound was produced in the headphones and subjects were then asked at which point of the presentation the click occurred. If the click occurred at major constituent breaks (such as the one between the subject and the VP) the subjects were accurate in recalling when it occurred. If, however, the click occurred within a constituent, subjects would make systematic mistakes in recalling the event. They would overwhelmingly displace the click to the closest constituent break. This behavior would be hard to explain if constituent structure were not actually computed by subjects in processing a sentence (see Clark and Clark 1977 for further discussion).

Thus, looking at the syntax of languages we discover a rich structure that reveals fundamental properties of the computational device that the speaker must be endowed with in order to be able to speak (and understand).
There are significant disagreements as to the specifics of how these computational devices are structured. Some frameworks for syntactic analysis (e.g., CATEGORIAL GRAMMAR; HEAD-DRIVEN PHRASE STRUCTURE GRAMMAR; LEXICAL FUNCTIONAL GRAMMAR) emphasize the role of the lexicon in driving syntactic computations. Others, like MINIMALISM, put their emphasis on the economical design of the principles governing how sentences are built up (see also OPTIMALITY THEORY). Other kinds of disagreement concern the choice of primitives (e.g., RELATIONAL GRAMMAR and COGNITIVE LINGUISTICS).

In spite of the liveliness of the debate and of the range of controversy, most, maybe all, of these frameworks share a great deal. For one thing, key empirical generalizations and discoveries can be translated from one framework to the next. For example, all frameworks encode a notion of constituency and ways of fleshing out the notion of “relation at a distance” (such as the one we have described above as head movement). All frameworks assign to grammar a universal structural core and dimensions along which particular languages may vary. Finally, all major modern frameworks share certain basic methodological tenets of formal explicitness, aimed at providing mathematical models of grammar (cf. FORMAL GRAMMARS).

See also CATEGORIAL GRAMMAR; COGNITIVE LINGUISTICS; FORMAL GRAMMARS; HEAD-DRIVEN PHRASE STRUCTURE GRAMMAR; HEAD MOVEMENT; LEXICAL FUNCTIONAL GRAMMAR; MINIMALISM; RELATIONAL GRAMMAR; SYNTAX; X-BAR THEORY

Interfaces

Syntax interacts directly with all other major components of grammar. First, it draws from the lexicon the words to be put into phrases. The lexical properties of words (e.g., whether they are verbs or nouns, whether and how many complements they need, etc.) will affect the kind of syntactic structures that a particular selection of words can enter into.
For example, a sentence like “John cries Bill” is ungrammatical because “cry” is intransitive and takes no complement. Second, syntax feeds into phonology. At some point of the syntactic derivation we get the words in the order that we want to pronounce them. And third, syntax provides the input to semantic interpretation.

To illustrate these interfaces further, consider the following set of sentences:

(14) a. John ignores that Mary saw who.
     b. John ignores who Mary saw t.
     c. who does John ignore that Mary saw t.

Here we have three kinds of interrogative structures. Sentence (14a) is not acceptable as a genuine question. It is only acceptable as an “echo” question, for example in reaction to an utterance of the form “John ignores that Mary saw so and so” where we do not understand who “so and so” is. Sentence (14b) contains an embedded question. In it, the wh-pronoun appears in place of the complementizer “that”; in other terms, in (14b), the pronoun “who” has been dislocated to the beginning of the embedded clause and “t” marks the site that it was moved from. Finally, sentence (14c), with the wh-pronoun moved to the beginning, constitutes a canonical matrix question (see WH-MOVEMENT). Now, the interpretations of (14b) and (14c) can be given roughly as follows:

(15) a. John ignores (the answer to the question) for which x Mary saw x.
     b. (tell me the answer to the question) for which x John ignores that Mary saw x.

The interpretations in (15) are quite close in form to the overt structures of (14b) and (14c) respectively, while the “echo” question (14a) is interpreted roughly as (15b), modulo the special contexts to which it is limited. Thus it seems that the structure of English (non-echo) questions reflects quite closely its interpretation. Wh-pronouns are interpreted as question-forming operators. To make sense of such operators we need to know their scope (i.e., what is being asked).
English marks the scope of wh-operators by putting them at the beginning of the clause on which they operate: the embedded clause in (14b), the matrix one in (14c). Now, it is quite telling to compare this with what happens in other languages. A particularly interesting case is that of Chinese (see, in particular, Huang 1982; Cheng 1991) where there is no visible wh-movement. Chinese only has the equivalent of (14a) (Huang 1992).

(16) Zhangsan xian-zhidao [Lisi kanjian shei]
     Zhangsan ignores [Lisi see who]

Sentence (16) in Chinese is ambiguous. It can be interpreted either as (15a) or as (15b). One way of making sense of this situation is along the following lines. Wh-pronouns must be assigned scope to be interpreted. One of the strategies that grammar makes available is placing the wh-pronoun at the beginning of the clause on which it operates. English uses such a strategy overtly. First the wh-word is fronted, then the result is fed to phonology (and hence pronounced) and to semantics (and hence interpreted). In Chinese, instead, one feeds to phonology the base structure (16); then wh-movement applies, as a step toward the computation of meaning. This gives rise to two abstract structures corresponding to (14b) and (14c) respectively:

(17) a. Zhangsan xian-zhidao shei [Lisi kanjian t]
     b. shei Zhangsan xian-zhidao [Lisi kanjian t]

The structures in (17) are what is fed to semantic interpretation. The process just sketched can be schematized as follows:

(18) English: John ignores [Mary saw who] → John ignores who [Mary saw t] → Phonology, Semantics
     Chinese: Zhangsan ignores [Lisi saw who] → Phonology
              Zhangsan ignores [Lisi saw who] → Zhangsan ignores who [Lisi saw t] → Semantics

In rough terms, in Chinese one utters the sentence in its basic form (which is semantically ambiguous; see AMBIGUITY), then one does scoping mentally. In English, one first applies scoping (i.e., one marks what is being asked), then utters the result.
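The schema in (18) can be mimicked by a toy derivation in which wh-fronting always applies, and the only choice is whether phonology receives the structure before or after it. This is a sketch under strong simplifying assumptions: sentences are flat word lists, English glosses stand in for the Chinese words, do-support in matrix questions is ignored, and the function names (`base_form`, `front_wh`, `derive`) are invented for the illustration.

```python
def base_form(matrix, embedded):
    """The unmoved structure: matrix words followed by the bracketed embedded clause."""
    return matrix + ["["] + embedded + ["]"]


def front_wh(matrix, embedded, scope):
    """Place 'who' at the edge of the clause it scopes over, leaving a trace 't'."""
    gapped = ["t" if word == "who" else word for word in embedded]
    if scope == "embedded":
        return matrix + ["who", "["] + gapped + ["]"]
    if scope == "matrix":
        return ["who"] + matrix + ["["] + gapped + ["]"]
    raise ValueError("scope must be 'embedded' or 'matrix'")


def derive(matrix, embedded, scope, overt):
    """Return (structure sent to phonology, structure sent to semantics).

    overt=True  : movement applies before pronunciation (the English pattern).
    overt=False : the base form is pronounced, movement applies afterwards
                  (the pattern assumed in the text for Chinese).
    """
    logical_form = front_wh(matrix, embedded, scope)
    pronounced = logical_form if overt else base_form(matrix, embedded)
    return " ".join(pronounced), " ".join(logical_form)


# English-style derivation: what is pronounced already shows the scope.
print(derive(["John", "ignores"], ["Mary", "saw", "who"], "embedded", overt=True))

# Chinese-style derivation: one pronounced string, two possible logical forms.
for scope in ("embedded", "matrix"):
    print(derive(["Zhangsan", "ignores"], ["Lisi", "saw", "who"], scope, overt=False))
```

With covert movement both scope choices yield the same pronounced string, which is one way of picturing why a sentence like (16) is ambiguous while its overtly marked English counterparts are not.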
Viewing wh-scope as marked by overt or covert movement enables us to see question formation in languages as diverse as English and Chinese in terms of a uniform mechanism. The only difference lies in the level at which scoping applies. Scope marking takes place overtly in English (i.e., before the chosen sequence of words is pronounced). In Chinese, by contrast, it takes place covertly (i.e., after having pronounced the base form). This is why sentence (16) is ambiguous in Chinese.

There are other elements that need to be assigned a scope in order to be interpreted. A prime case is constituted by quantified NPs like “a student” or “every advisor” (see QUANTIFIERS). Consider (19):

(19) Kim introduced a new student to every advisor.

This sentence has roughly the following two interpretations:

(20) a. There is a student such that Kim introduced him to every advisor.
     b. Every advisor is such that Kim introduced a (possibly different) student to him.

With the help of variables, these interpretations can also be expressed as follows:

(21) a. There is some new student y such that for every advisor x, Kim introduced y to x.
     b. For every advisor x, there is some new student y such that Kim introduced y to x.

Now we have just seen that natural language marks scope in questions by overt or covert movement. If we assume that this is the strategy generally made available to us by grammar, then we are led to conclude that also in cases like (19) scope must be marked via movement. That is, in order to interpret (19), we must determine the scope of the quantifiers by putting them at the beginning of the clause they operate on. For (19), this can be done in two ways:

(22) a. [a new student_i every advisor_j [Kim introduced t_i to t_j]]
     b. [every advisor_j a new student_i [Kim introduced t_i to t_j]]

Both (22a) and (22b) are obtained out of (19). In (22a) we move “a new student” over “every advisor.” In (22b) we do the opposite.
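The two readings in (21) can be checked against a small invented situation. In the sketch below the sets `STUDENTS` and `ADVISORS`, the relation `INTRODUCED`, and the function names are all hypothetical; the two functions differ only in the order in which the existential and the universal quantifier are nested, which is the difference between the two scope orders in (22).

```python
def reading_21a(students, advisors, introduced):
    """(21a): some new student y is such that, for every advisor x, Kim introduced y to x."""
    return any(all((y, x) in introduced for x in advisors) for y in students)


def reading_21b(students, advisors, introduced):
    """(21b): for every advisor x, there is some new student y such that Kim introduced y to x."""
    return all(any((y, x) in introduced for y in students) for x in advisors)


# A made-up situation: each advisor met a different new student.
STUDENTS = {"s1", "s2"}
ADVISORS = {"a1", "a2"}
INTRODUCED = {("s1", "a1"), ("s2", "a2")}  # pairs (student, advisor)

print(reading_21a(STUDENTS, ADVISORS, INTRODUCED))  # False
print(reading_21b(STUDENTS, ADVISORS, INTRODUCED))  # True

# A second situation: one and the same student was introduced to both advisors.
SAME_STUDENT = {("s1", "a1"), ("s1", "a2")}
print(reading_21a(STUDENTS, ADVISORS, SAME_STUDENT))  # True
print(reading_21b(STUDENTS, ADVISORS, SAME_STUDENT))  # True
```

The first situation separates the two readings: it verifies (21b) but not (21a). The second verifies both, as expected, since any situation that makes (21a) true also makes (21b) true.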
The structures in (22a) and (22b) correspond to the interpretations in (21a) and (21b), respectively. In a more standard logical notation, they would be expressed as follows:

(23) a. [∃x_i: x_i a new student] [∀x_j: x_j an advisor] [Kim introduces x_i to x_j]
     b. [∀x_j: x_j an advisor] [∃x_i: x_i a new student] [Kim introduces x_i to x_j]

So in the interpretation of sentences with quantified NPs, we apply scoping to such NPs. Scoping of quantifiers in English is a covert movement, part of the mental computation of MEANING, much like scoping of wh-words in Chinese. The result of scoping (i.e., the structures in [22], which are isomorphic to [23]) is what gets semantically interpreted and is called LOGICAL FORM.

What I have just sketched in very rough terms constitutes one of several views currently being pursued. Much work has been devoted to the study of scope phenomena, in several frameworks. Such study has led to a considerable body of novel empirical generalizations. Some important principles that govern the behavior of scope in natural language have been identified (though we are far from a definitive understanding). Phenomena related to scope play an important role at the SYNTAX-SEMANTICS INTERFACE. In particular, according to the hypothesis sketched previously, surface syntactic representations are mapped onto an abstract syntactic structure as a first step toward being interpreted. Such an abstract structure, logical form, provides an explicit representation of scope, anaphoric links, and the relevant lexical information. These are all key factors in determining meaning. The hypothesis of a logical form onto which syntactic structure is mapped fits well with the idea that we are endowed with a LANGUAGE OF THOUGHT, as our main medium for storing and retrieving information, reasoning, and so on. The reason why this is so is fairly apparent.
Empirical features of languages lead linguists to detect the existence of a covert level of representation with the properties that the proponents of the language of thought hypothesis have argued for on the basis of independent considerations. It is highly tempting to speculate that logical form actually is the language of thought. This idea needs, of course, to be fleshed out much more. I put it forth here in this “naive” form as an illustration of the potential of interaction between linguistics and other disciplines that deal with cognition.

See also AMBIGUITY; LANGUAGE OF THOUGHT; LOGICAL FORM, ORIGINS OF; LOGICAL FORM IN LINGUISTICS; MEANING; QUANTIFIERS; SYNTAX-SEMANTICS INTERFACE; WH-MOVEMENT

Meaning

What is meaning? What is it to interpret a symbolic structure of some kind? This is one of the hardest questions across the whole history of thought and lies right at the center of the study of cognition. The particular form it takes within the picture we have so far is: How is logical form interpreted? A consideration that constrains the range of possible answers to these questions is that our knowledge of meaning enables us to interpret an indefinite number of sentences, including ones we have never encountered before. To explain this we must assume, it seems, that the interpretation procedure is compositional (see COMPOSITIONALITY). Given the syntactic structure to be interpreted, we start out by retrieving the meaning of words (or morphemes). Because the core of the lexicon is finite, we can memorize and store the meaning of the lexical entries. Then each mode of composing words together into phrases (i.e., each configuration in a syntactic analysis tree) corresponds to a mode of composing meanings. Thus, cycling through syntactic structure we arrive eventually at the meaning of the sentence.
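A minimal sketch of such a compositional procedure is given below. It assumes, purely for illustration, that names denote individuals, that a transitive verb denotes a function from an object to a property of subjects, and that every branching node is interpreted by applying one daughter's meaning to the other's. The tiny lexicon, the relation `SAW`, and the function `interpret` are all invented for the example.

```python
# A made-up situation: the only seeing event is Kim seeing Lee.
SAW = {("Kim", "Lee")}  # pairs (seer, seen)

# Finite lexicon: each word is stored with its meaning.
LEXICON = {
    "Kim": "Kim",  # a name denotes an individual
    "Lee": "Lee",
    # a transitive verb takes its object first, then its subject
    "saw": lambda obj: lambda subj: (subj, obj) in SAW,
}


def interpret(tree):
    """Compute the meaning of a nested-tuple tree from the meanings of its parts."""
    if isinstance(tree, str):
        return LEXICON[tree]  # lexical lookup
    label, *children = tree
    values = [interpret(child) for child in children]
    if len(values) == 1:
        return values[0]  # non-branching node: pass the meaning up
    left, right = values  # branching node: function application
    return left(right) if callable(left) else right(left)


KIM_SAW_LEE = ("S", ("NP", "Kim"), ("VP", ("V", "saw"), ("NP", "Lee")))
LEE_SAW_KIM = ("S", ("NP", "Lee"), ("VP", ("V", "saw"), ("NP", "Kim")))

print(interpret(KIM_SAW_LEE))  # True
print(interpret(LEE_SAW_KIM))  # False
```

The same three lexical entries and the same single composition rule yield different truth values for the two sentences, because the meanings are combined in the order dictated by the syntactic structure.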
In general, meanings of complex structures are composed by putting together word (or morpheme) meanings through a finite set of semantic operations that are systematically linked to syntactic configurations. This accounts, in principle, for our capacity to understand a potential infinity of sentences, in spite of the limits of our cognitive capacities.

Figuring out what operations we use for putting together word meanings is one of the main tasks of SEMANTICS. To address it, one must say what the output of such operations is. For example, what is it that we get when we compose the meaning of the NP “Pavarotti” with the meaning of the VP “sings ‘La Boheme’ well”? More generally, what is the meaning of complex phrases and, in particular, what is the meaning of clauses? Although there is disagreement here (as on other important topics) on the ultimate correct answer, there is agreement on what it is that such an answer must afford us. In particular, to have the information that Pavarotti sings “La Boheme” well is to have also the following kind of information:

(24) a. Someone sings “La Boheme” well.
     b. Not everyone sings “La Boheme” poorly.
     c. It is not the case that nobody sings “La Boheme” well.

Barring performance errors or specific pathologies, we do not expect to find a competent speaker of English who sincerely affirms that Pavarotti sings “La Boheme” well and simultaneously denies that someone does (or denies any of the sentences in [24]). So sentence meaning must be something in virtue of which we can compute how the information associated with the sentence in question is related to the information of other sentences. Our knowledge of sentence meaning enables us to place sentences within a complex network of semantic relationships with other sentences.

The relation between a sentence like “Pavarotti sings well” and “someone sings well” (or any of the sentences in [24]) is called “entailment.”
The standard definition of entailment involves the concept of truth: A sentence A entails a sentence B if and only if whenever A is true, then B must also be true. This means that if we understand under what conditions a sentence is true, we also understand what its entailments are. Considerations such as these have led to a program of semantic analysis based on truth conditions. The task of the semantic component of grammar is viewed as that of recursively spelling out the truth conditions of sentences (via their logical form). The truth conditions of simple sentences like “Pavarotti sings” are given in terms of the reference of the words involved (cf. REFERENCE, THEORIES OF). Thus “Pavarotti sings” is true (in a certain moment t) if Pavarotti is in fact the agent of an action of singing (at t). Truth conditions of complex sentences (like “Pavarotti sings or Domingo sings”) involve figuring out the contributions to truth conditions of words like “or.” According to this program, giving the semantics of the logical form of natural language sentences is closely related to the way we figure out the semantics of any logical system.

Entailment, though not the only kind of important semantic relation, is certainly at the heart of a net of key phenomena. Consider for example the following pair:

(25) a. At least two students who read a book on linguistics by Chomsky were in the audience.
     b. At least two students who read a book by Chomsky were in the audience.

Clearly, (25a) entails (25b). It cannot be the case that (25a) is true while simultaneously (25b) is false. We simply know this a priori. And it is perfectly general: if “at least two As B” is the case and if the Cs form a superset of the As (as the books by Chomsky are a superset of the books on linguistics by Chomsky), then “at least two Cs B” must also be the case. This must be part of what “at least two” means. For “at most two” the opposite is the case:

(26) a. At most two students who read a book on linguistics by Chomsky were in the audience.
     b. At most two students who read a book by Chomsky were in the audience.

Here, (26a) does not entail (26b). It can well be the case that no more than two students read a book on linguistics by Chomsky, but more than two read books (on, say, politics) by Chomsky. What happens is that (26b) entails (26a). That is, if (26b) is the case, then (26a) cannot be false. Now there must be something in our head that enables us to converge on these judgments. That something must be constitutive of our knowledge of the meaning of the sentences in (25) and (26). Notice that our entailment judgment need not be immediate. To see that in fact (26b) entails (26a) requires some reflection. Yet any normal speaker of English will eventually converge in judging that in any situation in which (26b) is true, (26a) has also got to be.

The relevance of entailment for natural language is one of the main discoveries of modern semantics. I will illustrate it in what follows with one famous example, having to do with the distributional properties of words like “any” (cf. Ladusaw 1979, 1992 and references therein). A word like “any” has two main uses. The first is exemplified in (27a):

(27) a. You may pick any apple.
     b. A: Can I talk to John or is he busy with students now?
     c. B: No, wait. *He is talking to any student.
     c'. B: No, wait. He is talking to every student.
     c''. B: Go ahead. He isn’t talking to any student right now.

The use exemplified by (27a) is called free choice “any.” It has a universal interpretation: sentence (27a) says that for every apple x, you are allowed to pick x. This kind of “any” seems to require a special modality of some kind (see, e.g., Dayal 1998 and references therein).
Such a requirement is brought out by the strangeness of sentences like (27c) (the asterisk indicates deviance), which, in the context of (27b), clearly describes an ongoing happening with no special modality attached. Free choice “any” seems incompatible with a plain descriptive mode (and contrasts in this with “every”; cf. [27c']). The other use of “any” is illustrated by (27c''). Even though this sentence, understood as a reply to (27b), reports on an ongoing happening, it is perfectly grammatical. What seems to play a crucial role is the presence of negation. Non-free-choice “any” seems to require a negative context of some kind and is therefore called a negative polarity item. It is part of a family of expressions that includes, for example, things like “ever” or “give a damn”:

(28) a. *John gives a damn about linguistics.
     b. John doesn't give a damn about linguistics.
     c. *For a long time John ever ate chicken.
     d. For a long time, John didn't ever eat chicken.

In English the free choice and the negative polarity senses of “any” are expressed by the same morpheme. But in many languages (e.g., most Romance languages) they are expressed by different words (for example, in Italian free choice “any” translates as “qualunque,” and negative polarity “any” translates as “alcuno”). Thus, while the two senses might well be related, it is useful to keep them apart in investigating the behavior of “any.” In what follows, we will concentrate on negative polarity “any” (and thus the reader is asked to abstract away from imagining the following examples in contexts that would make the free choice interpretation possible).

The main puzzle in the behavior of words like “any” is understanding what exactly constitutes a “negative” context. Consider for example the following set of sentences:

(29) a. *Yesterday John read any book.
     b. Yesterday John didn't read any book.
     c. *A student who read any book by Chomsky will want to miss his talk.
     d. No student who read any book by Chomsky will want to miss his talk.

In cases such as these, we can rely on morphology: we actually see there the negative morpheme “no” or some of its morphological derivatives. But what about the following cases?

(30) a. *At least two students who read any book by Chomsky were in the audience.
     b. At most two students who read any book by Chomsky were in the audience.

In (30b), where “any” is acceptable, there is no negative morpheme or morphological derivative thereof. This might prompt us to look for a different way of defining the notion of negative context, maybe a semantic one. Here is a possibility: A logical property of negation is that of licensing entailments from sets to their subsets. Consider for example the days in which John read a book by Chomsky. They must form a subset of the days in which he read a book. This is reflected in the fact that (31a) entails (31b):

(31) a. It is not the case that yesterday John read a book.
     b. It is not the case that yesterday John read a book by Chomsky.

In (31) the entailment goes from a set (the set of days in which John read a book) to its subsets (e.g., the set of days in which John read a book by Chomsky). Now this seems to be precisely what sentential negation, negative determiners like “no,” and determiners like “at most n” have in common: they all license inferences from sets to subsets thereof. We have already seen that “at most” has precisely this property. To test whether our hypothesis is indeed correct and fully general, we should find something seemingly utterly “non-negative,” which, however, has the property of licensing entailments from sets to subsets. The determiner “every” gives us what we need. Such a determiner does not appear to be in any reasonable sense “negative,” yet, within a noun phrase headed by “every,” the entailment clearly goes from sets to subsets:

(32) a. Every employee who smokes will be terminated.
     b. Every employee who smokes cigars will be terminated.
If (32a) is true, then (32b) must also be. And the set of cigar smokers is clearly a subset of the set of smokers. If “any” wants to be in an environment with these entailment properties, then it should be grammatical within an NP headed by “every.” This is indeed so:

(33) Every student who read any book by Chomsky will want to come to his talk.

So the principle governing the distribution of “any” seems to be:

(34) “any” must occur in a context that licenses entailments from sets to their subsets.

Notice that within the VP in sentences like (32), the entailment to subsets does not hold.

(35) a. Every employee smokes.
     b. Every employee smokes cigars.

Sentence (35a) does not entail sentence (35b); in fact the opposite is the case. And sure enough, within the VP “any” is not licensed (I give also a sentence with “at most n” for contrast):

(36) a. *Every student came to any talk by Chomsky.
     b. At most two students came to any talk by Chomsky.

Surely no one explicitly taught us these facts. No one taught us that “any” is acceptable within an NP headed by “every,” but not within a VP of which an “every”-headed NP is subject. Yet we come to have convergent intuitions on these matters. Again, something in our mental endowment must be responsible for such judgments. What is peculiar to the case at hand is that the overt distribution of a class of morphemes like “any” appears to be sensitive to the entailment properties of their context. In particular, it appears to be sensitive to a specific logical property, that of licensing inferences from sets to subsets, which “no,” “at most n,” and “every” share with sentential negation. It is worth noting that most languages have negative polarity items and their properties tend to be the same as “any,” with minimal variations (corresponding to degrees of “strength” of negativity).
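The logical property appealed to in (34) can be tested mechanically on a small domain. The sketch below assumes a standard set-theoretic treatment of determiners as relations between two sets (the set A contributed by the noun phrase and the set B contributed by the VP); this treatment, the four-element universe, and the names `DETERMINERS` and `licenses_subset_inferences` are assumptions of the example, not part of the text. For each determiner and each position, the function checks exhaustively whether truth is preserved when the set in that position is replaced by any of its subsets.

```python
from itertools import combinations

UNIVERSE = frozenset(range(4))  # a small made-up domain of individuals


def subsets(s):
    """All subsets of a finite set, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


# Determiner meanings as relations between the NP set A and the VP set B.
DETERMINERS = {
    "no": lambda A, B: len(A & B) == 0,
    "every": lambda A, B: A <= B,
    "at least two": lambda A, B: len(A & B) >= 2,
    "at most two": lambda A, B: len(A & B) <= 2,
}


def licenses_subset_inferences(det, position):
    """True if det(A, B) stays true whenever the set in `position` is shrunk.

    position is "NP" (shrink A) or "VP" (shrink B).
    """
    for A in subsets(UNIVERSE):
        for B in subsets(UNIVERSE):
            if not det(A, B):
                continue
            shrinkable = A if position == "NP" else B
            for smaller in subsets(shrinkable):
                still_true = det(smaller, B) if position == "NP" else det(A, smaller)
                if not still_true:
                    return False
    return True


for name, det in DETERMINERS.items():
    print(name,
          "NP:", licenses_subset_inferences(det, "NP"),
          "VP:", licenses_subset_inferences(det, "VP"))
```

On this small domain the check lines up with the judgments reported in (29) through (36): “no” and “at most two” pass in both positions, “every” passes only in the NP position, and “at least two” passes in neither. An exhaustive check over one finite universe is of course only an illustration, not a proof of the general property.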
The distribution of “any” illustrates how there are specific architectural features of grammar that cannot be accounted for without a semantic theory of entailment for natural language. And it is difficult to see how to build such a theory without resorting to a compositional assignment of truth conditions to syntactic structures (or something that enables us to derive the same effects; cf. DYNAMIC SEMANTICS). The case of negative polarity is by no means isolated. Many other phenomena could be used to illustrate this point (e.g., FOCUS; TENSE AND ASPECT). But the illustration just given will have to suffice for our present purposes. It is an old idea that we understand each other because our language, in spite of its VAGUENESS, has a logic. Now this idea is no longer just an intriguing hypothesis. The question on the table is no longer whether this is true. The question is what the exact syntactic and semantic properties of this logic are.

See also COMPOSITIONALITY; DYNAMIC SEMANTICS; FOCUS; REFERENCE, THEORIES OF; SEMANTICS; TENSE AND ASPECT; VAGUENESS

3 Language Use

Ultimately, the goal of a theory of language is to explain how language is used in concrete communicative situations. So far we have formulated the hypothesis that at the basis of linguistic behavior there is a competence constituted by blocks of rules or systems of principles, responsible for sound structure, morphological structure, and so on. Each block constitutes a major module of our linguistic competence, which can in turn be articulated into further submodules. These rule systems are then put to use by the speakers in speech acts. In doing so, the linguistic systems interact in complex ways with other aspects of our cognitive apparatus as well as with features of the environment. We now turn to a consideration of these dimensions.

Language in Context

The study of the interaction of grammar with the CONTEXT of use is called PRAGMATICS.
Pragmatics looks at sentences within both the extralinguistic situation and the DISCOURSE of which they are part. For example, one aspect of pragmatics is the study of INDEXICALS AND DEMONSTRATIVES (like “I,” “here,” “now,” etc.) whose meaning is fixed by the grammar but whose reference varies with the context. Another important area is the study of PRESUPPOSITION, that is, what is taken for granted in uttering a sentence. Consider the difference between (37a) and (37b):

(37) a. John ate the cake.
     b. It is John that ate the cake.

How do they differ? Sentence (37a) entails that someone ate a cake. Sentence (37b), instead, takes it for granted that someone did and asserts that that someone is John. Thus, there are grammatical constructs such as clefting, exemplified in (37b), that appear to be specially linked to presupposition. Just as we have systematic intuitions about entailments, we have them about presuppositions and how they are passed from simple sentences to more complex ones.

Yet another aspect of pragmatics is the study of how we virtually always go beyond what is literally said. In ordinary conversational exchanges, one and the same sentence, for example, “the dog is outside,” can acquire the illocutionary force of a command (“go get it”), of a request (“can you bring it in?”), of an insult (“you are a servant; do your duty”), or can assume all sorts of metaphorical or ironical colorings, and so on, depending on what the situation is, what is known to the illocutionary agents, and so on. A breakthrough in the study of these phenomena is due to the work of P. GRICE. Grice put on solid ground the commonsense distinction between literal meaning, that is, the interpretation we assign to sentences in virtue of rules of grammar and linguistic conventions, and what is conveyed or implicated, as Grice puts it, beyond the literal meaning.
Grice developed a theory of IMPLICATURE based on the idea that in our use of grammar we are guided by certain general conversational norms to which we spontaneously tend to conform. Such norms instruct us to be cooperative, truthful, orderly, and relevant (cf. RELEVANCE AND RELEVANCE THEORY). These are norms that can be ignored or even flouted. By exploiting both the norms and their violations systematically, thanks to the interaction of literal meaning and mutually shared information present in the context, the speaker can put the hearer in the position of inferring his communicative intentions (i.e., what is implicated). Some aspects of pragmatics (e.g., the study of deixis or presupposition) appear to involve grammar-specific rule systems; others, such as implicature, involve more general cognitive abilities. All of them appear to be rule governed.

See also CONTEXT AND POINT OF VIEW; DISCOURSE; GRICE, PAUL; IMPLICATURE; INDEXICALS AND DEMONSTRATIVES; PRAGMATICS; PRESUPPOSITION; RELEVANCE AND RELEVANCE THEORY

Language in Flux

Use of language is an important factor in language variation. Certain forms of variation tend to be a constant and relatively stable part of our behavior. We all master a number of registers and styles; often a plurality of grammatical norms is present in the same speakers, as in the case of bilinguals. Such coexisting norms affect one another in interesting ways (see CODESWITCHING). These phenomena, as well as pragmatically induced deviations from a given grammatical norm, can also result in actual changes in the prevailing grammar. Speakers' creative uses can bring about innovations that become part of grammar. On a larger scale, languages come into contact through a variety of historical events and social dynamics, again resulting in changes. Some such changes come about in a relatively abrupt manner and simultaneously involve many aspects of grammar.
A case often quoted in this connection is the Great Vowel Shift, which radically changed the vowel space of English toward the end of the Middle English period. The important point is that the dynamic of linguistic change seems to take place within the boundaries of Universal Grammar as charted through synchronic theory (cf. LINGUISTIC UNIVERSALS AND UNIVERSAL GRAMMAR). In fact, it was precisely the discovery of the regularity of change (e.g., Grimm’s law) that led to the discovery of linguistic structure.

A particularly interesting vantage point on linguistic change is provided by the study of CREOLES (Bickerton 1975, 1981). Unlike most languages, which evolve from a common ancestor (sometimes a hypothesized protolanguage, as in the case of the Indo-European family), Creoles arise from communities of speakers that do not share a native language. A typical situation is that of slaves or workers who are brought together by a dominating group and who develop an impoverished quasi-language (a pidgin) in order to communicate with one another. Such quasi-languages typically have a small vocabulary drawn from several sources (the language of the dominating group or the native languages of the speakers), no fixed word order, and no inflection. The process of creolization takes place when such a language starts having its own native speakers, that is, speakers born to the relevant groups who start using the quasi-language of their parents as a native language. What typically happens is that all of a sudden the characteristics of a full-blown natural language come into being (morphological markers for agreement, case endings, modals, tense, grammaticized strategies for focusing, etc.). This process, which in a few lucky cases has been documented, takes place very rapidly, perhaps even within a single generation.
This has led Bickerton to formulate an extremely interesting hypothesis, that of a “bioprogram,” that is, a species-specific acquisition device, part of our genetic endowment, that supplies the necessary grammatical apparatus even when such an apparatus is not present in the input. This raises the question of how such a bioprogram has evolved in our species, a topic that has been at the center of much speculation (see EVOLUTION OF LANGUAGE). A much debated issue is the extent to which language has evolved through natural selection, in the ways complex organs like the eye have. Although not much is yet known or agreed upon on this score, progress in the understanding of our cognitive abilities and of the neurological basis of language is constant and is likely to lead to a better understanding of language evolution (also through comparisons of the communication systems of other species; see ANIMAL COMMUNICATION; PRIMATE LANGUAGE).

See also ANIMAL COMMUNICATION; CODESWITCHING; CREOLES; EVOLUTION OF LANGUAGE; LINGUISTIC UNIVERSALS AND UNIVERSAL GRAMMAR; PRIMATE LANGUAGE

Language in the Mind

The cognitive turn in linguistics has brought together in a particularly fruitful manner the study of grammar with the study of the psychological processes at its basis on the one hand and the study of other forms of cognition on the other. PSYCHOLINGUISTICS deals with how language is acquired (cf. LANGUAGE ACQUISITION) and processed in its everyday uses (cf. NATURAL LANGUAGE PROCESSING; SENTENCE PROCESSING). It also deals with language pathology, such as APHASIA and various kinds of developmental impairments (see LANGUAGE IMPAIRMENT, DEVELOPMENTAL).

With regard to acquisition, the available evidence points consistently in one direction. The kind of implicit knowledge at the basis of our linguistic behavior appears to be fairly specialized.
Among all the possible ways to communicate and all the possible structures that a system of signs can have, those that are actualized in the languages of the world appear to be fairly specific. Languages exploit only some of the logically conceivable (and humanly possible) sound patterns, morphological markings, and syntactic and semantic devices. Here we could give just a taste of how remarkable the properties of natural languages are. And it is not obvious how such properties, so peculiar among possible semiotic systems, can be accounted for in terms of, say, pragmatic effectiveness or social conventions or cultural inventiveness (cf. SEMIOTICS AND COGNITION). In spite of this, the child masters the structures of her language without apparent effort or explicit training, and on the basis of an often very limited and impoverished input. This is strikingly so in the case of creolization, but it applies to a significant degree also to “normal” learning. An extensive literature documents this claim in all the relevant domains (see WORD MEANING, ACQUISITION OF; PHONOLOGY, ACQUISITION OF; SYNTAX, ACQUISITION OF; SEMANTICS, ACQUISITION OF). It appears that language “grows into the child,” to put it in Chomsky’s terms; or that the child “invents” it, to put it in Pinker’s words. These considerations could not but set the debate on NATIVISM on a new and exciting footing. At the center of intense investigations there is the hypothesis that a specialized form of knowledge, Universal Grammar, is part of the genetic endowment of our species, and thus constitutes the initial state for the language learner. The key to learning, then, consists in fixing what Universal Grammar leaves open (see PARAMETER-SETTING APPROACHES TO ACQUISITION, CREOLIZATION, AND DIACHRONY). On the one hand, this involves setting the parameters of variation, the “switches” made available by Universal Grammar.
On the other hand, it also involves exploiting, for various purposes such as segmenting the stream of sound into words, generalized statistical abilities that we also seem to have (see Saffran, Aslin, and Newport 1996). The interesting problem is determining what device we use in what domain of LEARNING. The empirical investigation of child language proceeds in interaction with the study of the formal conditions under which acquisition is possible, which has also proven to be a useful tool in investigating these issues (cf. ACQUISITION, FORMAL THEORIES OF).

Turning now to processing, planning a sentence, building it up, and uttering it requires a remarkable amount of cognitive work (see LANGUAGE PRODUCTION). The same applies to going from the continuous stream of speech sounds (or, in the case of sign languages, gestures) to syntactic structure and from there to meaning (cf. PROSODY AND INTONATION, PROCESSING ISSUES; SPEECH PERCEPTION; SPOKEN WORD RECOGNITION; VISUAL WORD RECOGNITION). The measure of the difficulty of this task can in part be seen by how partial our progress is in programming machines to accomplish related tasks such as going from sounds to written words, or analyzing an actual text, even on a limited scale (cf. COMPUTATIONAL LINGUISTICS; COMPUTATIONAL LEXICONS). The actual use of sentences in an integrated discourse is an extremely complex set of phenomena. Although we are far from understanding it completely, significant discoveries have been made in the last decades, thanks in part to the advances in linguistic theory. I will illustrate this with one well-known issue in sentence processing. The recursive character of natural language syntax enables us to construct sentences of indefinite length and complexity:

(38) a. The boy saw the dog.
     b. The boy saw the dog that bit the cat.
     c. The boy saw the dog that bit the cat that ate the mouse.
     d. The boy saw the dog that bit the cat that ate the mouse that stole the cheese.

In sentence (38b), the object is modified by a relative clause. In (38c) the object of the first relative clause is modified by another relative clause. And we can keep doing that. The results are not particularly hard to process. Now, subjects can also be modified by relative clauses:

(39) The boy that the teacher called on saw the dog.

But try now modifying the subject of the relative clause. Here is what we get:

(40) The boy that the teacher that the principal hates called on saw the dog.

Sentence (40) is hard to grasp. It is formed through the same grammatical devices we used in building (39). Yet the decrease in intelligibility from (39) to (40) is quite dramatic. Only after taking the time to look at it carefully can we see that (40) makes sense. Adding a further layer of modification in the most embedded relative clause in (40) would make it virtually impossible to process. So there is an asymmetry between adding modifiers to the right (in English, the recursive side) and adding them to the center of a clause (center embedding). The phenomenon is very general. What makes it particularly interesting is that the oddity, if it can be called such, of sentences like (40) does not seem to be due to the violation of any known grammatical constraint. It must be linked to how we parse sentences, that is, how we attach to them a syntactic analysis as a prerequisite to semantic interpretation. Many theories of sentence processing address this issue in interesting ways. The phenomenon of center embedding illustrates well how related but autonomous devices (in this case, the design of grammar vis-à-vis the architecture of the parser) interact in determining our behavior.
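The contrast between (38) and (40) can be made concrete with a small sketch. It builds both kinds of sentence from the same recursive pattern and then counts, word by word, how many nouns are still waiting for a verb. The word lists, the function names, and the "pending nouns" count are assumptions introduced for illustration only; the count is a crude stand-in for memory load, not a theory of parsing drawn from the text.

```python
# Illustrative sketch only. The word lists and the "pending nouns" count are
# assumptions made for this example; they are not a model from the text.

NOUNS = {"boy", "dog", "cat", "mouse", "cheese", "teacher", "principal"}
VERBS = {"saw", "bit", "ate", "stole", "called", "hates"}


def right_branching(subject, steps):
    """Chain relative clauses on the right: each (verb, noun) step extends the end."""
    words = ["the", subject]
    for i, (verb, noun) in enumerate(steps):
        if i > 0:
            words.append("that")
        words += [verb, "the", noun]
    return " ".join(words)


def center_embedded(subjects, predicates):
    """Nest clauses in the middle: subjects[i] is completed by predicates[i]."""
    opening = " that ".join(f"the {s}" for s in subjects)
    closing = " ".join(reversed(predicates))
    return f"{opening} {closing}"


def max_pending(sentence):
    """Count each noun as pending until a later verb discharges it; return the peak."""
    pending = peak = 0
    for word in sentence.split():
        if word in NOUNS:
            pending += 1
            peak = max(peak, pending)
        elif word in VERBS and pending > 0:
            pending -= 1
    return peak


# Sentence (38d) and sentence (40), in lower case.
s38 = right_branching(
    "boy", [("saw", "dog"), ("bit", "cat"), ("ate", "mouse"), ("stole", "cheese")]
)
s40 = center_embedded(
    ["boy", "teacher", "principal"], ["saw the dog", "called on", "hates"]
)

print(s38, max_pending(s38))
print(s40, max_pending(s40))
```

With these inputs the count never exceeds 1 for (38d), however long the sentence grows, whereas it reaches 3 for (40). Both sentences come from the same grammatical device, so on this rough measure the difference lies in what must be held in memory during processing, which is in keeping with locating the difficulty in the parser rather than in the grammar.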
See also ACQUISITION, FORMAL THEORIES OF; APHASIA; COMPUTATIONAL LEXICONS; COMPUTATIONAL LINGUISTICS; LANGUAGE ACQUISITION; LANGUAGE PRODUCTION; LEARNING; NATIVISM; NATURAL LANGUAGE PROCESSING; PARAMETER-SETTING APPROACHES TO ACQUISITION, CREOLIZATION, AND DIACHRONY; PHONOLOGY, ACQUISITION OF; PROSODY AND INTONATION, PROCESSING ISSUES; PSYCHOLINGUISTICS; SEMANTICS, ACQUISITION OF; SEMIOTICS AND COGNITION; SENTENCE PROCESSING; SPEECH PERCEPTION; SPOKEN WORD RECOGNITION; SYNTAX, ACQUISITION OF; VISUAL WORD RECOGNITION; WORD MEANING, ACQUISITION OF

4 Concluding Remarks

Language is important for many fairly obvious and widely known reasons. It can be put to an enormous range of uses; it is the main tool through which our thought gets expressed and our modes of reasoning become manifest. Its pathologies reveal important aspects of the functioning of the brain (cf. LANGUAGE, NEURAL BASIS OF); its use in HUMAN-COMPUTER INTERACTION is ever more a necessity (cf. SPEECH RECOGNITION IN MACHINES; SPEECH SYNTHESIS). These are all well-established motivations for studying it. Yet one of the most interesting things about language is in a way independent of them. What makes the study of language particularly exciting is the identification of regularities and the discovery of the laws that determine them. Often unexpectedly, we detect in our behavior, in our linguistic judgments or through experimentation, a pattern, a regularity. Typically, such regularities present themselves as intricate; they concern exotic data that are hidden in remote corners of our linguistic practice. Why do we have such solid intuitions about such exotic aspects of, say, the functioning of pronouns or the distribution of negative polarity items? How can we have acquired such intuitions? With luck, we discover that at the basis of these intricacies there are some relatively simple (if fairly abstract) principles.
Because speaking is a cognitive ability, whatever principles are responsible for the relevant pattern of behavior must be somehow implemented or realized in our head. Hence, they must grow in us, will be subject to pathologies, and so on. The cognitive turn in linguistics, through the advent of the generative paradigm, has not thrown away traditional linguistic inquiry. Linguists still collect and classify facts about the languages of the world, but in a new spirit (with arguably fairly old roots): that of seeking out the mental mechanisms responsible for linguistic facts. Hypotheses on the nature of such mechanisms in turn lead to new empirical discoveries, make us see things we had previously missed, and so on through a new cycle. In full awareness of the limits of our current knowledge and of the disputes that cross the field, it seems impossible to deny that progress over the last 40 years has been quite remarkable. For one thing, we just know more facts (facts not documented in traditional grammars) about more languages. For another thing, the degree of theoretical sophistication is high, I believe higher than it ever was, not only in the degree of formalization (which, in a field traditionally so prone to bad philosophizing, has its importance), but mainly in the interesting ways in which arrays of complex properties get reduced to ultimately simple axioms. Finally, the cross-disciplinary interaction on language is also a measure of the level the field is at. Abstract modeling of linguistic structure leads quite directly to psychological experimentation and to neurophysiological study and vice versa (see, e.g., GRAMMAR, NEURAL BASIS OF; LEXICON, NEURAL BASIS OF; BILINGUALISM AND THE BRAIN). As Chomsky puts it, language appears to be the first form of higher cognitive capacity that is beginning to yield. We have barely begun to reap the fruits of this fact for the study of cognition in general.
See also BILINGUALISM AND THE BRAIN; GRAMMAR, NEURAL BASIS OF; HUMAN-COMPUTER INTERACTION; LANGUAGE, NEURAL BASIS OF; LEXICON, NEURAL BASIS OF; SPEECH RECOGNITION IN MACHINES; SPEECH SYNTHESIS

References

Akmajian, A., R. Demers, A. Farmer, and R. Harnish. (1990). Linguistics: An Introduction to Language and Communication. 4th ed. Cambridge, MA: MIT Press.
Baker, M. (1996). The Polysynthesis Parameter. Oxford: Oxford University Press.
Bickerton, D. (1975). The Dynamics of a Creole System. Cambridge: Cambridge University Press.
Bickerton, D. (1981). Roots of Language. Ann Arbor, MI: Karoma.
Chien, Y.-C., and K. Wexler. (1990). Children’s knowledge of locality conditions in binding as evidence for the modularity of syntax and pragmatics. Language Acquisition 1: 225–295.
Crain, S., and C. McKee. (1985). The acquisition of structural restrictions on anaphora. In S. Berman, J. Choe, and J. McDonough, Eds., Proceedings of the Eastern States Conference on Linguistics. Ithaca, NY: Cornell University Linguistic Publications.
Clark, H., and E. Clark. (1977). The Psychology of Language. New York: Harcourt Brace Jovanovich.
Cheng, L. (1991). On the Typology of Wh-Questions. Ph.D. diss., MIT. Distributed by MIT Working Papers in Linguistics.
Chung, S., and J. McCloskey. (1987). Government, barriers and small clauses in Modern Irish. Linguistic Inquiry 18: 173–238.
Dayal, V. (1998). Any as inherent modal. Linguistics and Philosophy.
Fodor, J. A., T. Bever, and M. Garrett. (1974). The Psychology of Language. New York: McGraw-Hill.
Grodzinsky, Y., and T. Reinhart. (1993). The innateness of binding and coreference. Linguistic Inquiry 24: 69–101.
Huang, J. (1982). Grammatical Relations in Chinese. Ph.D. diss., MIT. Distributed by MIT Working Papers in Linguistics.
Ladusaw, W. (1979). Polarity Sensitivity as Inherent Scope Relation. Ph.D. diss., University of Texas, Austin. Distributed by IULC, Bloomington, Indiana (1980).
Ladusaw, W. (1992). Expressing negation. SALT II. Ithaca, NY: Cornell Linguistic Circle.
Pinker, S. (1994). The Language Instinct. New York: William Morrow.
Saffran, J., R. Aslin, and E. Newport. (1996). Statistical learning by 8-month-old infants. Science 274: 1926–1928.

Further Readings

Aronoff, M. (1976). Word Formation in Generative Grammar. Cambridge, MA: MIT Press.
Atkinson, M. (1992). Children’s Syntax. Oxford: Blackwell.
Brent, M. R. (1997). Computational Approaches to Language Acquisition. Cambridge, MA: MIT Press.
Chierchia, G., and S. McConnell-Ginet. (1990). Meaning and Grammar: An Introduction to Semantics. Cambridge, MA: MIT Press.
Chomsky, N. (1981). Lectures on Government and Binding. Dordrecht: Foris.
Chomsky, N. (1987). Language and Problems of Knowledge: The Managua Lectures. Cambridge, MA: MIT Press.
Chomsky, N. (1995). The Minimalist Program. Cambridge, MA: MIT Press.
Chomsky, N., and M. Halle. (1968). The Sound Pattern of English. New York: Harper and Row.
Elman, J. L., E. A. Bates, M. H. Johnson, A. Karmiloff-Smith, D. Parisi, and K. Plunkett. (1996). Rethinking Innateness: A Connectionist Perspective on Development. Cambridge, MA: MIT Press.
Gleitman, L., and B. Landau, Eds. (1994). The Acquisition of the Lexicon. Cambridge, MA: MIT Press.
Haegeman, L. (1990). An Introduction to Government and Binding Theory. 2nd ed. Oxford: Blackwell.
Hauser, M. D. (1996). The Evolution of Communication. Cambridge, MA: MIT Press.
Jusczyk, P. W. (1997). The Discovery of Spoken Language. Cambridge, MA: MIT Press.
Kenstowicz, M., and C. Kisseberth. (1979). Generative Phonology: Description and Theory. New York: Academic Press.
Ladefoged, P. (1982). A Course in Phonetics. 2nd ed. New York: Harcourt Brace Jovanovich.
Levinson, S. (1983). Pragmatics. Cambridge: Cambridge University Press.
Lightfoot, D. (1991). How to Set Parameters: Arguments from Language Change. Cambridge, MA: MIT Press.
Ludlow, P., Ed. (1997). Readings in the Philosophy of Language. Cambridge, MA: MIT Press.
Osherson, D., and H. Lasnik. (1981). Language: An Invitation to Cognitive Science. Cambridge, MA: MIT Press.
Stevens, K. N. (1998). Acoustic Phonetics. Cambridge, MA: MIT Press.

Culture, Cognition, and Evolution

Dan Sperber and Lawrence Hirschfeld

Most work in the cognitive sciences focuses on the manner in which an individual device (be it a mind, a brain, or a computer) processes various kinds of information. Cognitive psychology in particular is primarily concerned with individual thought and behavior. Individuals, however, belong to populations. This is true in two quite different senses. Individual organisms are members of species and share a genome and most phenotypic traits with the other members of the same species. Organisms essentially have the cognitive capacities characteristic of their species, with relatively superficial individual variations. In social species, individuals are also members of groups. An important part of their cognitive activity is directed toward other members of the group with whom they cooperate and compete. Among humans in particular, social life is richly cultural. Sociality and culture are made possible by cognitive capacities, contribute to the ontogenetic and phylogenetic development of these capacities, and provide specific inputs to cognitive processes.

Although population-level phenomena influence the development and implementation of cognition at the individual level, relevant research on these phenomena has not been systematically integrated within the cognitive sciences. In good part, this is due to the fact that these issues are approached by scholars from a wide range of disciplines, working within quite different research traditions.
To the extent that researchers rely on methodological and theoretical practices that are sometimes difficult to harmonize (e.g., controlled laboratory versus naturalistic observations), the influence of these insights across disciplines and traditions of research is often unduly limited, even on scholars working on similar problems. Moreover, one of the basic notions that should bring together these researchers, the very notion of culture, is developed in radically different ways, and is, if anything, a source of profound disagreements.

The whole area reviewed in this chapter is fraught with polemics and misunderstandings. No one can claim an ecumenical point of view or even a thorough competence. We try to be fair to the many traditions of research we consider and to highlight those that seem to us most important or promising. We are very aware of the fact that the whole area could be reviewed no less fairly but from a different vantage point, yielding a significantly different picture. We hope, at least, to give some sense of the relevance of the issues, of the difficulty involved in studying them, and of the creativity of scholars who have attempted to do so.

To better appreciate the combined importance of work on population-level phenomena, we sort relevant research into three categories:

1. Cognition in a comparative and evolutionary perspective
2. Culture in an evolutionary and cognitive perspective
3. Cognition in an ecological, social, and cultural perspective

1 Cognition in a Comparative and Evolutionary Perspective

Humans spontaneously attribute to nonhuman animals mental states similar to their own, such as desires and beliefs. Nevertheless, it has been commonplace, grounded in Western religion and philosophy, to think of humans as radically different from other species, and as being unique in having a true mind and soul.
Charles Darwin’s theory of EVOLUTION based on natural selection challenged this classical dichotomy between “man and beast.” In the controversies that erupted, anecdotal examples of animal intelligence were used by DARWIN and his followers to question the discontinuity between humans and other species. Since that time, the study of animal behavior has been pursued by zoologists working on specific species and using more and more rigorous methods of observation. However, until recently, and with some notable exceptions such as the pioneering work of Wolfgang Köhler on chimpanzees (see GESTALT PSYCHOLOGY), zoological observation had little impact on psychology.

Psychologists too were influenced by Darwin and espoused, in an even more radical form, the idea that fundamentally there is no difference between the psychology of humans and that of other animals. Drawing in particular on the work of Edward Thorndike and Ivan Pavlov on CONDITIONING, behaviorists developed the view that a single set of laws governs LEARNING in all animals. Whereas naturalists insisted that animal psychology was richer and more human-like than was generally recognized, behaviorist psychologists insisted that human psychology was poorer and much more animal-like than we would like to believe. In this perspective, the psychology of cats, rats, and pigeons was worth studying not in order to better understand these individual species, but in order to discover universal psychological laws that apply to humans as well, in particular laws of learning. COMPARATIVE PSYCHOLOGY developed in this behavioristic tradition. It made significant contributions to the methodology of the experimental study of animal behavior, but it has come under heavy criticism for its neglect of what is now called ECOLOGICAL VALIDITY and for its narrow focus on quantitative rather than qualitative differences in performance across species.
This lack of interest in natural ecologies or species-specific psychological adaptations, in fact, is profoundly anti-Darwinian. For behaviorists, behavior is very much under the control of forces acting on the organism from without, such as external stimulations, as opposed to internal forces such as instincts.

After 1940, biologically inspired students of animal behavior, under the influence of Konrad Lorenz, Karl von Frisch, and Niko Tinbergen, and under the label of ETHOLOGY, drew attention to the importance of instincts and species-specific “fixed action patterns.” In the ongoing debate on innate versus acquired components of behavior, they stressed the innate side in a way that stirred much controversy, especially when Lorenz, in his book On Aggression (1966), argued that humans have strong innate dispositions to aggressive behavior. More innovatively, ethologists made clear that instinct and learning are not to be thought of as antithetic forces: various learning processes (such as “imprinting” or birds’ learning of songs) are guided by an instinct to seek specific information in order to develop specific competencies.

By stressing the importance of species-specific psychological mechanisms, ethologists have shown every species (not just humans) to be, to some interesting extent, psychologically unique. This does not address the commonsense and philosophical interest (linked to the issue of the rights of animals) in the commonalities between the human psyche and that of other animals. Do other animals think? How intelligent are they? Do they have conscious experiences? Under the influence of Donald Griffin, researchers in COGNITIVE ETHOLOGY have tried to answer these questions (typically in the positive) by studying animals, preferably in their natural environment, through observation complemented by experimentation.
This has meant accepting some of what more laboratory-oriented psychologists disparagingly call “anecdotal evidence” and has led to methodological controversies.

Work on PRIMATE COGNITION has been of special importance for obvious reasons: nonhuman primates are humans’ closest relatives. The search for similarities between humans and other animals begins, quite appropriately, with apes and monkeys. Moreover, because these similarities are then linked to close phylogenetic relationships, they help situate human cognition in its evolutionary context. This phylogenetic approach has been popularized in works such as Desmond Morris’s The Naked Ape. There have been more scientifically important efforts to link work on apes and on humans. For instance, the study of naïve psychology in humans owes its label, THEORY OF MIND, and part of its inspiration to Premack and Woodruff’s famous article “Does the chimpanzee have a theory of mind?” (1978). As the long history of the study of apes’ linguistic capacities illustrates, however, excessive focus on continuities with the human case can, in the end, be counterproductive (see PRIMATE LANGUAGE). Primate psychology is rich and complex, and highly interesting in its own right.

Different species rely to different degrees and in diverse ways on their psychological capacities. Some types of behavior provide immediate evidence of highly specialized cognitive and motor abilities. ECHOLOCATION, found in bats and in marine mammals, is a striking example. A whole range of other examples of behavior based on specialized abilities is provided by various forms of ANIMAL COMMUNICATION. Communicating animals use a great variety of behaviors (e.g., vocal sounds, electric discharges, “dances,” facial expressions) that rely on diverse sensory modalities, as signals conveying some informational content. These signals can be used altruistically to inform, or selfishly to manipulate.
Emitting, receiving, and interpreting these signals rely on species-specific abilities. Only in the human case has it been suggested, in keeping with the notion of a radical dichotomy between humans and other animals, that the species’ general intelligence provides all the cognitive capacities needed for verbal communication. This view of human linguistic competence has been strongly challenged, under the influence of Noam Chomsky, by modern approaches to LANGUAGE ACQUISITION.

Important aspects of animal psychology are manifested in social behavior. In many mammals and birds, for instance, animals recognize one another individually and have different types of interactions with different members of their group. These relationships are determined not only by the memory of past interactions, but also by kinship relations and hierarchical relationships within the group (see DOMINANCE IN ANIMAL SOCIAL GROUPS). All this presupposes the ability to discriminate individuals and, more abstractly, types of social relationships. In the case of primates, it has been hypothesized that their sophisticated cognitive processes are adaptations to their social rather than their natural environment. The MACHIAVELLIAN INTELLIGENCE HYPOTHESIS, so christened by Richard Byrne and Andrew Whiten (1988), offers an explanation not only of primate intelligence, but also of primates’ ability to enter into strategic interactions with one another, an ability hyperdeveloped in humans, of course.

Many social abilities have fairly obvious functions and it is unsurprising, from a Darwinian point of view, that they should have evolved. (The adaptive value of SOCIAL PLAY BEHAVIOR is less evident and has given rise to interesting debates.)
On the other hand, explaining the very existence of social life presents a major challenge to Darwinian theorizing, a challenge that has been at the center of important recent developments in evolutionary theory and in the relationship between the biological, the psychological, and the social sciences.

Social life implies COOPERATION AND COMPETITION. Competition among organisms plays a central role in classical Darwinism, and is therefore not at all puzzling; but the very existence of cooperation is harder to accommodate in a Darwinian framework. Of course, cooperation can be advantageous to the cooperators. Once cooperation is established, however, it seems that it would invariably be even more advantageous for any would-be cooperator to “defect,” be a “free-rider,” and benefit from the cooperative behavior of others without incurring the cost of being cooperative itself (a problem known in GAME THEORY and RATIONAL CHOICE THEORY as the “prisoner’s dilemma”). Given this, it is surprising that cooperative behavior should ever stabilize in the evolution of a population subject to natural selection.

The puzzle presented by the existence of various forms of cooperation or ALTRUISM in living species has been resolved by W. D. Hamilton’s (1964) work on kin selection and R. Trivers’s (1971) work on reciprocal altruism. A gene for altruism causing an individual to pay a cost, or even to sacrifice itself for the benefit of its kin, may thereby increase the number of copies of this gene in the next generation, not through the descendants of the self-sacrificing individual (who may thereby lose its chance of reproducing at all), but through the descendants of the altruist’s kin, who are likely to carry the very same gene. Even between unrelated individuals, ongoing reciprocal behavior may not only be advantageous to both, but, under some conditions, may be more advantageous than defecting.
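The logic of the dilemma, and of how ongoing reciprocity can change it, can be sketched with a repeated two-player game. The payoff numbers below are conventional textbook values chosen here as an assumption, and the strategy names (`always_defect`, `tit_for_tat`) and the `play` helper are hypothetical names introduced for illustration; none of this is taken from Hamilton's or Trivers's own models.

```python
# Illustrative sketch only. The payoff values are assumed textbook numbers,
# not figures from the text; "C" means cooperate and "D" means defect.

PAYOFF = {  # (my move, partner's move) -> my payoff
    ("C", "C"): 3,
    ("C", "D"): 0,
    ("D", "C"): 5,
    ("D", "D"): 1,
}


def always_defect(own_history, partner_history):
    """Free-rider: never pays the cost of cooperating."""
    return "D"


def tit_for_tat(own_history, partner_history):
    """Reciprocator: cooperate first, then copy the partner's previous move."""
    return partner_history[-1] if partner_history else "C"


def play(strategy_a, strategy_b, rounds):
    """Return the total payoffs of two strategies over repeated encounters."""
    history_a, history_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(history_a, history_b)
        move_b = strategy_b(history_b, history_a)
        score_a += PAYOFF[(move_a, move_b)]
        score_b += PAYOFF[(move_b, move_a)]
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b


print(play(always_defect, tit_for_tat, 1))   # a single encounter
print(play(always_defect, tit_for_tat, 10))  # defector meets reciprocator
print(play(tit_for_tat, tit_for_tat, 10))    # two reciprocators
```

Under these assumed payoffs, defecting pays best in a single encounter (5 against 0). Over ten rounds, however, the defector paired with a reciprocator totals 14, while two reciprocators total 30 each, so sustained reciprocity outperforms defection once interactions are repeated and past behavior is remembered.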
This may in particular be so if there are cheater-detection mechanisms that make cheating a costly choice. It is thus possible to predict, in some cases with remarkable precision, under which circumstances kin selection or reciprocal altruism is likely to evolve.

The study of such cases has been one of the achievements of SOCIOBIOLOGY. In general, sociobiologists aim at explaining behavior, and in particular social behavior, on the assumption that natural selection favors behaviors of an organism that tend to maximize the reproductive success of its genes. Sociobiology, especially as expounded in E. O. Wilson’s book Sociobiology: The New Synthesis (1975) and in his On Human Nature (1978), has been the object of intense controversy. Although some social scientists have espoused a sociobiological approach, the majority have denounced the extension of sociobiological models to the study of human behavior as reductionist and naïve. Sociobiology has had less of an impact, whether positive or negative, on the cognitive sciences. This can probably be explained by the fact that sociobiologists relate behavior directly to biological fitness and are not primarily concerned with the psychological mechanisms that govern behavior.

It is through the development of EVOLUTIONARY PSYCHOLOGY that, in recent years, evolutionary theory has had an important impact on cognitive psychology (Barkow, Cosmides, and Tooby 1992). Unlike sociobiology, evolutionary psychology focuses on what Cosmides and Tooby (1987) have described as the “missing link” (missing, that is, from sociobiological accounts) between genes and behavior, namely the mind. Evolutionary psychologists view the mind as an organized set of mental devices, each having evolved as an adaptation to some specific challenge presented by the ancestral environment.
There is, however, some confusion of labels, with some sociobiologists now claiming evolutionary psychology as a subdiscipline or even describing themselves as evolutionary psychologists.

This perspective may help discover discrete mental mechanisms, the existence of which is predicted by evolutionary considerations, and may help explain the structure and function of known mental mechanisms. As an example of the first type of contribution, the evolutionary psychology of SEXUAL ATTRACTION has produced strong evidence of the existence of a special-purpose adaptation for assessing the attractiveness of potential mates that uses subtle cues such as facial symmetry and waist-to-hip ratio (Symons 1979; Buss 1994). As an example of the second type of contribution, Steven Pinker has argued in The Language Instinct (1994) that the language faculty is an evolved adaptation, many aspects of which are best explained in evolutionary terms. Both types of contribution have stirred intense controversies.

Evolutionary psychology has important implications for the study of culture, significantly different from those of sociobiology. Sociobiologists tend to assume that the behaviors of humans in cultural environments are adaptive. They therefore seek to demonstrate the adaptiveness of cultural patterns of behavior and see such demonstrations as explanations of these cultural patterns. Evolutionary psychologists, on the other hand, consider that evolved adaptations, though of course adaptive in the ancestral environment in which they evolved, need not be equally adaptive in a later cultural environment. Slowly evolving adaptations may have neutral or even maladaptive behavioral effects in a rapidly changing cultural environment. For instance, the evolved disposition to automatically pay attention to sudden loud noises was of adaptive value in the ancestral environment where such noises were rare and very often a sign of danger.
This disposition has become a source of distraction, annoyance, and even pathology in a modern urban environment where such noises are extremely common but are a reliable sign of danger only in specific circumstances, such as when crossing a street. This disposition to pay attention to sudden loud noises is also culturally exploited in a way that is unlikely to significantly affect biological fitness, as when gongs, bells, or hand-clapping are used as conventional signals, or when musicians derive special effect from percussion instruments. Such nonadaptive effects of evolved adaptations may be of great cultural significance.

See also ALTRUISM; ANIMAL COMMUNICATION; COGNITIVE ETHOLOGY; COMPARATIVE PSYCHOLOGY; CONDITIONING; COOPERATION AND COMPETITION; DARWIN, CHARLES; DOMINANCE IN ANIMAL SOCIAL GROUPS; ECHOLOCATION; ECOLOGICAL VALIDITY; ETHOLOGY; EVOLUTION; EVOLUTIONARY PSYCHOLOGY; GAME THEORY; GESTALT PSYCHOLOGY; LANGUAGE ACQUISITION; LEARNING; MACHIAVELLIAN INTELLIGENCE HYPOTHESIS; PRIMATE COGNITION; PRIMATE LANGUAGE; RATIONAL CHOICE THEORY; SEXUAL ATTRACTION, EVOLUTIONARY PSYCHOLOGY OF; SOCIAL PLAY BEHAVIOR; SOCIOBIOLOGY

2 Culture in an Evolutionary and Cognitive Perspective

There are many species of social animals. In some of these species, social groups may share and maintain behaviorally transmitted information over generations. Examples of this are songs specific to local populations of some bird species or nut-cracking techniques among West African chimpanzees. Such populations can be said to have a “culture,” even if in a very rudimentary form. Among human ancestors, the archaeological record contains tools from which a rudimentary technical culture, going back some two million years, can be inferred (see TECHNOLOGY AND HUMAN EVOLUTION), but the existence of complex cultures with rich CULTURAL SYMBOLISM manifested through ritual and art is well evidenced only in the last 40,000 years.
COGNITIVE ARCHAEOLOGY aims in particular at explaining this sudden explosion of culture and at relating it to its cognitive causes and effects.

The study of culture is of relevance to cognitive science for two major reasons. The first is that the very existence of culture is, in essential part, both an effect and a manifestation of human cognitive abilities. The second reason is that the human societies of today culturally frame every aspect of human life, and, in particular, of cognitive activity. This is true of all societies studied by anthropologists, from New Guinea to Silicon Valley. Human cognition takes place in a social and cultural context. It uses tools provided by culture: words, concepts, beliefs, books, microscopes, and computers. Moreover, a great deal of cognition is about social and cultural phenomena.

Thus two possible perspectives, a cognitive perspective on culture and a cultural perspective on cognition, are both legitimate and should be complementary. Too often, however, these two perspectives are adopted by scholars with different training, very different theoretical commitments, and therefore a limited willingness and ability to interact fruitfully. In this section, we take up the first, cognitive perspective on culture, and in the next the second, cultural perspective on cognition, trying to highlight both the difficulties and the opportunities for greater integration.

Let us first underscore two points of general agreement: the recognition of cultural variety, and that of “psychic unity.” The existence of extraordinary cultural variety, well documented by historians and ethnographers, is universally acknowledged. The full extent of this variety is more contentious. For instance, although some would deny the very existence of interesting HUMAN UNIVERSALS in matters cultural, others have worked at documenting them in detail (Brown 1991).
Until the early twentieth century, this cultural variation was often attributed to supposed biological variation among human populations. Coupled with the idea of progress, this yielded the view that, as biological endowment progressed, so did cultural endowment, and that some populations (typically Christian whites) were biologically and culturally superior. This view was never universally embraced. Adolf Bastian and Edward Tylor, two of the founders of anthropology in the nineteenth century, insisted on the “psychic unity” of humankind. FRANZ BOAS, one of the founders of American anthropology, in a resolute challenge to scientific racism, argued that human cultural variations are learned and not inherited. Today, with a few undistinguished exceptions, it is generally agreed among cognitive and social scientists that cultural variation is the effect, not of biological variation, but of a common biological, and more specifically cognitive endowment that, given different historical and ecological conditions, makes this variability possible.

No one doubts that the biologically evolved capacities of humans play a role in their social and cultural life. For instance, humans are omnivorous and, sure enough, their diet varies greatly, both within and across cultures. Or to take another example, humans have poorly developed skills for tree climbing, and, not surprisingly, few human communities are tree-dwelling. But what are the human cognitive capacities actually relevant to understanding cultural variability and other social phenomena, and in which manner are they relevant?

In the social sciences, it has long been a standard assumption that human learning abilities are general and can be applied in the same way to any empirical domain, and that reasoning abilities are equally general and can be brought to bear on any problem, whatever its content.
The human mind, so conceived, is viewed as the basis for an extrasomatic adaptation, culture, that has fundamentally changed the relationship between humans and their environment. Culture permits humans to transcend physical and cognitive limitations through the development and use of acquired skills and artifacts. Thus, humans can fly, scale trees, echolocate, and perform advanced mathematical calculus despite the fact that humans are not equipped with wings, claws, natural sonars, or advanced calculus abilities. Cultural adaptations trump cognitive ones in the sense that cultural skills and artifacts can achieve outcomes unpredicted by human cognitive architecture.

Many social scientists have concluded from this that psychology is essentially irrelevant to the social sciences and to the study of culture in particular. It is, however, possible to think of the mind as a relatively homogeneous general-purpose intelligence, and still attribute to it some interesting role in the shaping of culture. For instance, Lucien Lévy-Bruhl assumed that there was a primitive mentality obeying specific intellectual laws and shaping religious and magical beliefs. BRONISLAW MALINOWSKI sought to explain such beliefs, and culture in general, as a response to biological and psychological needs. CLAUDE LÉVI-STRAUSS explicitly tried to explain culture in terms of the structure of the human mind. He developed the idea that simple cognitive dispositions such as a preference for hierarchical classifications or for binary oppositions played an important role in shaping complex social systems such as kinship and complex cultural representations such as myth.

Most research done under the label COGNITIVE ANTHROPOLOGY (reviewed in D’Andrade 1995) accepts the idea that the human mind applies the same categorization and inference procedures to all cognitive domains.
Early work in this field concentrated on classification and drew its conceptual tools more from semantics and semiotics (see SEMIOTICS AND COGNITION) than from cognitive psychology (which, at the time, was in its infancy). More recently, building on Schank and Abelson’s idea of scripts, cognitive anthropologists have begun to propose that larger knowledge structures (“cultural schemas” or “cultural models”) guide action and belief, in part by activating other related cultural SCHEMATA or models, and as a whole encapsulate tenets of cultural belief. Some of this work has drawn on recent work on FIGURATIVE LANGUAGE, in particular, on METAPHOR (Lakoff and Johnson 1980; Lakoff 1987; Lakoff and Turner 1989) and has focused on cultural models structured in metaphorical terms (see METAPHOR AND CULTURE).

In an extended analysis, Quinn (1987), for instance, identifies a number of interconnecting metaphors for marriage in contemporary North America: marriage is enduring, marriage is mutually beneficial, marriage is unknown at the outset, marriage is difficult, marriage is effortful, marriage is joint, marriage may succeed or fail, marriage is risky. These conjoined metaphors, which together constitute a cultural model, in turn contain within them assumptions derived from models of other everyday domains: the folk physics of difficult activities, the folk social psychology of voluntary relationships, the folk theory of probability, and the folk psychology of human needs. Through this embedding, cultural schemas or models provide continuity and coherence in a given culture’s systems of belief. Schema- and model-based analyses are intended to bridge psychological representations and cultural representations. They also provide a basis for relating MOTIVATION AND CULTURE. Not surprisingly, CONNECTIONISM, seen as a way to model the mind without attributing to it much internal structure, is now popular in this tradition of cognitive anthropology (Strauss and Quinn 1998).
Still, it is possible to acknowledge that culture has made the human condition profoundly different from that of any other animal species, and yet to question the image of the human mind as a general-purpose learning and problem-solving device. It is possible also to acknowledge the richness and diversity of human culture and yet to doubt that the role of human-evolved cognitive capacities has been merely to enable the development of culture and possibly shape the form of cultural representations, without exerting any influence on their contents. It is possible, in other words, to reconcile the social sciences’ awareness of the importance of culture with the cognitive sciences’ growing awareness of the biologically grounded complexity of the human mind.

For example, cognitive scientists have increasingly challenged the image of the human mind as essentially a general intelligence. Arguments and evidence from evolutionary theory, developmental psychology, linguistics, and one approach in cognitive anthropology render plausible a different picture. It is being argued that many human cognitive abilities are not domain-general but specialized to handle specific tasks or domains. This approach (described either under the rubric of MODULARITY or DOMAIN SPECIFICITY) seeks to investigate the nature and scope of these specific abilities, their evolutionary origin, their role in cognitive development, and their effect on culture.

The most important domain-specific abilities are evolved adaptations and are at work in every culture, though often with different effects. Some other domain-specific abilities are cases of socially developed, painstakingly acquired EXPERTISE, such as chess (see CHESS, PSYCHOLOGY OF), that is specific to some cultures.
The relationship between evolved adaptations and acquired expertise has not been much studied but is of great interest, in particular for the articulation of the cognitive and the cultural perspectives. For instance, writing, which is so important to cognitive and cultural development (see WRITING SYSTEMS and LITERACY), is a form of expertise, although it has become so common that we may not immediately think of it as such. It would be of the utmost interest to find out to what extent this expertise is grounded in specific psychomotor evolved adaptations.

The first domain-specific mechanisms to be acknowledged in the cognitive literature were input modules and submodules (see Fodor 1982). Typical examples are linked to a specific perceptual modality. They include devices that detect edges, surfaces, and whole objects in processing visual information; face recognition devices; speech parsing devices; and abilities to link specific outcomes (such as nausea and vomiting but not electric shock) to specific stimuli (such as eating but not light) through rapid, often single-trial, learning.

More recently, there has been a growing body of evidence suggesting that central (i.e., conceptual) mechanisms, as well as input-output processes, may be domain-specific. It has been argued, for instance, that the ability to interpret human action in terms of beliefs and desires is governed by a naive psychology, a domain-specific ability, often referred to as THEORY OF MIND; that the capacity to partition and explain living things in terms of biological principles like growth, inheritance, and bodily function is similarly governed by a FOLK BIOLOGY; and that the capacity to form consistent predictions about the integrity and movements of inert objects is governed by a NAIVE PHYSICS. These devices are described as providing the basis for competencies that children use to think about complex phenomena in a coherent manner using abstract causal principles.
Cultural competencies in these domains are seen as grounded in these genetically determined domain-specific dispositions, though they may involve some degree of CONCEPTUAL CHANGE.

The study of folk biology provides a good example of how different views of the mind yield different accounts of cultural knowledge. A great deal of work in classical cognitive anthropology has been devoted to the study of folk classification of plants and animals (Berlin, Breedlove, and Raven 1973; Berlin 1992; Ellen 1993). This work assumed that the difference in organization between these biological classifications and classifications of, say, artifacts or kinship relations had to do with differences in the objects classified and that otherwise the mind approached these domains in exactly the same way. Scott Atran’s (1990) cognitive anthropological work, drawing on developmental work such as that of Keil (1979), developed the view that folk-biological knowledge was based on a domain-specific approach to living things characterized by specific patterns of CATEGORIZATION and inference. This yields testable predictions regarding both the acquisition pattern and the cultural variability of folk biology. It predicts, for instance, that from the start (rather than through a lengthy learning process) children will classify animals and artifacts in quite different ways, will reason about them quite differently, and will do so in similar ways across cultures. Many of these predictions seem to be borne out (see Medin and Atran 1999).

Generally, each domain-specific competence represents a knowledge structure that identifies and interprets a class of phenomena assumed to share certain properties and hence be of a distinct and general type. Each such knowledge structure provides the basis for a stable response to a set of recurring and complex cognitive or practical challenges.
These responses involve largely unconscious dedicated perceptual, retrieval, and inferential processes. Evolutionary psychology interprets these domain-specific competencies as evolved adaptations to specific problems faced by our ancestral populations.

At first, there might seem to be a tension between the recognition of these evolved domain-specific competencies and the recognition of cultural variety. Genetically determined adaptations seem to imply a level of rigidity in cognitive performance that is contradicted by the extraordinary diversity of human achievements. In some domains, a relative degree of rigidity may exist. For example, the spontaneous expectations of not only infants but also adults about the unity, boundaries, and persistence of physical objects may be based on a rather rigid naïve physics. It is highly probable that these expectations vary little across populations, although at present hardly any research speaks to this possibility, which thus remains an open empirical question. After all, evidence does exist suggesting that other nonconscious perceptual processes, such as susceptibility to visual illusions, do vary across populations (Herskovits, Campbell, and Segall 1969).

Generally, however, it is a mistake to equate domain-specificity and rigidity. A genetically determined cognitive disposition may express itself in different ways (or not express itself at all) depending on the environmental conditions. For instance, even in a case such as fear of snakes and other predators, where a convincing argument can be made for the existence, in many species, of evolved mechanisms that trigger an appropriate self-protection response, the danger cues and the fear are not necessarily directly linked. Marks and Nesse (1994: 255), following Mineka et al. (1984), describe such a case in which fear does not emerge instinctively but only after a specific sort of learning experience: “Rhesus monkeys are born without snake fear.
Enduring fear develops after a few observations of another rhesus monkey taking fright at a snake . . . Likewise, a fawn is not born with fear of a wolf, but lifelong panic is conditioned by seeing its mother flee just once from a wolf.”

Thus, even low-level effects like primordial fears develop out of interactions between prepotentials for discriminating certain environmental conditions, a preparedness for fast learning, and actual environmental inputs. In general, domain-specific competencies emerge only after the competence’s initial state comes into contact with a specific environment, and, in some cases, with displays of the competence by older conspecifics. As the environmental inputs vary, so does the outcome (within certain limits, of course). This is obviously the case with higher-level conceptual dispositions: It goes without saying, for instance, that even if there is a domain-specific disposition to classify animals in the same way, local faunas differ, and so does people’s involvement with this fauna.

There is another and deeper reason why domain-specific abilities are not just compatible with cultural diversity, but may even contribute to explaining it (see Sperber 1996: chap. 6). A domain-specific competence processes information that meets specific input conditions. Normally, these input conditions are satisfied by information belonging to the proper domain of the competence. For instance, the face recognition mechanism accepts as inputs visual patterns that in a natural environment are almost exclusively produced by actual faces. Humans, however, are not just receivers of information; they are also massive producers of information that they use (or seek to use) to influence one another in many ways, and for many different purposes. A reliable way to get the attention of others is to produce information that meets the input conditions of their domain-specific competencies.
For instance, in a human cultural environment, the face recognition mechanism is stimulated not just by natural faces, but also by pictures of faces, by masks, and by actual faces with their features highlighted or hidden by means of make-up. The effectiveness of these typically cultural artifacts is in part to be explained by the fact that they rely on and exploit a natural disposition.

Although the natural inputs of a natural cognitive disposition may not vary greatly across environments, different cultures may produce widely different artificial inputs that, nevertheless, meet the input conditions of the same natural competence. Hence not all societies have cosmetic make-up, pictures of faces, or masks, and those that do exhibit a remarkable level of diversity in these artifacts. But to explain the very existence of these artifacts and the range of their variability, it is important to understand that they all rely on the same natural mechanism. In the same way, the postulation of a domain-specific competence suggests the existence of a diversified range of possible exploitations of this competence. Of course these exploitations can also be enhancements: portraitists and make-up technicians contribute to culturally differentiated and enhanced capacities for face recognition (and aesthetic appraisal).

Let us give three more illustrations of the relationship between a domain-specific competence and a cultural domain: color classification, mathematics, and social classifications.

Different languages deploy different systems of COLOR CATEGORIZATION, segmenting the color spectrum in dramatically different ways. Some languages have only two basic color terms (e.g., Dani). Other languages (e.g., English) have a rich and varied color vocabulary with eleven basic color terms (and many nonbasic color terms that denote subcategories such as crimson or apply to specific objects such as a bay horse).
Prior to Berlin and Kay’s (1969) now classic study, these color naming differences were accepted as evidence for the LINGUISTIC RELATIVITY HYPOTHESIS, the doctrine that different modes of linguistic representation reflect different modes of thought. Thus, speakers of languages with two-term color vocabularies were seen as conceptualizing the world in this limited fashion.

Berlin and Kay found that although the boundaries of color terms vary across languages, the focal point of each color category (e.g., that point in the array of reds that is the reddest of red) remains the same no matter how the color spectrum is segmented linguistically. There are, they argued, eleven such focal points, and therefore eleven possible basic color terms. Although there are over two thousand possible subsets of these eleven terms (2^11 = 2,048), only twenty-two of these subsets are ever encountered. Moreover, the sequence in which color terms enter a language is tightly constrained. Further research has led to minor revisions but ample confirmation of these findings. Here, then, we have a case where the evolved ability to discriminate colors both grounds culturally specific basic color vocabularies and constrains their variability. Further work by Kay and Kempton (1988) showed that linguistic classification could have some marginal effect on nonverbal classification of color. Nevertheless, the case of color classification, once the paradigm example of linguistic relativity, is now the paradigm illustration of the interplay between cognitive universals and cultural variations, variations that are genuine, but much less dramatic than was once thought.

Naive mathematics provides another instance of the relationship between a domain-specific competence and cultural variation. It has been shown that human infants and some other animals can distinguish collections of objects according to the (small) number of elements in the collection.
They also expect changes in the number of objects to occur in accordance with elementary arithmetic principles. All cultures of the world provide some system for counting (verbal and/or gestural), and people in all cultures are capable of performing some rudimentary addition or subtraction, even without the benefit of schooling. This suggests that humans are endowed with an evolved adaptation that can be called naive mathematics. Counting systems do vary from culture to culture. Some, like that of the Oksapmin of New Guinea, are extremely rudimentary, without base structure, and allow counting only up to some small number. Others are more sophisticated and allow, through combination of a few morphemes, the expression of any positive integer. These counting systems, drawing on the morpho-syntactic resources of language, provide powerful cultural tools for the use and enhancement of the naive mathematical ability. Cultural differences in counting largely reflect the degree of linguistic enhancement of this universal ability.

There are mathematical activities that go beyond this intuitive counting ability. Their development varies considerably and in different directions across cultures. Concepts such as the zero, negative numbers, rational numbers, and variables; techniques such as written arithmetical operations; and artifacts such as multiplication tables, abacuses, rulers, or calculators help develop mathematics far beyond its intuitive basis. Some of these concepts and tools are relatively easy to learn and use; others require painstaking study in an educational setting. From a cognitive point of view, explaining these cultural developments and differences must include, among other things, an account of the cognitive resources they mobilize.
For instance, given human cognitive dispositions, mathematical ideas and skills that are more intuitive, more easily grasped, and readily accepted should have a wider and more stable distribution and a stronger impact on most people’s thinking and practice (see NUMERACY AND CULTURE).

NAIVE SOCIOLOGY provides a third example of the relationship between a domain-specific cognitive disposition and a varying cultural domain. According to the standard view, children learn and think about all human groupings in much the same way: they overwhelmingly attend to surface differences in forming categories and they interpret these categories virtually only in terms of these superficial features. Of course, knowledge of all social categories is not acquired at the same time. Children sort people by gender before they sort them by political party affiliation. The standard explanation is that children learn to pick out social groups that are visibly distinct and culturally salient earlier than they learn about other, less visually marked, groups.

Recent research suggests that surface differences determine neither the development of categories nor their interpretation (Hirschfeld 1996). In North America and Europe one of the earliest-emerging social concepts is “race.” Surprisingly, given the adult belief that the physical correlates of “race” are extremely attention demanding, the child’s initial concept of “race” contains little perceptual information. Three-year-olds, for instance, recognize that “blacks” represent an important social grouping long before they learn which physical features are associated with being “black.” What little visual information they have is often inaccurate and idiosyncratic; thus, when one young child was asked to describe what made a particular person black, he responded that his teeth were longer (Ramsey 1987).
Another set of studies suggests that even quite young children possess a deep and theory-like understanding of “race” (but not other similar groupings), expecting “race” to be a fundamental, inherited, and immutable aspect of an individual; that is, they expect it to be biological (Hirschfeld 1995).

Conceptual development of this sort, in which specific concepts are acquired in a singular fashion and contain information far beyond what experience affords, is plausibly the output of a domain-specific disposition. Since the disappearance of the Neanderthals, humans are no longer divided into subspecies or races, and the very idea of “race” appeared only relatively recently in human history. So, although there may well exist an evolved domain-specific disposition that guides learning about social groupings, it is very unlikely that it would have evolved with the function of guiding learning about “race.” As noted previously, however, many cultural artifacts meet a device’s input conditions despite the fact that they did not figure in the evolutionary environment that gave rise to the device. “Race” might well be a case in point. As many have argued, “race” was initially a cultural creation linked to colonial and other overseas encounters with peoples whose physical appearance was markedly different from that of Europeans. The modern concept of “race” has lost some of this historic specificity and is generally (mis)interpreted as a “natural” system for partitioning humans into distinct kinds. That this modern concept has stabilized and been sustained over time owes as much to cognitive as to cultural factors (Hirschfeld 1996). On the one hand, it is sustainable because a domain-specific disposition guides children to spontaneously adopt specific social representations, and “race” satisfies the input conditions of this disposition. On the other hand, it varies across cultures because each cultural environment guides children to a specific range of possible groupings.
These possibilities, in turn, reflect the specific historical contexts in which colonial and other overseas encounters occurred. It is worth bearing in mind that “race” is not the only cultural domain that is “naturalized” because it resonates with an evolved disposition. It is plausible that children in South Asia, guided by the same domain-specific disposition but in another cultural context, find “caste” more biological than “race.” Similarly, children in some East-African societies may find “age-grades” more biological than either “race” or “caste.” In all such cases, the fact that certain social categories are more readily learned contributes to the social and cultural stability of these categories.

The cases of color classification, mathematics, and naïve sociology illustrate a fairly direct relationship between a domain-specific ability and a cultural domain grounded in this ability, enhancing it, and possibly biasing it. Not all cultural domains correspond in this simple way to a single underlying domain-specific competence. For instance, are RELIGIOUS IDEAS AND PRACTICES grounded in a distinct competence, the domain of which would be supernatural phenomena? This is difficult to accept from the point of view of a naturalistic cognitive science. Supernatural phenomena cannot be assumed to have been part of the environment in which human psychological adaptations evolved. Of course, it is conceivable that a disposition to form false or unevidenced beliefs of a certain tenor would be adaptive and might have evolved. Thus Malinowski and many other anthropologists have argued that religious beliefs serve a social function. Nemeroff and Rozin (1994) have argued that much of MAGIC AND SUPERSTITION is based on intuitive ideas of contagion that have clear adaptive value. Another possibility is that domain-specific competencies are extended beyond their domain, in virtue of similarity relationships.
Thus, Carey (1985) and Inagaki and Hatano (1987) have argued that ANIMISM results from an overextension of naïve psychology. The cultural prevalence of religious and magical beliefs may also be accounted for in terms of a domain-specific cognitive architecture without assuming that there is a domain-specific disposition to religious or magical beliefs (see Sperber 1975, 1996; Boyer 1990, 1994). Religious beliefs typically have a strong relationship with the principles of naïve physics, biology, psychology, and sociology. This relationship, however, is one of head-on contradiction. These are beliefs about creatures capable of being simultaneously in several places, of belonging to several species or of changing from one species to another, or of reading minds and seeing scenes distant in time or space. Apart from these striking departures from intuitive knowledge, however, the appearance and behavior of these supernatural beings are what intuition would expect of natural beings. Religious representations, as argued by Boyer (1994), are sustainable to the extent that a balance between counterintuitive and intuitive qualities is reached. A supernatural being with too few unexpected qualities is not attention demanding and thus not memorable. One with too many unexpected qualities is too information rich to be memorable (see MEMORY). Thus, religious beliefs can be seen as parasitical on domain-specific competencies that they both exploit and challenge. So far in this section, we have illustrated how evolutionary and cognitive perspectives can contribute to our understanding of specific cultural phenomena. They can also contribute to our understanding of the very phenomenon of culture. Until recently, the evolutionary and the cognitive approaches to the characterization of culture were very different and unrelated. In more recent developments, they have converged to a significant degree.
From an evolutionary point of view, there are two processes to consider and articulate: the biological evolution of the human species, and the CULTURAL EVOLUTION of human groups. There is unquestionably a certain degree of coevolution between genes and culture (see Boyd and Richerson 1985; Durham 1991). But, given the very different rates of biological and cultural evolution (the latter being much more rapid than the former), the importance of cultural evolution to biological evolution, or equivalently its autonomy, is hard to assess. Sociobiologists (e.g., Lumsden and Wilson 1981) tend to see cultural evolution as being very closely controlled by biological evolution and cultural traits as being selected in virtue of their biological functionality. Other biologists such as Cavalli-Sforza and Feldman (1981) and Richard Dawkins (1976, 1982) have argued that cultural evolution is a truly autonomous evolutionary process where a form of Darwinian selection operates on cultural traits, favoring the traits that are more capable of generating replicas of themselves (whether or not they contribute to the reproductive success of their carriers). Neither of these evolutionary approaches gives much place to cognitive mechanisms, the existence of which is treated as a background condition for the more or less autonomous selection of cultural traits. Both evolutionary approaches view culture as a pool of traits (mental representations, practices, or artifacts) present in a population. From a cognitive point of view, it is tempting to think of culture as an ensemble of representations (classifications, schemas, models, competencies), the possession of which makes an individual a member of a cultural group. In early cognitive anthropology, culture was often compared to a language, with a copy of it in the mind of every culturally competent member of the group.
Since then, it has been generally recognized that cultures are much less integrated than languages and tolerate a much greater degree of interindividual variation (see CULTURAL CONSENSUS THEORY and CULTURAL VARIATION). Moreover, with the recent insistence on the role of artifacts in cognitive processes (see COGNITIVE ARTIFACTS), it has become common to acknowledge the cultural character of these artifacts: culture is not just in the mind. Still, in a standard cognitive anthropological perspective, culture is first and foremost something in the mind of every individual. The fact that culture is a population-scale phenomenon is of course acknowledged, but plays only a trivial role in explanation. Some recent work integrates the evolutionary and cognitive perspectives. Sperber (1985, 1996) has argued for an “epidemiological” approach to culture. According to this approach, cultural facts are not mental facts but distributions of causally linked mental and public facts in a human population. More specifically, chains of interaction, of communication in particular, may distribute similar mental representations and similar public productions (such as behaviors and artifacts) throughout a population. Types of mental representations and public productions that are stabilized through such causal chains are, in fact, what we recognize as cultural. To help explain why some items stabilize and become cultural (when the vast majority of mental representations and public productions have no recognizable descendants), it is suggested that domain-specific evolved dispositions act as receptors and tend to fix specific kinds of contents. Many cultural representations stabilize because they resonate with domain-specific principles. Because such representations tend to be rapidly and solidly acquired, they are relatively inured to disruptions in the process of their transmission.
Hence the epidemiological approach to culture dovetails with evolutionary psychology (see Tooby and Cosmides 1992) and with much recent work in developmental psychology, which has highlighted the role of innate preparedness and domain-specificity in learning (Hirschfeld and Gelman 1994; Sperber, Premack, and Premack 1995). Children are not just the passive receptors of cultural forms. Given their cognitive dispositions, they spontaneously adopt certain cultural representations and accept others only through institutional support such as that provided by schools. The greater the dependence on institutional support, the greater the cultural lability and variability. Other inputs, children reject or transform. A compelling example is provided by the case of CREOLES. When colonial, commercial, and other forces bring populations together in linguistically unfamiliar contexts, a common result is the emergence of a pidgin, a cobbled language of which no individual is a native speaker. Sometimes, children are raised in a pidgin. When pidgin utterances are the input of the language acquisition process, a creole, that is, a natural and fully elaborated language, is the output. Children literally transform the contingent and incomplete cultural form into a noncontingent and fully articulated form. This happens because children are equipped with an evolved device for acquiring language (Bickerton 1990). Cultural forms stabilize because they are attention-grabbing, memorable, and sustainable with respect to relevant domain-specific devices. Of course, representations are also selected for in virtue of being present in any particular cultural environment. Domain-specific devices cannot attend to, act on, or elaborate representations that the organism does not come into contact with. For the development of culture, a cultural environment, a product of human history, is as necessary as cognitive equipment, a product of biological evolution.
See also ANIMISM; BOAS, FRANZ; CATEGORIZATION; CHESS, PSYCHOLOGY OF; COGNITIVE ANTHROPOLOGY; COGNITIVE ARCHAEOLOGY; COGNITIVE ARTIFACTS; COLOR CATEGORIZATION; CONCEPTUAL CHANGE; CONNECTIONISM, PHILOSOPHICAL ISSUES OF; CREOLES; CULTURAL CONSENSUS THEORY; CULTURAL EVOLUTION; CULTURAL SYMBOLISM; CULTURAL VARIATION; DOMAIN SPECIFICITY; EXPERTISE; FIGURATIVE LANGUAGE; FOLK BIOLOGY; HUMAN UNIVERSALS; LÉVI-STRAUSS, CLAUDE; LINGUISTIC RELATIVITY HYPOTHESIS; LITERACY; MAGIC AND SUPERSTITION; MALINOWSKI, BRONISLAW; MEMORY; METAPHOR; METAPHOR AND CULTURE; MODULARITY OF MIND; MOTIVATION AND CULTURE; NAIVE MATHEMATICS; NAIVE PHYSICS; NAIVE SOCIOLOGY; NUMERACY AND CULTURE; RELIGIOUS IDEAS AND PRACTICES; SCHEMATA; SEMIOTICS AND COGNITION; TECHNOLOGY AND HUMAN EVOLUTION; THEORY OF MIND; WRITING SYSTEMS

3 Cognition in an Ecological, Social, and Cultural Perspective

Ordinary cognitive activity does not take place in a fixed experimental setting where the information available is strictly limited and controlled, but in a complex, information-rich, ever-changing environment. In social species, conspecifics occupy a salient place in this environment, and much of the individual-environment interaction is, in fact, interaction with other individuals. In the human case, moreover, the environment is densely furnished with cultural objects and events most of which have, at least in part, the function of producing cognitive effects. In most experimental psychology this ecological, social, and cultural dimension of human cognition is bracketed out. This practice has drawn strong criticisms, both from differently oriented psychologists and from social scientists. Clearly, there are good grounds for these criticisms. How damning they are remains contentious. After all, all research programs, even the most holistic ones, cannot but idealize their objects by abstracting away from many dimensions of reality.
In each case, the issue is whether the idealization highlights a genuinely autonomous level about which interesting generalizations can be discovered, or whether it merely creates an artificial pseudodomain the study of which does not effectively contribute to the knowledge of the real world. Be that as it may, in the debate between standard and more ecologically oriented approaches to cognition, there is no doubt that the latter have raised essential questions and developed a variety of interesting answers. It is to these positive contributions that we now turn. Issues of ecological validity arise not just when the social and cultural dimension of cognition is deployed, but at all levels of cognition. As argued by ECOLOGICAL PSYCHOLOGY, even the perceptions of an individual organism should be understood in ecological terms. Based on the work of J. J. GIBSON, ecological psychology relates perception not to “stimuli” but to the layout of the environment, to the possibilities it opens for action (the AFFORDANCES), and to the perceiver’s own situation and motion in the environment. When the environment considered is social and cultural, there are further grounds to rethink even more basic tenets of cognitive science, particularly the notion that the individual mind is the site of cognitive processes. This is what recent work on SITUATED COGNITION AND LEARNING and on SITUATEDNESS/EMBEDDEDNESS has been doing. Many of the issues described today in terms of situated cognition were raised in the pioneering work of the Russian psychologist LEV VYGOTSKY (1896–1934), whose work was introduced to English readers in the 1970s (see Wertsch 1985b). Vygotsky saw cognitive activity as being social as well as mental. He stressed the importance of cultural tools for cognition. His insight that historical, cultural, and institutional contexts condition learning by identifying and extending the child’s capacities animates several ecological approaches in psychology.
Writing in the first half of the twentieth century, Vygotsky was not aiming at an explicit modeling of the processes he discussed, nor were the first studies inspired by his work in the 1970s and 1980s (see Wertsch 1985a). Some of the more recent work about situated cognition, though inspired by Vygotsky, does involve modeling of cognitive processes, which means, of course, departing from Vygotsky’s original conceptual framework. To what extent is cognition in a social and cultural environment still an individual process? Regarding cognition in a social environment, James Wertsch raises the issue with a telling anecdote about helping his daughter remember where she left her shoes. When she was unable to remember, he began to pose questions that directed her recall until she “remembered” where they were. Wertsch asks who remembered in this case: he didn’t, since he had no prior information about the shoes’ location, nor did his daughter, because she was unable to recall their location without his intervention. Regarding cognition in an environment containing cultural artifacts, a striking example is provided by Edwin Hutchins (1995), who has demonstrated how the cognitive processes involved in flying a plane do not take place just in the pilot’s head but are distributed throughout the cockpit, in the members of the crew, the control panel, and the manuals. This interpenetration of processes internal and external to the individual can be studied in a technologically rich environment such as that provided in HUMAN-COMPUTER INTERACTION, and also in more mundane circumstances such as finding one’s way with the help of a map (see HUMAN NAVIGATION), or shopping at the supermarket where the arrangement of the shelves serves as a kind of shopping list (Lave et al. 1984).
This type of research is being applied in COGNITIVE ERGONOMICS, which helps design technologies, organizations, and learning environments in a way informed by cognitive science. The study of cultural tools and the form of cognitive activity they foster is of importance for the historical and anthropological study of culture. It is an old commonplace to contrast societies with and without writing systems. As Lévi-Strauss (1971) suggested, the very structure of oral narratives reflects an optimal form for memory unaided by external inscriptions. More recent work (e.g., Goody 1977, 1987; Rubin 1995; Bloch 1998) has attempted to elaborate and in part rethink this contrast by looking at the cognitive implications of orality and writing and of other systems for displaying information in the environment (see ARTIFACTS AND CIVILIZATION). EDUCATION too has been approached in a Vygotskian perspective, as a collaborative enterprise between teacher and learner using a specially designed environment with ad hoc props. Education is thus described at a level intermediary between individual cognitive development and cultural transmission, thus linking and perhaps locking together the psychological and the cultural level (Bruner 1996). From the point of view of the epidemiological approach to culture evoked in the preceding section, the situated cognition approach is quite congenial. The epidemiological approach insists on the fact that the causal chains of cultural distribution are complex cognitive and ecological processes that extend over time and across populations. This, however, dedramatizes the contrast between a more individualistic and a more situated description of cognitive processes (see INDIVIDUALISM). Consider a situated process such as a teacher-learner interaction, or the whole cockpit of a plane doing the piloting. These processes are not wholly autonomous.
The teacher is a link in a wider process of transmission using a battery of artifacts, and the learner is likely to become a link, possibly of another kind, in the same process. Their interaction cannot be fully explained by abstracting away from this wider context. Similarly, the cockpit is far from being fully autonomous. It is linked to air control on the ground, through it to other aircraft, but also, in time, to the engineering process that designed the plane, to the educational process that trained the pilot, and so on. Of course, both the teacher-learner interaction and the cockpit have enough autonomy to deserve being considered and studied on their own. But then so do the individual cognitive processes of the teacher, the learner, the pilot, and so on at a lower level, and the complex institutional networks in which all this takes place at a higher level. Cognitive cultural causal chains extend indefinitely in all directions. Various sections of these chains of different size and structure are worth studying on their own. The study of psychological processes in their social context is traditionally the province of social psychology (see Ross and Nisbett 1991; Gilbert, Fiske, and Lindzey 1998). The contribution of this rich discipline to the cognitive sciences can be read in two ways. On the one hand, it can be pointed out that, at a time when mainstream psychologists were behaviorists and not interested in contentful cognitive processes, social psychologists were studying beliefs, opinions, prejudices, influence, motivation, or attitudes (e.g., Allport 1954). On the other hand, it could be argued that the interest of social psychologists in these mental phenomena is generally quite different from that of cognitive scientists.
The goals of social psychologists have typically been to identify trends and their causal factors, rather than mechanisms and their parts, so that most of social psychology has never been “cognitive” in this strong sense. In the practice of standard cognitive psychology too, it is quite often the case that a trend, a tendency, a disposition is identified well before the underlying mechanisms are considered. Many of the phenomena identified by social psychologists could be further investigated in a more standardly cognitive way, and, more and more often, they are. For instance, according to Festinger’s (1957) theory of cognitive DISSONANCE, people are emotionally averse to cognitive inconsistencies and seek to reduce them. Festinger investigated various ways in which such dissonances arise (in decision making or in forced compliance, for instance), and how they can be dealt with. Recently, computational models of dissonance have been developed using artificial neural networks and relating dissonance to other psychological phenomena such as analogical reasoning. ATTRIBUTION THEORY, inspired by Heider (1958) and Kelley (1972), investigates causal judgments (see CAUSAL REASONING), and in particular interpretations of people’s behavior. Specific patterns have been identified, such as Ross’s (1977) “fundamental attribution error” (i.e., the tendency to overestimate personality traits and underestimate the situation in the causing of behavior). As in the case of dissonance, there has been a growing interest in modeling the inferential processes involved in these attributions (e.g., Cheng and Novick 1992). STEREOTYPING of social categories, another typical topic of social psychology, is also approached in a more cognitive way by focusing on information processing and knowledge structures.
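The idea of modeling the inferences behind causal attribution can be made concrete with a small worked illustration. What follows is only a sketch under stated assumptions, not a reconstruction of any published model: it assumes that a candidate cause is scored by a simple covariation contrast, namely the proportion of cases in which the behavior occurs when the factor is present minus the proportion when it is absent, in the spirit of the covariation approach associated with Kelley and with Cheng and Novick. The function name `contrast`, the factor labels, and the eight observations are all hypothetical.

```python
from fractions import Fraction
from itertools import product


def contrast(observations, factor):
    """Covariation contrast for one candidate cause (illustrative assumption):
    P(effect | factor present) - P(effect | factor absent)."""
    with_factor = [obs["effect"] for obs in observations if obs[factor]]
    without_factor = [obs["effect"] for obs in observations if not obs[factor]]
    if not with_factor or not without_factor:
        raise ValueError(f"{factor!r} does not vary in this set of observations")
    # Booleans sum as 0/1, so sum(...) counts the cases where the effect occurred.
    return (Fraction(sum(with_factor), len(with_factor))
            - Fraction(sum(without_factor), len(without_factor)))


# Hypothetical data: two observations for each combination of
# "this particular person" and "this particular situation".
# By construction the behavior (effect) occurs exactly when the situation is present.
observations = [
    {"person": person, "situation": situation, "effect": situation}
    for person, situation in product([True, False], repeat=2)
    for _ in range(2)
]

contrasts = {factor: contrast(observations, factor)
             for factor in ("person", "situation")}
print(contrasts)  # person contrast is 0, situation contrast is 1
```

In this made-up data set the behavior tracks the situation perfectly and the person not at all, so the situational contrast is 1 and the personal contrast is 0. An observer who nonetheless explained the behavior by a personality trait would be showing the pattern that Ross called the fundamental attribution error.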
The domain of social psychology where the influence of cognitive science is the most manifest is that of SOCIAL COGNITION (Fiske and Taylor 1991), that is, the cognition of social life, sometimes extended to cognition as shaped by social life. Social cognition so understood is the very subject matter of social psychology, or at least its central part (leaving out emotion), but the reference to cognition, rather than to psychology generally, signals the intent to join forces with mainstream cognitive psychology. With the development of the domain-specificity approach, however, social cognition so understood may be too broad an area. For instance, it does not distinguish between naïve psychology and naïve sociology, when the trend may be rather toward distinguishing even more fine-grained mechanisms. One issue that has always been central to social psychology and that has become important in cognitive science only later is rationality. Social judgment exhibits blatant cases of irrationality, and their study by social psychologists (see Nisbett and Ross 1980) has contributed to the development of the study of reasoning in general (see JUDGMENT HEURISTICS; CAUSAL REASONING; PROBABILISTIC REASONING; DEDUCTIVE REASONING). One area of social life where rationality plays a special role is economics. It is within economics that RATIONAL CHOICE THEORY was initially developed (see also RATIONAL DECISION MAKING). The actual behavior of economic agents, however, does not fully conform to the normative theory. Drawing in particular on the work of Kahneman and TVERSKY (see Kahneman, Slovic, and Tversky 1982), experimental and behavioral economists explore and try to model the actual behavior of economic agents (see ECONOMICS AND COGNITIVE SCIENCE). In principle, economics should provide a paradigmatic case of fruitful interaction between the social and the cognitive sciences.
The economic domain is quite specific, however, and it is an open question to what extent the cognitive approach to this area, based as it is on an abstract normative theory of rationality, can serve as a model in other areas (but see Becker 1976). From the points of view of evolutionary psychology and situated cognition, it is tempting to adopt an alternative approach by developing a notion of evolutionarily grounded BOUNDED RATIONALITY as a criterion for evaluating the manner in which human inferential mechanisms perform their functions. Such a criterion would involve not just considerations of epistemic reliability, but also of processing speed and cost. In this perspective, evolutionary psychologists have investigated how reasoning abilities may be adjusted to specific problems and domains, and how they may privilege information available in ordinary environments (see Cosmides and Tooby 1992; Gigerenzer and Goldstein 1996; Gigerenzer and Hoffrage 1995). We now turn to anthropological research on the role of culture in cognitive and more generally mental processes. It is hardly controversial that cultural factors enable, constrain, and channel the development of certain cognitive outcomes. Some cultural environments inhibit normal cognitive development (e.g., inequitable distributions of cultural resources underlie uneven performance on standardized tests). Other cultural environments promote the elaboration of complex knowledge structures such as modern science by providing the appropriate artifactual and institutional support. In fact, it takes little more than a trip abroad to appreciate that our abilities to make the best use of the natural and artifactual environment and to interpret the behaviors of others are culture-bound. The social sciences, and anthropology in particular, tend to approach the relationship between culture and mind in a much more radical way.
Quite commonly the claim made is not just that cultural factors affect mental activity; it is that the human mind is socially and culturally constituted. This could be understood as meaning just that human mental processes use at every moment and in every activity cultural tools, language to begin with, and also schemas, models, expertises, and values. This, surely, is correct, and makes human minds very complex and special. What is generally meant goes well beyond this triviality, however, and is part of an antinaturalistic approach common in the social sciences. On this view, there may be brains but there are no minds in nature, and, anyhow, there is no human nature. Minds are not natural systems informed and transformed by culture; they are made by culture, and differently so by different cultures. From this point of view, naturalistic psychology, at least when it deals with true mental functions, with thinking in particular, is a Western ethnocentric pseudoscience. Piaget’s study of the acculturation of Swiss children is mistaken for the study of a universal human cognitive development; the study of American college students reasoning on laboratory tasks is mistaken for that of human (ir)rationality, and so on. Such culturalism, in this extreme or in more hedged forms, goes together with a specific view of culture. We saw in the last section how cognitive anthropology puts culture essentially in the mind and how evolutionary and epidemiological approaches treat culture in terms of population-wide distributions of individual mental and artifactual phenomena. These are naturalistic views of culture, with little following in the social sciences. Much more characteristic are the influential views of the anthropologist Clifford Geertz. He writes: “The concept of culture I espouse is essentially a semiotic one.
Believing, with Max Weber, that man is an animal suspended in webs of significance he himself has spun, I take culture to be those webs, and the analysis of it to be therefore not an experimental science in search of law but an interpretive one in search of meaning” (Geertz 1973: 5). Attacking cognitive anthropology for placing culture in the mind, and drawing on Wittgenstein’s dismissal of the idea of a private meaning, Geertz (1973: 12) insists that “culture is public because meaning is.” This understanding of the notion of culture goes together with a strong individuation of individual cultures (comparable to the individuation of languages), each seen as a separate system of meanings. Cultures so understood are viewed as being not just different environments, but, literally, different worlds, differing from each other in arbitrary ways. This view, known as CULTURAL RELATIVISM, is, except in very watered-down versions, difficult to reconcile with any naturalistic approach to cognitive development. Given that the initial inputs to cognitive development are just myriad stimulations of nerve endings, the process of extracting from these inputs the objective regularities of a relatively stable world is already hard enough to explain. If, in fact, even the world in which cognitive development takes place is not given, if the child can draw neither from expectable environmental regularities nor from internal preparedness to deal with just these regularities, then the process is a pure mystery. It is a sign of the lack of concern for psychological issues that this mystery seems never to have worried defenders of cultural relativism. In one area, anthropological linguistics, cultural relativism has guided positive research programs that continue to this day.
The linguist and anthropologist Edward SAPIR and the linguist Whorf developed the thesis of linguistic relativity (the “Sapir-Whorf hypothesis”) according to which lexical and grammatical categories of language determine the way the world is perceived and conceptualized, and each language is at the root of a different worldview (see also LANGUAGE AND COMMUNICATION). On this view, human cognition can be understood only through analysis of the linguistic and cultural structures that support it. The classical example is Whorf’s treatment of the Hopi notion of time. Noting that the Hopi language “contains no words, grammatical forms, construction or expressions that refer directly to what we call ‘time,’ or to the past, or future, or to enduring or lasting,” he concluded that the Hopi have “no general notion or intuition of time as a smooth flowing continuum” (Whorf 1956: 57). Subsequent research (see Brown 1991 for a review) tended to show that this radical linguistic relativity is not supported by closer analysis. However, less radical versions of linguistic relativity can be sustained (Lucy 1992; Gumperz and Levinson 1996). Recent comparative work on LANGUAGE AND CULTURE has been carried out with the methods of cognitive psycholinguistics at the Max Planck Institute for Psycholinguistics in Nijmegen. It has, in particular, gathered impressive evidence of the fact that the manner in which different languages encode spatial coordinates strongly affects people’s conceptualization of spatial relations and movements (see Levinson 1996). The standard anthropological characterization of cultures as relatively bounded, homogeneous, and coherent entities has repeatedly been challenged (e.g., Leach 1954; Fried 1975). The idea of discrete tribes each with its own culture was a colonial administrator’s dream (a dream they forced on people) before being an anthropologist’s presupposition.
In fact, different flows of cultural information (linguistic, religious, technological) have different boundaries, or, quite often, do not even have proper boundaries, just zones of greater or lesser intensities. From an epidemiological point of view, of course, these ongoing cultural flows and the fuzziness of cultural boundaries are just what one should expect. From such a point of view, the notion of a culture should not have more of a theoretical status than that of a region in geography. Culture is best seen not as a thing, but as a property that representations, practices, and artifacts possess to the extent that they are caused by population-wide distribution processes. It is the standard notion of a culture as an integrated whole that has guided most anthropological research bearing, directly or indirectly, on psychological issues. Much early anthropology, notably in North America, focused on the social and cultural correlates of psychological phenomena. A major and influential program of research, pioneered by Margaret Mead and Ruth Benedict, and lasting well after World War II, examined the relationship between personality and culture. The “personality and culture” school adapted the language of psychopathology to describe and analyze cultural phenomena. Still, the thrust of this approach was an abiding skepticism about psychological claims. Relying on ethnographic data, scholars assessed and critiqued universalist claims about the mind. Both Mead and Malinowski drew considerable attention for their challenges to several of Freud’s generalizations about human nature, particularly claims about the development of sexuality.
Ultimately the appeal of the culture and personality school waned in part as national character studies began to resemble national stereotypes more than cultural analysis, but also in part because the approach increasingly identified the sociocultural level with the psychological level, a move that made most anthropologists uncomfortable. Much anthropological research, although deliberately apsychological, is nevertheless of genuine cognitive interest in that it investigates knowledge structures, from specific notions to ideological systems. For example, much work has been devoted to examining different notions of person across cultures. In contrast to work in psychology that tends to take the person as a fundamental and invariant concept (see, e.g., Miller and Johnson-Laird 1976), anthropologists challenge the assumption that a person implies a bounded and unique sense of individuality and self. Rather the person is a socially situated concept that can only be understood from the perspective of social and cultural relations (Mauss 1985; Geertz 1973). For instance, Lutz (1988) argues that the Ifaluk of Micronesia do not conceive of emotions as something occurring within an individual person, but as a relation between several individuals in which the emotion exists independent of (and outside) the psyche of any one person. The notion of persons as unique self-oriented entities, in its turn, has been analyzed as arising from the specific cultural and political-economic environments of North America and Europe (Bellah et al. 1985). Like all relativist ideas, these views are controversial. Notice, however, that, unlike the claim that the mind itself is a cultural product, the claim that the person, or the SELF, is socially and culturally constituted is compatible with a naturalistic cognitive science, and has been defended from a naturalistic point of view, for instance by Dennett (1991).
Standard anthropological evidence for the cultural character and variability of notions like “person” consists of cultural narratives and expressions of conventional wisdom. More recently, however, researchers in social psychology, CULTURAL PSYCHOLOGY, and ETHNOPSYCHOLOGY have used innovative experimental methods to support ethnographic findings (see Markus and Kitayama 1991; Shweder 1991). Shweder and colleagues have made important contributions (in both method and theory) toward integrating ethnographic and experimental approaches. Work on moral development, especially the way culture may fundamentally shape it, has been influential (Shweder, Mahapatra, and Miller 1990; see also Turiel 1983 for a carefully crafted and persuasive challenge to the antiuniversalist point of view).

See also AFFORDANCES; ARTIFACTS AND CIVILIZATION; ATTRIBUTION THEORY; BOUNDED RATIONALITY; CAUSAL REASONING; COGNITIVE ERGONOMICS; CULTURAL PSYCHOLOGY; CULTURAL RELATIVISM; DEDUCTIVE REASONING; DISSONANCE; ECOLOGICAL PSYCHOLOGY; ECONOMICS AND COGNITIVE SCIENCE; EDUCATION; ETHNOPSYCHOLOGY; GIBSON, J. J.; HUMAN NAVIGATION; HUMAN-COMPUTER INTERACTION; INDIVIDUALISM; JUDGMENT HEURISTICS; LANGUAGE AND COMMUNICATION; LANGUAGE AND CULTURE; PROBABILISTIC REASONING; RATIONAL CHOICE THEORY; RATIONAL DECISION MAKING; SAPIR, EDWARD; SELF; SITUATED COGNITION AND LEARNING; SITUATEDNESS/EMBEDDEDNESS; SOCIAL COGNITION IN ANIMALS; STEREOTYPING; TVERSKY; VYGOTSKY, LEV

Conclusion

The various strains of research rapidly reviewed in this last section (the Vygotskian, the social-psychological, and the anthropological) are extremely fragmented, diverse, and embattled. This should not obscure the fact that they all deal with important and difficult issues, and provide extremely valuable insights. It is encouraging to observe that, in all these approaches, there is a growing concern for explicit theorizing and sound experimental testing.
More generally, it seems obvious to us that the various perspectives we have considered in this chapter should be closely articulated, and we have attempted to highlight the works that particularly contribute to this articulation. We are still far from the day when the biological, the cognitive, and the social sciences will develop a common conceptual framework and a common agenda to deal with the major issues that they share.

References

Allport, G. (1954). The Nature of Prejudice. Reading, MA: Addison-Wesley.
Atran, S. (1990). Cognitive Foundations of Natural History. New York: Cambridge University Press.
Barkow, J., L. Cosmides, and J. Tooby. (1992). The Adapted Mind: Evolutionary Psychology and the Generation of Culture. New York: Oxford University Press.
Becker, G. (1976). The Economic Approach to Human Behavior. Chicago: University of Chicago Press.
Bellah, R., R. Madsen, W. Sullivan, A. Swidler, and S. Tipton. (1985). Habits of the Heart: Individualism and Commitment in American Life. Berkeley: University of California Press.
Berlin, B. (1992). Ethnobiological Classification: Principles of Categorization of Plants and Animals in Traditional Societies. Princeton: Princeton University Press.
Berlin, B., D. Breedlove, and P. Raven. (1973). General principles of classification and nomenclature in folk biology. American Anthropologist 75: 214–242.
Berlin, B., and P. Kay. (1969). Basic Color Terms: Their Universality and Growth. Berkeley: University of California Press.
Bickerton, D. (1990). Language and Species. Chicago: University of Chicago Press.
Bloch, M. (1998). How We Think They Think: Anthropological Approaches to Cognition, Memory, and Literacy. Boulder: Westview Press.
Boyd, R., and P. Richerson. (1985). Culture and the Evolutionary Process. Chicago: University of Chicago Press.
Boyer, P. (1990). Tradition as Truth and Communication. New York: Cambridge University Press.
Boyer, P. (1994).
The Naturalness of Religious Ideas: Outline of a Cognitive Theory of Religion. Los Angeles: University of California Press.
Brown, D. (1991). Human Universals. New York: McGraw-Hill.
Bruner, J. (1996). The Culture of Education. Cambridge, MA: Harvard University Press.
Buss, D. (1994). The Evolution of Desire: Strategies of Human Mating. New York: Basic Books.
Byrne, R., and A. Whiten. (1988). Machiavellian Intelligence: Social Expertise and the Evolution of Intellect in Monkeys, Apes, and Humans. New York: Oxford University Press.
Carey, S. (1985). Conceptual Change in Childhood. Cambridge, MA: MIT Press.
Cavalli-Sforza, L. L., and M. W. Feldman. (1981). Cultural Transmission and Evolution: A Quantitative Approach. Princeton: Princeton University Press.
Cheng, P. W., and L. R. Novick. (1992). Covariation in natural causal induction. Psychological Review 99: 595–602.
Cosmides, L., and J. Tooby. (1987). From evolution to behavior: Evolutionary psychology as the missing link. In J. Dupré, Ed., The Latest on the Best: Essays on Evolution and Optimality. Cambridge, MA: MIT Press.
Cosmides, L., and J. Tooby. (1992). Cognitive adaptations for social exchange. In J. Barkow, L. Cosmides, and J. Tooby, Eds., The Adapted Mind: Evolutionary Psychology and the Generation of Culture. New York: Oxford University Press.
D’Andrade, R. (1995). The Development of Cognitive Anthropology. New York: Cambridge University Press.
Dawkins, R. (1976). The Selfish Gene. New York: Oxford University Press.
Dawkins, R. (1982). The Extended Phenotype. San Francisco: W. H. Freeman.
Dennett, D. (1991). Consciousness Explained. Boston: Little, Brown.
Durham, W. (1991). Coevolution: Genes, Cultures, and Human Diversity. Stanford: Stanford University Press.
Ellen, R. (1993). The Cultural Relations of Classification. Cambridge: Cambridge University Press.
Festinger, L. (1957). A Theory of Cognitive Dissonance. Palo Alto: Stanford University Press.
Fiske, S., and S. Taylor. (1991).
Social Cognition. New York: McGraw-Hill.
Fodor, J. (1982). The Modularity of Mind. Cambridge, MA: MIT Press.
Fried, M. (1975). The Notion of Tribe. Menlo Park: Cummings.
Geertz, C. (1973). The Interpretation of Cultures. New York: Basic Books.
Gigerenzer, G., and D. G. Goldstein. (1996). Reasoning the fast and frugal way: Models of bounded rationality. Psychological Review 103: 650–669.
Gigerenzer, G., and U. Hoffrage. (1995). How to improve Bayesian reasoning without instruction: Frequency formats. Psychological Review 102: 684–704.
Gilbert, D. T., S. T. Fiske, and G. Lindzey, Eds. (1998). The Handbook of Social Psychology. 4th ed. New York: McGraw-Hill.
Goody, J. (1977). The Domestication of the Savage Mind. Cambridge: Cambridge University Press.
Goody, J. (1987). The Interface between the Written and the Oral. Cambridge: Cambridge University Press.
Gumperz, J., and S. Levinson. (1996). Rethinking Linguistic Relativity. New York: Cambridge University Press.
Hamilton, W. D. (1964). The genetical theory of social behaviour. Journal of Theoretical Biology 7: 1–52.
Heider, F. (1958). The Psychology of Interpersonal Relations. New York: Wiley.
Herskovits, M., D. Campbell, and M. Segall. (1969). A Cross-Cultural Study of Perception. Indianapolis: Bobbs-Merrill.
Hirschfeld, L. (1995). Do children have a theory of race? Cognition 54: 209–252.
Hirschfeld, L. (1996). Race in the Making: Cognition, Culture, and the Child’s Construction of Human Kinds. Cambridge, MA: MIT Press.
Hirschfeld, L., and S. Gelman. (1994). Mapping the Mind: Domain-specificity in Cognition and Culture. New York: Cambridge University Press.
Hutchins, E. (1995). How a cockpit remembers its speed. Cognitive Science 19: 265–288.
Inagaki, K., and G. Hatano. (1987). Young children’s spontaneous personification and analogy. Child Development 58: 1013–1020.
Kahneman, D., P. Slovic, and A. Tversky. (1982). Judgment under Uncertainty: Heuristics and Biases. Cambridge: Cambridge University Press.
Kay, P., and W. M. Kempton. (1988). What is the Sapir-Whorf Hypothesis? American Anthropologist 86: 65–79.
Keil, F. (1979). Semantic and Conceptual Development: An Ontological Perspective. Cambridge, MA: Harvard University Press.
Kelley, H. (1972). Attribution in social interaction. In E. Jones, D. Kanouse, H. Kelley, R. Nisbett, S. Valins, and B. Weiner, Eds., Attribution: Perceiving the Causes of Behavior. Morristown, NJ: General Learning Press.
Lakoff, G. (1987). Women, Fire, and Dangerous Things: What Categories Reveal about the Mind. Chicago: University of Chicago Press.
Lakoff, G., and M. Johnson. (1980). Metaphors We Live By. Chicago: University of Chicago Press.
Lakoff, G., and M. Turner. (1989). More Than Cool Reason: A Field Guide to Poetic Metaphor. Chicago: University of Chicago Press.
Lave, J., M. Murtaugh, and O. de la Rocha. (1984). The dialectic of arithmetic in grocery shopping. In B. Rogoff and J. Lave, Eds., Everyday Cognition: Its Development in Social Context. Cambridge, MA: Harvard University Press.
Leach, E. (1954). Political Systems of Highland Burma: A Study of Kachin Social Structure. Cambridge, MA: Harvard University Press.
Levinson, S. (1996). Language and space. Annual Review of Anthropology 25: 353–382. Palo Alto: Academic Press.
Lévi-Strauss, C. (1971). L’homme nu. Paris: Plon.
Levy-Bruhl, L. (1922). La Mentalité Primitive. Paris: Libraire Felix Alcan.
Lorenz, K. (1966). On Aggression. Translated by M. K. Wilson. New York: Harcourt, Brace and World.
Lucy, J. (1992). Language Diversity and Thought. New York: Cambridge University Press.
Lumsden, C., and E. Wilson. (1981). Genes, Minds, and Culture. Cambridge, MA: Harvard University Press.
Lutz, C. (1988). Unnatural Emotions: Everyday Sentiments on a Micronesian Atoll and Their Challenge to Western Theory. Chicago: University of Chicago Press.
Marks, I., and R. Nesse. (1994). Fear and fitness: An evolutionary analysis of anxiety disorders.
Ethology and Sociobiology 15: 247–261.
Markus, H., and S. Kitayama. (1991). Culture and self: Implications for cognition, emotion, and motivation. Psychological Review 98: 224–253.
Mauss, M. (1985). A category of the human mind: The notion of person; the notion of self. In M. Carrithers, S. Collins, and S. Lukes, Eds., The Category of Person. New York: Cambridge University Press.
Medin, D., and S. Atran. (1999). Folk Biology. Cambridge, MA: MIT Press.
Miller, G., and P. Johnson-Laird. (1976). Language and Perception. New York: Cambridge University Press.
Mineka, S., M. Davidson, M. Cook, and R. Keir. (1984). Observational conditioning of snake fear in rhesus monkeys. Journal of Abnormal Psychology 93: 355–372.
Nemeroff, C., and P. Rozin. (1994). The contagion concept in adult thinking in the United States: Transmission of germs and interpersonal influence. Ethos 22: 158–186.
Nisbett, R., and L. Ross. (1980). Human Inference: Strategies and Shortcomings of Social Judgment. Englewood Cliffs, NJ: Prentice-Hall.
Pinker, S. (1994). The Language Instinct. New York: William Morrow.
Premack, D., and G. Woodruff. (1978). Does the chimpanzee have a theory of mind? Behavioral and Brain Sciences 1: 516–526.
Quinn, N. (1987). Convergent evidence for a cultural model of American marriage. In D. Holland and N. Quinn, Eds., Cultural Models in Language and Thought. New York: Cambridge University Press.
Ramsey, P. (1987). Young children’s thinking about ethnic differences. In J. Phinney and M. Rotheram, Eds., Children’s Ethnic Socialization. Newbury Park, CA: Sage.
Ross, L. (1977). The intuitive psychologist and his shortcomings: Distortions in the attribution process. In L. Berkowitz, Ed., Advances in Experimental Social Psychology, vol. 10. New York: Academic Press.
Ross, L., and R. Nisbett. (1991). The Person and the Situation: Perspectives of Social Psychology. Philadelphia: Temple University Press.
Rubin, D. (1995). Memory in Oral Traditions.
New York: Oxford University Press.
Shweder, R. (1991). Thinking Through Cultures: Expeditions in Cultural Psychology. Cambridge, MA: Harvard University Press.
Shweder, R., A. Mahapatra, and J. Miller. (1990). Culture and moral development. In J. Kagan and S. Lamb, Eds., The Emergence of Morality in Young Children. Chicago: University of Chicago Press.
Sperber, D. (1975). Rethinking Symbolism. New York: Cambridge University Press.
Sperber, D. (1985). Anthropology and psychology: Towards an epidemiology of representations. Man 20: 73–89.
Sperber, D. (1996). Explaining Culture: A Naturalistic Approach. London: Blackwell.
Sperber, D., D. Premack, and A. Premack, Eds. (1995). Causal Cognition. Oxford: Oxford University Press.
Strauss, C., and N. Quinn. (1998). A Cognitive Theory of Cultural Meaning. New York: Cambridge University Press.
Symons, D. (1979). The Evolution of Human Sexuality. New York: Oxford University Press.
Tooby, J., and L. Cosmides. (1992). The psychological foundations of culture. In J. Barkow, L. Cosmides, and J. Tooby, Eds., The Adapted Mind: Evolutionary Psychology and the Generation of Culture. New York: Oxford University Press.
Trivers, R. L. (1971). The evolution of reciprocal altruism. Quarterly Review of Biology 46: 35–57.
Turiel, E. (1983). The Development of Social Knowledge: Morality and Convention. New York: Cambridge University Press.
Wertsch, J. (1985a). Culture, Communication, and Cognition: Vygotskian Perspectives. New York: Cambridge University Press.
Wertsch, J. (1985b). Vygotsky and the Social Formation of the Mind. Cambridge, MA: Harvard University Press.
Whorf, B. (1956). Language, Thought, and Reality: Selected Writings of Benjamin Lee Whorf. J. Carroll, Ed. New York: Free Press.
Wilson, E. O. (1975). Sociobiology: The New Synthesis. Cambridge, MA: Belknap Press of Harvard University Press.
Wilson, E. O. (1978). On Human Nature. Cambridge, MA: Harvard University Press.
Further Readings

Barrett, J., and F. Keil. (1996). Conceptualizing a nonnatural entity: Anthropomorphism in God concepts. Cognitive Psychology 31: 219–247.
Boas, F. (1911). The Mind of Primitive Man. New York: Macmillan.
Bock, P. (1994). Handbook of Psychological Anthropology. Westport, CT: Greenwood Press.
Bogdan, R. J. (1997). Interpreting Minds: The Evolution of a Practice. Cambridge, MA: MIT Press.
Boster, J. (1985). Requiem for the omniscient informant: There’s life in the old girl yet. In J. Dougherty, Ed., Directions in Cognitive Anthropology. Urbana: University of Illinois Press.
Boyer, P., Ed. (1993). Cognitive Aspects of Religious Symbolism. Cambridge: Cambridge University Press.
Bronfenbrenner, U. (1979). The Ecology of Human Development. Cambridge, MA: Harvard University Press.
Bruner, J. (1990). Acts of Meaning. Cambridge, MA: Harvard University Press.
Bullock, M. (1985). Animism in childhood thinking: A new look at an old question. Developmental Psychology 21: 217–225.
Carey, S., and E. Spelke. (1994). Domain-specific knowledge and conceptual change. In L. Hirschfeld and S. Gelman, Eds., Mapping the Mind: Domain-Specificity in Cognition and Culture. New York: Cambridge University Press.
Cole, M. (1996). Cultural Psychology: A Once and Future Discipline. Cambridge, MA: Harvard University Press.
D’Andrade, R. (1981). The cultural part of cognition. Cognitive Science 5: 179–195.
Dehaene, S. (1997). The Number Sense: How the Mind Creates Mathematics. New York: Oxford University Press.
Donald, M. (1991). Origins of the Modern Mind. Cambridge, MA: Harvard University Press.
Dougherty, J. (1985). Directions in Cognitive Anthropology. Urbana: University of Illinois Press.
Fiske, A. (1992). The Four Elementary Forms of Sociality: Framework for a Unified Theory of Social Relations. New York: Free Press.
Gallistel, C. (1990). The Organization of Learning. Cambridge, MA: MIT Press.
Gallistel, C., and R. Gelman. (1992).
Preverbal and verbal counting and computation. Special issues: Numerical cognition. Cognition 44: 43–74.
Gelman, R., E. Spelke, and E. Meck. (1983). What preschoolers know about animate and inanimate objects. In D. Rogers and J. Sloboda, Eds., The Acquisition of Symbolic Skills. New York: Plenum.
Goodenough, W. (1981). Culture, Language, and Society. Menlo Park: Benjamin/Cummings.
Gumperz, J., and S. Levinson. (1996). Rethinking Linguistic Relativity. New York: Cambridge University Press.
Hamilton, D. (1981). Cognitive Processes in Stereotypes and Stereotyping. Hillsdale, NJ: Erlbaum.
Holland, D., and N. Quinn. (1987). Cultural Models in Language and Thought. Cambridge: Cambridge University Press.
Hutchins, E. (1995). Cognition in the Wild. Cambridge, MA: MIT Press.
Ingold, T. (1986). Evolution and Social Life. New York: Cambridge University Press.
Jones, E., and R. Nisbett. (1972). The actor and the observer: Divergent perceptions of the causes of behavior. In E. Jones, D. Kanouse, H. Kelley, R. Nisbett, S. Valins, and B. Weiner, Eds., Attribution: Perceiving the Causes of Behavior. Morristown: General Learning Press.
Karmiloff-Smith, A. (1992). Beyond Modularity: A Developmental Perspective on Cognitive Science. Cambridge, MA: MIT Press.
Lave, J. (1988). Cognition in Practice: Mind, Mathematics, and Culture in Everyday Life. New York: Cambridge University Press.
Lawson, E. T., and R. McCauley. (1990). Rethinking Religion. Cambridge: Cambridge University Press.
Lévi-Strauss, C. (1966). The Savage Mind. Chicago: University of Chicago Press.
Lieberman, P. (1984). The Biology and Evolution of Language. Cambridge, MA: Harvard University Press.
Marler, P. (1991). The instinct to learn. In S. Carey and R. Gelman, Eds., The Epigenesis of Mind: Essays on Biology and Cognition. Hillsdale, NJ: Erlbaum.
McCloskey, M., A. Washburn, and L. Felch. (1983). Intuitive physics: The straight-down belief and its origin.
Journal of Experimental Psychology: Learning, Memory, and Cognition 9: 636–649.
Nisbett, R., and D. Cohen. (1995). The Culture of Honor: The Psychology of Violence in the South. Boulder: Westview Press.
Norman, D. (1987). The Psychology of Everyday Things. Reading: Addison-Wesley.
Olson, D. (1994). The World on Paper. New York: Cambridge University Press.
Pinker, S. (1994). The Language Instinct. New York: Penguin Books.
Premack, D., and A. Premack. (1983). The Mind of an Ape. New York: Norton.
Rogoff, B., and J. Lave. (1984). Everyday Cognition: Its Development in Social Contexts. Cambridge, MA: Harvard University Press.
Romney, A., S. Weller, and W. Batchelder. (1986). Culture as consensus: A theory of culture and accuracy. American Anthropologist 88: 313–338.
Rozin, P., and J. Schull. (1988). The adaptive-evolutionary point of view in experimental psychology. In R. Atkinson, R. Herrnstein, G. Lindzey, and R. Luce, Eds., Stevens’ Handbook of Experimental Psychology. New York: Wiley.
Sapir, E. (1949). The Selected Writings of Edward Sapir in Language, Culture and Personality. D. Mandelbaum, Ed. Berkeley: University of California Press.
Saxe, G. (1985). Developing forms of arithmetic operations among the Oksapmin of Papua New Guinea. Journal of Educational Psychology 77: 503–513.
Saxe, G. (1991). Culture and Cognitive Development: Studies in Mathematical Understanding. Hillsdale, NJ: Erlbaum.
Scribner, S., and M. Cole. (1981). The Psychology of Literacy. Cambridge, MA: Harvard University Press.
Shore, B. (1996). Culture in Mind: Cognition, Culture and the Problem of Meaning. New York: Oxford University Press.
Shweder, R., and R. LeVine. (1987). Culture Theory: Essays on Mind, Self, and Emotion. New York: Cambridge University Press.
Slobin, D. (1985). The Crosslinguistic Study of Language Acquisition, vols. 1 and 2. Hillsdale, NJ: Erlbaum.
Spears, R., P. Oakes, N. Ellemers, and S. Haslam. (1997). The Social Psychology of Stereotyping and Group Life.
Cambridge: Blackwell.
Spiro, M. (1986). Cultural relativism and the future of anthropology. Cultural Anthropology 1: 259–286.
Stigler, J., R. Shweder, and G. Herdt. (1989). Cultural Psychology: The Chicago Symposia on Culture and Development. New York: Cambridge University Press.
Suchman, L. (1987). Plans and Situated Action. New York: Cambridge University Press.
Tomasello, M., A. Kruger, and H. Ratner. (1993). Cultural learning. Behavioral and Brain Sciences 16: 495–552.
Tyler, S. (1969). Cognitive Anthropology. New York: Holt, Rinehart, and Winston.
Vygotsky, L. (1986). Thought and Language. Cambridge, MA: MIT Press.
Wagner, D. (1994). Literacy, Culture, and Development. New York: Cambridge University Press.

Aboutness

See INTENTIONALITY; NARROW CONTENT; REFERENCE, THEORIES OF

Acquisition, Formal Theories of

A formal theory of language acquisition (FTLA) can be defined as a mathematical investigation of the learnability properties of the class of human languages. Every FTLA can therefore be seen as an application of COMPUTATIONAL LEARNING THEORY to the problem of LANGUAGE ACQUISITION, one of the core problems of LEARNING (see also LEARNING SYSTEMS).

The need for FTLAs stems from one of the standard assumptions of linguistics: A successful theory must prove that the grammars proposed by linguists not only account for all linguistic data (descriptive adequacy) but also are the kind of objects that can be acquired on the kind of data and with the kind of cognitive resources that are typical of human language learning (explanatory adequacy).

In order to be properly stated, every FTLA requires four distinct components:

1. A formal characterization of the class L of languages to be learned (see FORMAL GRAMMARS)
2. A formal characterization of the criterion of success C
3. A formal characterization of the class of algorithms A that one is willing to consider as possible learners
4. An explicit characterization M of how linguistic information is presented to the learner.

Given this characterization, every FTLA consists of either a proof that there is at least one learner in A that successfully acquires every language in L when success is defined as in C and data are presented as prescribed by M (a positive result) or a proof that there is at least one language in L that no learner in A can successfully acquire according to C and M (a negative result).

Although the importance of a positive result (typically the presentation of a model shown as the proof of the existence of a learning algorithm with the desired properties) is obvious, it must not be overlooked that negative results can be just as useful. In fact, as explained earlier, such results can be used to eliminate whole classes of theories that are descriptively but not explanatorily adequate.

Most recent FTLAs assume that, in human language learning, M consists of unordered and simple positive evidence. This assumption rests on twenty years of research in developmental psycholinguistics (reviewed in Marcus 1993), pointing to the conclusion that children receive a largely grammatical set of simple sentences from their target language with very little or no reliable instruction on what sentences are ungrammatical.

The criterion of success C that has been most commonly adopted is identification in the limit (Gold 1967): a learner is successful if and only if, for every language in L, it eventually stabilizes on a grammar that is equivalent to that of all the other speakers of that language (i.e., it yields the same grammaticality judgments and assigns to sentences the same meanings). Identification in the limit, however, can be argued to be too strict and too liberal a criterion at the same time. The criterion is too strict because the evolution of languages over time (see LANGUAGE VARIATION AND CHANGE) would appear to be problematic, barring language contact, if each generation acquired exactly the language of the previous one, as required by the criterion of identification in the limit. The criterion is too weak because children appear to learn their target language(s) in a very short time, whereas identification in the limit considers successful any learner that eventually stabilizes on a correct grammar, however long this might take.

These considerations seem to recommend as a plausible alternative the PAC criterion (Probably Approximately Correct; Valiant 1984): a learner is successful if and only if, for every language in L, it is very likely (but not certain) to produce a grammar that is very close (but not necessarily equivalent) to the target grammar and do so not in the limit but in a very short time, measured as a function of how close it gets and how likely it is to do so (see COMPUTATIONAL LEARNING THEORY). As an element of an FTLA, however, the PAC criterion is not without problems of its own. For example, if the error of a conjecture with respect to a target language is measured as the probability of the environment in which the language is exhibited presenting a string that the conjecture misclassifies, then the assumption that children only receive positive evidence has as a consequence that their conjectures have error zero even if they overgeneralize. In this respect, PAC would appear to be too weak a criterion because, empirically, human learners do not appear to overgeneralize in this fashion.

The L and the A components have traditionally been the locus of the most important differences among alternative FTLAs. Common restrictions on A include memory limitations, smoothness (successive hypotheses must not be very different from one another), continuity (every hypothesis is a possible adult grammar), maturation (some possible adult grammars cannot be part of the child’s early hypotheses), and so on. A principled investigation of the effects of such restrictions on identification in the limit can be found in Jain et al. (forthcoming). At the time of writing, however, developmental psycholinguists have not reached the kind of consensus on A that was reached on M.

As for the L component, it must be noted that no existing FTLA is based on a formal definition of the class of human languages quite simply because such a definition is currently unavailable. Indeed, some have even argued against the scientific relevance of formally defining a language as a set of strings (Chomsky 1986). In practice, although ultimately an FTLA would have to explain the child’s ability to learn every aspect of a target language, most existing FTLAs have respected the division of labor that is traditional in linguistics, so that there now are formal theories of the acquisition of SYNTAX, acquisition of PHONOLOGY and acquisition of word meaning.

Within the domain of syntax, for example, several very broad results have been established with respect to classes of languages generated by formal grammars. Positive learnability results have been established for the class of languages generated by suitably restricted Transformational Grammars (Wexler and Culicover 1980), the class generated by rigid CATEGORIAL GRAMMARS (Kanazawa 1994), and the class generated by a recently introduced formalism based on Chomsky’s MINIMALISM (Stabler 1997).

It is an open question whether this division of labor can be recommended. Indeed, several nonformal theories have advocated one form or other of bootstrapping, the view that the acquisition of any one of these domains aids and must be aided by the acquisition of the other domains (see Pinker 1987; Gleitman 1990; Mazuka 1996 for semantic, syntactic, and prosodic bootstrapping respectively).

Many current FTLAs try to sidestep the problem of the unavailability of a formal characterization of L in two ways, either by explicitly modeling only the fragments of their intended domain (syntax, phonology, SEMANTICS) for which a formal grammar is available or by providing a meta-analysis of the learnability properties of every class of languages that can be generated, assuming various kinds of innate restrictions on the possible range of variation of human languages (as dictated, for example, by POVERTY OF THE STIMULUS ARGUMENTS; see also LINGUISTIC UNIVERSALS and INNATENESS OF LANGUAGE). Most such meta-analyses are based either on the Principles and Parameters Hypothesis (Chomsky 1981) or on OPTIMALITY THEORY (Prince and Smolensky 1993). For reviews of such analyses, see Bertolo (forthcoming) and Tesar and Smolensky (forthcoming), respectively. It is instructive to note that exactly the same kind of meta-analysis can be achieved also in connectionist models (see NEURAL NETWORKS and CONNECTIONIST APPROACHES TO LANGUAGE) when certain principled restrictions are imposed on their architecture (Kremer 1996).

-- Stefano Bertolo

References

Bertolo, S., Ed. (Forthcoming). Principles and Parameters and Learnability. Cambridge: Cambridge University Press.
Chomsky, N. (1981). Lectures on Government and Binding. Dordrecht: Foris.
Chomsky, N. (1986). Knowledge of Language: Its Nature, Origins and Use. New York: Praeger.
Gleitman, L. (1990). The structural sources of verb meaning. Language Acquisition 1(1): 3–55.
Gold, E. M. (1967). Language identification in the limit. Information and Control 10: 447–474.
Jain, S., D. Osherson, J. Royer, and A. Sharma. (Forthcoming). Systems That Learn. 2nd ed. Cambridge, MA: MIT Press.
Kanazawa, M. (1994). Learnable Classes of Categorial Grammars. Ph.D. diss., Stanford University.
Kremer, S. (1996). A Theory of Grammatical Induction in the Connectionist Paradigm. Ph.D. diss., University of Alberta.
Marcus, G. (1993). Negative evidence in language acquisition. Cognition 46(1): 53–85.
Mazuka, R. (1996). Can a grammatical parameter be set before the first word? Prosodic contributions to early setting of a grammatical parameter. In J. L. Morgan and K. Demuth, Eds., Signal to Syntax: Bootstrapping from Speech to Grammar in Early Acquisition. Hillsdale, NJ: Erlbaum, pp. 313–330.
Pinker, S. (1987). The bootstrapping problem in language acquisition. In B. MacWhinney, Ed., Mechanisms of Language Acquisition. Hillsdale, NJ: Erlbaum, pp. 399–441.
Prince, A., and P. Smolensky. (1993). Optimality Theory: Constraint Interaction in Generative Grammar. Technical Report, Center for Cognitive Science, Rutgers University, New Brunswick, NJ.
Stabler, E. (1997). Acquiring and Parsing Languages with Movement. Unpublished manuscript, University of California, Los Angeles.
Tesar, B., and P. Smolensky. (Forthcoming). The learnability of Optimality Theory: an algorithm and some complexity results. Linguistic Inquiry.
Valiant, L. G. (1984). A theory of the learnable. Communications of the ACM 27: 1134–1142.
Wexler, K., and P. W. Culicover. (1980). Formal Principles of Language Acquisition. Cambridge, MA: MIT Press.

Further Readings

Berwick, R. (1985). The Acquisition of Syntactic Knowledge. Cambridge, MA: MIT Press.
Bertolo, S. (1995). Maturation and learnability in parametric systems. Language Acquisition 4(4): 277–318.
Brent, M. R. (1996). Advances in the computational study of language acquisition. Cognition 61: 1–38.
Clark, R. (1992). The selection of syntactic knowledge. Language Acquisition 2(2): 83–149.
Clark, R., and I. Roberts. (1993). A computational model of language learnability and language change. Linguistic Inquiry 24(2): 299–345.
Clark, R. (1993). Finitude, boundedness and complexity. Learnability and the study of first language acquisition. In B. Lust, G. Hermon, and J. Kornfilt, Eds., Syntactic Theory and First Language Acquisition: Cross Linguistic Perspectives. Hillsdale, NJ: Erlbaum, pp. 473–489.
Clark, R. (1996a). Complexity and the Induction of Tree Adjoining Grammars. TechReport IRCS-96-14, University of Pennsylvania.
Clark, R. (1996b). Learning First Order Quantifier Denotations. An Essay in Semantic Learnability. TechReport IRCS-96-19, University of Pennsylvania.
Dresher, E., and J. Kaye. (1990). A computational learning model for metrical phonology. Cognition 34: 137–195.
Fodor, J. (Forthcoming). Unambiguous triggers. Linguistic Inquiry.
Frank, R., and S. Kapur. (1996). On the use of triggers in parameter setting. Linguistic Inquiry 27(4): 623–660.
Gibson, E., and K. Wexler. (1994). Triggers. Linguistic Inquiry 25(3): 407–454.
Niyogi, P., and R. Berwick. (1995). The Logical Problem of Language Change. A.I. Memo no. 1516, MIT.
Niyogi, P., and R. Berwick. (1996). A language learning model for finite parameter spaces. Cognition 61: 161–193.
Osherson, D., and S. Weinstein. (1984). Natural languages. Cognition 12(2): 1–23.
Osherson, D., M. Stob, and S. Weinstein. (1986). Systems that Learn. Cambridge, MA: MIT Press.
Osherson, D., D. de Jongh, E. Martin, and S. Weinstein. (1997). Formal learning theory. In J. van Benthem and A. ter Meulen, Eds., Handbook of Logic and Language. Amsterdam: North Holland, pp. 737–775.
Pinker, S. (1979). Formal models of language learning. Cognition 7: 217–283.
Pinker, S. (1984). Language Learnability and Language Development. Cambridge, MA: Harvard University Press.
Wacholder, N. (1995). Acquiring Syntactic Generalizations from Positive Evidence: An HPSG Model. Ph.D. diss., City University of New York.
Wexler, K., and R. Manzini. (1987). Parameters and learnability in binding theory. In T. Roeper and E. Williams, Eds., Parameter Setting. Dordrecht: Reidel.
Wu, A. (1994). The Spell-Out Parameters: A Minimalist Approach to Syntax. Ph.D. diss., University of California, Los Angeles.

Acquisition of Language

See INNATENESS OF LANGUAGE; LANGUAGE ACQUISITION

Acquisition of Phonology

See PHONOLOGY, ACQUISITION OF

Acquisition of Semantics

See SEMANTICS, ACQUISITION OF

Acquisition of Syntax

See SYNTAX, ACQUISITION OF

Action

See EPIPHENOMENALISM; MOTOR CONTROL; MOTOR LEARNING; WALKING AND RUNNING MACHINES

Adaptation and Adaptationism

In current usage a biological adaptation is a trait whose form can be explained by natural selection. The blink reflex, for example, exists because organisms with the reflex were fitter than organisms without this adaptation to protect the eyes. Biological adaptation must be distinguished from physiological adaptation. The fact that human beings can form calluses when their skin is subjected to friction is probably a biological adaptation, but the particular callus caused by my hedge-trimmers is not.

appendix is a vestigial trait, a trace of an earlier process of adaptation.

Some authors distinguish adaptations from exaptations (Gould and Vrba 1982). A trait is an exaptation if it is an adaptation for one purpose but is now used for a different purpose. It is unlikely that feathers evolved from scales because they helped the ancestors of birds to fly better. It is thought that they evolved as insulation and were only later exapted for flight. Other authors doubt the value of the exaptation concept (Griffiths 1992; Reeve and Sherman 1993).

The importance of the concept of adaptation in biology is that it explains many traits of the organisms we see around us. It explains not only how traits first arose but also why they persisted and why they are still here. If we want to understand why there are so many feathers in the world, their later use in flight is as relevant as their earlier use in thermoregulation.

The adaptation concept underwrites the continuing use of teleology in biology, something that distinguishes life sciences from the physical sciences (Allen, Bekoff, and Lauder 1997). Adaptations have biological purposes or functions: the tasks for which they are adaptations. Hemoglobin is meant to carry oxygen to the tissues. It is not meant to stain the carpet at murder scenes, although it does this just as reliably. Some authors use the term teleonomy to distinguish adaptive purposes from earlier concepts of natural purpose. The fact that the adaptation concept can create naturalistic distinctions between function and malfunction or normal and abnormal has made it of the first interest to cognitive science. Several authors have used the adaptation concept in analyses of INTENTIONALITY.

To identify an adaptation it is necessary to determine the selective forces responsible for the origins and/or maintenance of a trait. This requires understanding the relationship between organism and environment, something more onerous than is typically recognized (Brandon 1990). Some biologists think this task so onerous that we will frequently be unable to determine whether traits are adaptations and for what (Gould and Lewontin 1978; Reeve and Sherman 1993). Others argue that we can successfully engage in both reverse engineering (inferring the adaptive origins of observed traits) and adaptive thinking (inferring what adaptations will be produced in a particular environment) (Dennett 1995; Dawkins 1996). Many advocates of EVOLUTIONARY PSYCHOLOGY believe that adaptive thinking about the human mind has heuristic value for those who wish to know how the mind is structured (Cosmides, Tooby, and Barkow 1992).

Adaptationism is the name given by critics to what they see as the misuse of the adaptation concept. Steve Orzack and Elliot Sober (1994) distinguish three views about adap-
The location and form tation: first, that adaptation is ubiquitous, meaning that most of this particular callus cannot be explained by the differen- traits are subject to natural selection; second, that adaptation tial reproduction of heritable variants, that is, by natural is important: a “censored” model that deliberately left out selection. Adaptation is still used in its nonevolutionary the effects of natural selection would make seriously mis- sense in disciplines such as exercise physiology. An adap- taken predictions about evolution; and third, that adaptation tive trait is one that currently contributes to the fitness of an is optimal: a model censored of all evolutionary mecha- organism. The ability to read is highly adaptive, but is nisms except natural selection could predict evolution accu- unlikely to be an adaptation. Reading is probably a side rately. Most biologists accept that natural selection is effect of other, more ancient cognitive abilities. There are ubiquitous and important. In Orzack and Sober’s view the also adaptations that are no longer adaptive. The human 4 Affordances distinctive feature of adaptationism is the thesis that organ- Barkow, L. Cosmides, and J. Tooby, Eds., The Adapted Mind: Evolutionary Psychology and the Generation of Culture. isms are frequently optimal. They argue that the adaptation- Oxford: Oxford University Press, pp. 3–15. ist thesis should be empirically tested rather than assumed. Dawkins, R. (1996). Climbing Mount Improbable. London: Other authors, however, argue that adaptationism is not an Viking. empirical thesis, but a methodological one. Optimality con- Dennett, D. C. (1995). Darwin’s Dangerous Idea. New York: cepts provide a well defined goal which it is equally illumi- Simon and Schuster. nating to see organisms reach or to fall short of. Building Goodwin, B. C. (1994). How the Leopard Changed its Spots: The models that yield the observed phenotype as an optimum is Evolution of Complexity. 
New York: Charles Scribner and Sons. the best way to identify all sorts of factors acting in the evo- Gould, J. A., and E. S. Vrba. (1982). Exaptation—a missing term lutionary process (Maynard-Smith 1978). in science of form. Paleobiology 8: 4–15. There are several strands to antiadaptationism. One is Gould, S. J., and R. Lewontin. (1978). The Spandrels of San Marco and the Panglossian Paradigm: a critique of the adaptationist the claim that many adaptive explanations have been programme. Proceedings of the Royal Society of London 205: accepted on insufficient evidence. Adaptationists claim that 581–598. the complexity and functionality of traits is sufficient to Gray, R. D. (1987). Faith and foraging: a critique of the “paradigm establish both that they are adaptations and what they are argument from design.” In A. C. Kamil, J. R. Krebs and H. R. adaptations for (Williams 1966). Antiadaptationists argue Pulliam, Eds., Foraging Behavior. New York: Plenum Press, pp. that adaptive scenarios do not receive confirmation merely 69–140. from being qualitatively consistent with the observed trait. Griffiths, P. E. (1992). Adaptive explanation and the concept of a Some are also unsatisfied with quantitative fit between an vestige. In P. E. Griffiths, Ed., Essays on Philosophy of Biology. adaptive model and the observed trait when the variables Dordrecht: Kluwer, pp. 111–131. used to obtain this fit cannot be independently tested (Gray Griffiths, P. E. (1996). The historical turn in the study of adapta- tion. British Journal for the Philosophy of Science 47(4): 511– 1987). Many antiadaptationists stress the need to use quan- 532. titative comparative tests. Independently derived evolution- Harvey, P. H., and M. D. Pagel. (1991). The Comparative Method ary trees can be used to test whether the distribution of a in Evolutionary Biology. Oxford: Oxford University Press. trait in a group of species or populations is consistent with Lewontin, R. C. (1982). 
Organism and environment. In H. Plotkin, the adaptive hypothesis (Brooks and McLennan 1991; Har- Ed., Learning, Development, Culture. New York: Wiley, pp. vey and Pagel 1991). Other strands of antiadaptationism are 151–170. concerned with broader questions about what biology Lewontin, R. C. (1983). The organism as the subject and object of should be trying to explain. Biology might focus on evolution. Scientia 118: 65–82. explaining why selection is offered a certain range of alter- Maynard Smith, J. (1978). Optimisation theory in evolution. natives rather than explaining why a particular alternative is Annual Review of Ecology and Systematics 9: 31–56. Orzack, S. E., and E. Sober. (1994). Optimality models and the test chosen. This would require greater attention to develop- of adaptationism. American Naturalist 143: 361–380. mental biology (Smith 1992; Amundson 1994; Goodwin Reeve, H. K., and P. W. Sherman. (1993). Adaptation and the goals 1994). Another antiadaptationist theme is the importance of of evolutionary research. Quarterly Review of Biology 68 (1): history. The outcome of an episode of selection reflects the 1–32. resources the organism brings with it from the past, as well Schank, J. C., and W. C. Wimsatt. (1986). Generative entrench- as the “problem” posed by the environment (Schank and ment and evolution. Proceedings of the Philosophy of Science Wimsatt 1986; Griffiths 1996). Finally, antiadaptationists Association.Vol 2: 33–60. have questioned whether the environment contains adaptive Smith, K. C. (1992). Neo-rationalism versus neo-Darwinism: inte- problems that can be characterized independently of the grating development and evolution. Biology and Philosophy 7: organisms that confront them (Lewontin 1982; Lewontin 431–452. Williams, G. C. (1966). Adaptation and Natural Selection. Prince- 1983). ton: Princeton University Press. 
See also ALTRUISM; EVOLUTION; SEXUAL ATTRACTION, EVOLUTIONARY PSYCHOLOGY OF; SOCIOBIOLOGY Affordances —Paul Griffiths The term affordance was coined by JAMES JEROME GIBSON References to describe the reciprocal relationship between an animal and its environment, and it subsequently became the cen- Allen, C., M. Bekoff, and G. V. Lauder, Eds. (1997). Nature’s Pur- tral concept of his view of psychology, the ecological poses: Analyses of Function and Design in Biology. Cam- bridge, MA: MIT Press. approach (Gibson 1979; Reed 1996; see ECOLOGICAL PSY- Amundson, R. (1994). Two concepts of constraint: adaptationism CHOLOGY). An affordance is a resource or support that the and the challenge from developmental biology. Philosophy of environment offers an animal; the animal in turn must pos- Science 61(4): 556–578. sess the capabilities to perceive it and to use it. “The affor- Brandon, R. (1990). Adaptation and Environment. Princeton: Prin- dances of the environment are what it offers animals, what ceton University Press. it provides or furnishes, for good or ill” (Gibson 1977). Brooks, D. R., and D. A. McLennan. (1991). Phylogeny, Ecology Examples of affordances include surfaces that provide and Behavior. Chicago: University of Chicago Press. support, objects that can be manipulated, substances that Cosmides, L., J. Tooby, and J. H. Barkow. (1992). Introduction: can be eaten, climatic events that afford being frozen, like evolutionary psychology and conceptual integration. In J. H. Affordances 5 mal to contact a surface or a target object was described by a blizzard, or being warmed, like a fire, and other animals Lee (1980), who showed that such information for an that afford interactions of all kinds. The properties of these observer approaching a surface could be expressed as a con- affordances must be specified in stimulus information. stant, τ. 
The information is used in controlling locomotion Even if an animal possesses the appropriate attributes and during braking and imminent collision by humans (Lee equipment, it may need to learn to detect the information 1976) and by other animals (Lee and Reddish 1981; Lee, and to perfect the activities that make the affordance use- Reddish, and Rand 1991). Effective information for heading ful—or perilous if unheeded. An affordance, once (the direction in which one is going) has been described by detected, is meaningful and has value for the animal. It is Warren (1995) in terms of the global radial structure of the nevertheless objective, inasmuch as it refers to physical velocity field of a layout one is moving toward. properties of the animal’s niche (environmental con- Research on the action systems called into play and their straints) and to its bodily dimensions and capacities. An controllability in utilizing an affordance has been the sub- affordance thus exists, whether it is perceived or used or ject of study for reaching, standing upright, locomotion, not. It may be detected and used without explicit aware- steering, and so on. (Warren 1995). The control of reaching ness of doing so. and grasping by infants presented with objects of diverse Affordances vary for diverse animals, depending on the sizes shows accommodation of action to the object’s size animal’s evolutionary niche and on the stage of its develop- and shape by hand shaping, use of one or both arms, and so ment. Surfaces and substances that afford use or are danger- forth (see Bertenthal and Clifton 1997 for many details). ous for humans may be irrelevant for a flying or swimming Research of the specification of the affordance in stimulus species, and substances that afford eating by an adult of the information, and on control of action in realizing the affor- species may not be appropriate for a member in a larval dance, converges in demonstrating that behavior is prospec- stage. 
The reciprocal relationship between the environmen- tive (planned and intentional) and that stimulus information tal niche and a certain kind of animal has been dubbed the permits this anticipatory feature. “animal-environment fit.” 3. How do affordances develop cognitively and behav- Utilization of an affordance implies a second reciprocal iorally? Developmental studies of affordances, especially relationship between perception and action. Perception pro- during the first year, abound (Adolph, Eppler, and Gibson vides the information for action, and action generates conse- 1993). The behavior of crawling infants on a visual cliff quences that inform perception. This information may be (Gibson and Walk 1960) suggests that even infants perceive proprioceptive, letting the animal know how its body is per- the affordances of a surface of support and avoid traversal of forming; but information is also exteroceptive, reflecting the an apparent drop at an edge. Subsequent research has shown way the animal has changed the environmental context with that duration of crawling experience is significantly related respect to the affordance. Perceiving this relationship allows to dependable avoidance of the cliff, supporting other adaptive control of action and hence the possibility of con- research demonstrating that LEARNING plays a role in trolling environmental change. It is the functioning and description of the animal-envi- detecting and responding effectively to many affordances, at ronment encounter that is at the heart of research on affor- least in humans. Development of action systems and dances. Research has addressed three principal questions: increased postural control instigate the emergence of new 1. Do human adults actually perceive affordances in affordance-related behavior. Babies begin to pay ATTENTION terms of task constraints and bodily requirements? 
The real- to objects and make exploratory reaches toward them as ity of perceptual detection of an animal-environment fit has posture gradually enables reaching out and making contact been verified in experiments on adult humans passing with their surfaces (Eppler 1995). through an aperture, reaching for objects with their own As an infant learns about the constraints involved in the limbs or tools, judging appropriate stair heights for climb- use of some affordance, learning may at first be relatively ing, chair heights for sitting, and so forth. J. J. Gibson said domain specific. Research by Adolph (1997) on traversal of that “to perceive the world is to coperceive oneself.” In that sloping surfaces by crawling infants demonstrates that case, actors should perceive their own body dimensions and learning which slopes are traversable and strategies for suc- powers in relation to the requirements of the relevant envi- cessful traversal of them is not transferred automatically to ronmental resource or support. Warren and Whang (1987) traversal of the same slopes when the infant first begins investigated adults’ judgments of aperture widths relative to upright locomotion. New learning is required to control the their own body dimensions. Both wide- and narrow-shoul- action system for walking and to assess the safety of the dered adults rotated their shoulders when doorways were degree of slope. Learning about affordances is a kind of per- less than 1.3 times their own shoulder width. Scaling of the ceptual learning, entailing detection of both proprioceptive environment in terms of the natural yardstick of eye-height and exteroceptive information. The learning process (Mark 1987; Warren 1984) has also been demonstrated. involves exploratory activity, observation of consequences, 2. Can stimulus information specifying an affordance be and selection for an affordance fit and for economy of both described and measured? 
Can controlled actions of an ani- specifying information and action. mal preparing for and acting on an affordance be observed The concept of affordance is central to a view of psy- and measured? Gibson (1950) paved the way for this chology that is neither mentalism nor stimulus-response research by describing the optic flow field created by one’s BEHAVIORISM, focusing instead on how an animal interacts own locomotion when flying a plane. The specification by with its environment. Furthermore, the concept implies nei- optical stimulus information of the time for a moving ani- ther nativism nor empiricism. Rather, genetic constraints 6 Agency characteristic of any particular animal instigate exploratory Aging and Cognition activity that culminates in learning what its environment affords for it. Any discussion of the relations between aging and cognition See also COGNITIVE ARTIFACTS; DOMAIN-SPECIFICITY; must acknowledge a distinction between two types of cogni- HUMAN NAVIGATION; INFANT COGNITION; PERCEPTUAL tion that are sometimes referred to as fluid and crystallized DEVELOPMENT; SITUATEDNESS/EMBEDDEDNESS cognitive abilities (Cattell 1972) or INTELLIGENCE. Fluid abilities include various measures of reasoning (including —Eleanor J. Gibson, Karen Adolph, and Marion Eppler both CAUSAL REASONING and DEDUCTIVE REASONING), MEMORY, and spatial performance, and can be characterized References as reflecting the efficiency of processing at the time of Adolph, K. E. (1997). Learning in the Development of Infant Loco- assessment. In contrast, crystallized abilities are evaluated motion. Monographs of the Society for Research in Child with measures of word meanings, general information, and Development. other forms of knowledge, and tend to reflect the accumu- Adolph, K. E., M. A. Eppler, and E. J. Gibson. (1993). Develop- lated products of processing carried out in the past. ment of perception of affordances. In C. Rovee-Collier and L. 
The distinction between these two types of abilities is P. Lipsitt, Eds., Advances in Infancy Research. Norwood, NJ: important because the relations of age are quite different for Ablex Publishing Co., pp. 51–98. the two forms of cognition. That is, performance on crystal- Bertenthal, B. I., and R. Clifton. (1997). Perception and action. In lized measures tends to remain stable, or possibly even D. Kuhn and R. Siegler, Eds., Handbook of Child Psychology increase slightly, across most of the adult years, whereas Vol. 2. New York: Wiley. Eppler, M. A. (1995). Development of manipulatory skills and increased age is associated with decreases in many measures deployment of attention. Infant Behavior and Development 18: of fluid cognition. In large cross-sectional studies age-related 391–404. declines in fluid abilities are often noticeable as early as the Gibson, E. J., and R. D. Walk. (1960). The “visual cliff.” Scientific decade of the thirties, and the magnitude of the difference American 202: 64–71. across a range from twenty to seventy years of age is fre- Gibson, J. J. (1950). The Perception of the Visual World. Boston: quently one to two standard deviation units. Although the Houghton Mifflin. average trends can be quite large, it is also important to point Gibson, J. J. (1977). The theory of affordances. In R. Shaw and J. out that individual differences are substantial because chro- Bransford, Eds., Perceiving, Acting and Knowing. New York: nological age by itself seldom accounts for more than 20 to Wiley, pp. 67–82. 30 percent of the total variance in the scores. Gibson, J. J. (1979). The Ecological Approach to Visual Percep- tion. Boston: Houghton Mifflin. The vast majority of the research in the area of aging and Lee, D. N. (1976). A theory of visual control of braking based on cognition has focused on fluid abilities. There appear to be information about time-to-collision. Perception 5: 437–459. two primary reasons for this emphasis. 
First, many research- Lee, D. N. (1980). The optic flow field: the foundation of vision. ers probably believe that explanations are clearly needed to Philosophical Transactions of the Royal Society B290: 169– account for the differences that have been reported (as in 179. fluid abilities), but that a lack of a difference (as in crystal- Lee, D. N., and P. E. Reddish. (1981). Plummeting gannets: para- lized abilities) does not necessarily require an explanation. digm of ecological optics. Nature 293: 293–294. And second, because fluid abilities are assumed to reflect Lee, D. N., P. E. Reddish, and D. T. Rand. (1991). Aerial docking the individual’s current status, they are often considered to by hummingbirds. Naturwissenschafften 78: 526–527. be of greater clinical and practical significance than crystal- Mark, L. S. (1987). Eyeheight-scaled information about affor- dances: a study of sitting and stair climbing. Journal of Experi- lized abilities that are assumed to represent the highest level mental Psychology: Human Perception and Performance 13: the individual achieved at an earlier stage in his or her life. 683–703. Both distal and proximal interpretations of the age- Reed, E. (1996). Encountering the World. New York: Oxford Uni- related decline in fluid cognitive abilities have been pro- versity Press. posed. Distal interpretations focus on factors from earlier Warren, W. H., Jr. (1984). Perceiving affordances: visual guidance periods in the individual’s life that may have contributed to of stair climbing. Journal of Experimental Psychology: Human his or her level of performance at the current time. Exam- Perception and Performance 10: 683–703. ples are speculations that the age-related declines are attrib- Warren, W. H., Jr. (1995). Self-motion: visual perception and utable to historical changes in the quantity or quality of visual control. In W. Epstein and S. 
Rogers, Eds., Perception of education, or to various unspecified cultural characteristics Space and Motion. New York: Academic Press, pp. 263–325. Warren, W. H., Jr., and S. C. Whang. (1987). Visual guidance of that affect cognitive performance. In fact, comparisons of walking through apertures: body-scaled information for affor- the scores of soldiers in World War II with the norms from dances. Journal of Experimental Psychology: Human Percep- World War I (Tuddenham 1948), and a variety of time-lag tion and Performance 13: 371–383. comparisons reported by Flynn (1987), suggest that the average level of cognitive ability has been improving across successive generations. However, the factors responsible for Agency these improvements have not yet been identified (see Neisser 1997), and questions still remain about the implica- tions of the positive time-lag effects for the interpretation of See INTELLIGENT AGENT ARCHITECTURE; RATIONAL cross-sectional age differences in cognitive functioning (see AGENCY Aging, Memory, and the Brain 7 Salthouse 1991). Hypotheses based on differential patterns Flynn, J. R. (1987). Massive IQ gains in 14 nations: what IQ tests really measure. Psychological Bulletin 101: 171–191. of activity and the phenomenon of disuse can also be classi- Neisser, U. (1997). Rising scores on intelligence tests. American fied as distal because they postulate that an individual’s cur- Scientist 85: 440–447. rent level of performance is at least partially affected by the Salthouse, T. A. (1991). Theoretical Perspectives on Cognitive nature and amount of activities in which he or she has Aging. Mahwah, NJ: Erlbaum. engaged over a period of years. Although experientially Salthouse, T. A. (1996). Constraints on theories of cognitive aging. based interpretations are very popular among the general Psychonomic Bulletin and Review 3: 287–299. public (as exemplified in the cliché “Use it or lose it”) and Tuddenham, R. D. (1948). 
Soldier intelligence in World Wars I and among many researchers, there is still little convincing evi- II. American Psychologist 3: 54–56. dence for this interpretation. In particular, it has been sur- Further Readings prisingly difficult to find evidence of interactions of age and quality or quantity of experience on measures of fluid cog- Blanchard-Fields, F., and T. M. Hess. (1996). Perspectives on Cog- nitive abilities that would be consistent with the view that nitive Change in Adulthood and Aging. New York: McGraw- age-related declines are minimized or eliminated among Hill. individuals with extensive amounts of relevant experience. Craik, F. I. M., and T. A. Salthouse. (1992). Handbook of Aging Proximal interpretations of age-related differences in and Cognition. Mahwah, NJ: Erlbaum. cognitive functioning emphasize characteristics of process- ing at the time of assessment that are associated with the Aging, Memory, and the Brain observed levels of cognitive performance. Among the proxi- mal factors that have been investigated in recent years are Memory is not a unitary function but instead encompasses differences in the choice or effectiveness of particular strate- a variety of dissociable processes mediated by distinct gies, differences in the efficiency of specific processing brain systems. Explicit or declarative memory refers to components, and alterations in the quantity of some type of the conscious recollection of facts and events, and is processing resource (such as WORKING MEMORY, ATTEN- known to critically depend on a system of anatomically TION, or processing speed), presumed to be required for related structures that includes the HIPPOCAMPUS and many different types of cognitive tasks. Hypotheses based adjacent cortical regions in the medial temporal lobe. 
This on speculations about the neuroanatomical substrates of domain of function contrasts with a broad class of mem- cognitive functioning (such as dopamine deficiencies or ory processes involving the tuning or biasing of behavior frontal lobe impairments) might also be classified as proxi- as a result of experience. A distinguishing feature of these mal because they have primarily focused on linking age- implicit or nondeclarative forms of memory is that they related changes in biological and cognitive characteristics, do not rely on conscious access to information about the and not in speculating about the origins of either of those episodes that produced learning. Thus, implicit memory differences. However, not all neurobiological mechanisms proceeds normally independent of the medial temporal are necessarily proximal because some may operate to lobe structures damaged in amnesia. Although many affect the susceptibility of structures or processes to changes important issues remain to be resolved concerning the that occur at some time in the future. organization of multiple memory systems in the brain, A fundamental issue relevant to almost all proximal this background of information has enabled substantial interpretations concerns the number of distinct influences progress toward defining the neural basis of age-related that are contributing to age-related differences in cognitive cognitive decline. functioning. Moderate to large age-related differences have Traditionally, moderate neuron death, distributed dif- been reported on a wide variety of cognitive variables, and fusely across multiple brain regions, was thought to be an recent research (e.g., Salthouse 1996) indicates that only a inevitable consequence of aging. 
Seminal studies by Brody relatively small proportion of the age-related effects on a (1955) supported this view, indicating neuron loss given variable are independent of the age-related effects on progresses gradually throughout life, totaling more than 50 other cognitive variables. Findings such as these raise the percent in many cortical areas by age ninety-five (Brody possibility that a fairly small number of independent causal 1955, 1970). Although not all regions of the brain seemed to factors may be responsible for the age-related differences be affected to the same degree, significant decreases in cell observed in many variables reflecting fluid aspects of cogni- number were reported for both primary sensory and associa- tion. However, there is still little consensus regarding the tional areas of cortex. Thus, the concept emerged from early identity of those factors or the mechanisms by which they observations that diffusely distributed neuron death might exert their influence. account for many of the cognitive impairments observed See also AGING AND MEMORY; COGNITIVE DEVELOP- during aging (Coleman and Flood 1987). MENT; EXPERTISE; INFANT COGNITION; WORD MEANING, In recent years, the application of new and improved ACQUISITION OF methods for estimating cell number has prompted substan- —Timothy Salthouse tial revision in traditional views on age-related neuron loss. A primary advantage of these modern stereological tech- References niques, relative to more traditional approaches, is that they are specifically designed to yield estimates of total neuron Cattell, R. B. (1972). Abilities: Their Structure, Growth, and number in a region of interest, providing an unambiguous Action. Boston: Houghton Mifflin. 8 Aging, Memory, and the Brain measure for examining potential neuron loss with age (West appears to preferentially affect subcortical brain struc- 1993a). Stereological tools have been most widely applied tures, sparing many cortical regions. 
Defining the cell bio- in recent studies to reevaluate the effects of age on neuron logical mechanisms that confer this regional vulnerability number in the hippocampus. In addition to the known or protection remains a significant challenge. importance of this structure for normal explicit memory, Research on the neuroanatomy of cognitive aging has early research using older methods suggested that the hip- also examined the possibility that changes in connectivity pocampus is especially susceptible to age-related cell death, might contribute to age-related deficits in learning and and that this effect is most pronounced among aged subjects memory supported by the hippocampus. The entorhinal cor- with documented deficits in hippocampal-dependent learn- tex originates a major source of cortical input to the hippoc- ing and memory (Issa et al. 1990; Meaney et al. 1988). The ampus, projecting via the perforant path to synapse on the surprising conclusion from investigations using stereologi- distal dendrites of the dentate gyrus granule cells, in outer cal techniques, however, is that the total number of principal portions of the molecular layer. Proximal dendrites of the neurons (i.e., the granule cells of the dentate gyrus, and granule cells, in contrast, receive an intrinsic hippocampal pyramidal neurons in the CA3 and CA1 fields) is entirely input arising from neurons in the hilar region of the dentate preserved in the aged hippocampus. Parallel results have gyrus. This strict laminar segregation, comprised of non- been observed in all species examined, including rats, mon- overlapping inputs of known origin, provides an attractive keys and humans (Peters et al. 1996; Rapp 1995; Rapp and model for exploring potential age-related changes in hip- Gallagher 1996; Rasmussen et al. 1996; West 1993b). pocampal connectivity. 
Ultrastructural studies, for example, Moreover, hippocampal neuron number remains normal have demonstrated that a morphologically distinct subset of even among aged individuals with pronounced learning and synapses is depleted in the dentate gyrus molecular layer memory deficits indicative of hippocampal dysfunction during aging in the rat (Geinisman et al. 1992). Moreover, (Peters et al. 1996; Rapp 1995; Rapp and Gallagher 1996; the magnitude of this loss in the termination zone of the Rasmussen et al. 1996). Contrary to traditional views, these entorhinal cortex is greatest among aged subjects with docu- findings indicate that hippocampal cell death is not an inevi- mented deficits on tasks sensitive to hippocampal damage, table consequence of aging, and that age-related learning and in older animals that display impaired cellular plasticity and memory impairment does not require the presence of in the hippocampus (de Toledo-Morrell, Geinisman, and frank neuronal degeneration. Morrell 1988; Geinisman, de Toledo-Morrell, and Morrell Quantitative data on neuron number in aging are not 1986). yet available for all of the brain systems known to partici- The same circuitry has been examined in the aged mon- pate in LEARNING and MEMORY. However, like the hippoc- key using confocal laser microscopy to quantify the density of N-methyl-D-aspartate (NMDA) and non-NMDA receptor ampus, a variety of other cortical regions also appear to subunits. Aged monkeys display a substantial reduction in maintain a normal complement of neurons during non- NMDA receptor labeling that is anatomically restricted to pathological aging. This includes dorsolateral aspects of outer portions of the molecular layer that receive entorhinal the prefrontal cortex that participate in processing spa- cortical input (Gazzaley et al. 1996). The density of non- tiotemporal attributes of memory (Peters et al. 1994), and NMDA receptor subunits is largely preserved. 
Although the unimodal visual areas implicated in certain forms of impact of this change on cognitive function has not been implicit memory function (Peters, Nigro, and McNally evaluated directly, the findings are significant because 1997). By contrast, aging is accompanied by substantial NMDA receptor activity is known to play a critical role in subcortical cell loss, particularly among neurochemically cellular mechanisms of hippocampal plasticity (i.e., LTP). specific classes of neurons that originate ascending pro- Thus, a testable prediction derived from these observations jections to widespread regions of the cortex. Acetylcho- is that the status of hippocampal-dependent learning and line containing neurons in the basal forebrain have been memory may vary as a function of the magnitude of NMDA studied intensively in this regard, based partly on the receptor alteration in the aged monkey. Studies of this sort, observation that this system is the site of profound degen- combining behavioral and neurobiological assessment in the eration in pathological disorders of aging such as Alzhei- same individuals, are a prominent focus of current research mer’s disease. A milder degree of cholinergic cell loss is on normal aging. also seen during normal aging, affecting cell groups that A solid background of evidence now exists concerning project to the hippocampus, AMYGDALA, and neocortex the nature, severity and distribution of structural alterations (Armstrong et al. 1993; de Lacalle, Iraizoz, and Ma in the aged brain. The mechanisms responsible for these Gonzalo 1991; Fischer et al. 1991; Stroessner-Johnson, changes, however, are only poorly understood. Molecular Rapp, and Amaral 1992). 
Information processing func- biological techniques are increasingly being brought to bear tions mediated by these target regions might be substan- on this issue, revealing a broad profile of age-related effects tially disrupted as a consequence of cholinergic with significant implications for cell structure and function degeneration, and, indeed, significant correlations have (Sugaya et al. 1996). Although incorporating these findings been documented between the magnitude of cell loss and within a neuropsychological framework will undoubtedly behavioral impairment in aged individuals (Fischer et al. prove challenging, current progress suggests that molecular, 1991). Together with changes in other neurochemically neural-systems, and behavioral levels of analysis may soon specific projection systems, subcortical contributions to converge on a more unified understanding of normal cogni- cognitive aging may be substantial. These findings also tive aging. highlight the concept that neuron loss during normal aging AI and Education 9 See also Rapp, P. R. (1995). Cognitive neuroscience perspectives on aging AGING AND COGNITION; IMPLICIT VS. EXPLICIT in nonhuman primates. In T. Nakajima and T. Ono, Eds., Emo- MEMORY; LONG-TERM POTENTIATION; WORKING MEMORY, tion, Memory and Behavior. Tokyo: Japan Scientific Societies NEURAL BASIS OF Press, pp 197–211. —Peter Rapp Rapp, P. R., and M. Gallagher. (1996). Preserved neuron number in the hippocampus of aged rats with spatial learning deficits. Pro- References ceedings of the National Academy of Science, USA 93: 9926– 9930. Armstrong, D. M., R. Sheffield, G. Buzsaki, K. Chen, L. B. Hersh, Rasmussen, T., T. Schliemann, J. C. Sorensen, J. Zimmer and M. J. B. Nearing, and F. H. Gage. (1993). Morphologic alterations of West. (1996). Memory impaired aged rats: no loss of principal choline acetyltransferase-positive neurons in the basal forebrain hippocampal and subicular neurons. 
Neurobiology of Aging of aged behaviorally characterized Fisher 344 rats. Neurobiol- 17(1): 143–147. ogy of Aging 14: 457–470. Stroessner-Johnson, H. M., P. R. Rapp, and D. G. Amaral. (1992). Brody, H. (1955). Organization of the cerebral cortex. III. A study Cholinergic cell loss and hypertrophy in the medial septal of aging in the human cerebral cortex. Journal of Comparative nucleus of the behaviorally characterized aged rhesus monkey. Neurology 102: 511–556. Journal of Neuroscience 12(5): 1936–1944. Brody, H. (1970). Structural changes in the aging nervous system. Sugaya, K., M. Chouinard, R. Greene, M. Robbins, D. Personett, Interdisciplinary Topics in Gerontology 7: 9–21. C. Kent, M. Gallagher, and M. McKinney. (1996). Molecular Coleman, P. D., and D. G. Flood. (1987). Neuron numbers and indices of neuronal and glial plasticity in the hippocampal for- dendritic extent in normal aging and Alzheimer's disease. Neu- mation in a rodent model of age-induced spatial learning robiology of Aging 8(6): 521–545. impairment. Journal of Neuroscience 16(10): 3427–3443. de Lacalle, S., I. Iraizoz, and L. Ma Gonzalo. (1991). Differential West, M. J. (1993a). New stereological methods for counting neu- changes in cell size and number in topographic subdivisions of rons. Neurobiology of Aging 14: 275–285. human basal nucleus in normal aging. Neuroscience 43(2/3): West, M. J. (1993b). Regionally specific loss of neurons in the 445–456. aging human hippocampus. Neurobiology of Aging 14: 287– de Toledo-Morrell, L., Y. Geinisman, and F. Morrell. (1988). Indi- 293. vidual differences in hippocampal synaptic plasticity as a func- tion of aging: Behavioral, electrophysiological and Further Readings morphological evidence. Neural Plasticity: A Lifespan Approach. Alan R. Liss, Inc., pp. 283–328. Barnes, C. A. (1990). Animal models of age-related cognitive Fischer, W., K. S. Chen, F. H. Gage, and A. Björklund. (1991). decline. In F. Boller and J. 
Grafman, Eds., Handbook of Neu- Progressive decline in spatial learning and integrity of forebrain ropsychology, vol. 4. Amsterdam: Elsevier Science Publishers cholinergic neurons in rats during aging. Neurobiology of Aging B.V., pp. 169–196. 13: 9–23. Barnes, C. A. (1994). Normal aging: regionally specific changes in Gazzaley, A. H., S. J. Siegel, J. H. Kordower, E. J. Mufson, and J. hippocampal synaptic transmission. Trends in Neuroscience H. Morrison. (1996). Circuit-specific alterations of N-methyl- 17(1): 13–18. D-aspartate receptor subunit 1 in the dentate gyrus of aged Gallagher, M., and P. R. Rapp. (1997). The use of animal models to monkeys. Proceedings of the National Academy of Science, study the effects of aging on cognition. Annual Review of Psy- USA 93: 3121–3125. chology 339–370. Geinisman, Y., L. de Toledo-Morrell, and F. Morrell. (1986). Loss Grady, C. L., A. R. McIntosh, B. Horwitz, J. Ma. Maisog, L. G. of perforated synapses in the dentate gyrus: morphological sub- Ungerleider, M. J. Mentis, P. Pietrini, M. B. Schapiro, and J. V. strate of memory deficit in aged rats. Proceedings of the Haxby. (1995). Age-related reductions in human recognition National Academy of Science USA 83: 3027–3031. memory due to impaired encoding. Science 269: 218–221. Geinisman, Y., L. de Toledo-Morrell, F. Morrell, I. S. Persina, and Salthouse, T. A. (1991). Theoretical Perspectives on Cognitive M. Rossi. (1992). Age-related loss of axospinous synapses Aging. Hillsdale, NJ: Erlbaum. formed by two afferent systems in the rat dentate gyrus as Schacter, D. L., C. R. Savage, N. M. Alpert, S. L. Rauch, and M. S. revealed by the unbiased stereological dissector technique. Hip- Albert. (1996). The role of hippocampus and frontal cortex in pocampus 2: 437–444. age-related memory changes: a PET study. NeuroReport 11: Issa, A. M., W. Rowe, S. Gauthier, and M. J. Meaney. (1990). 1165–1169. 
Hypothalamic-pituitary-adrenal activity in aged, cognitively impaired and cognitively unimpaired rats. Journal of Neuro- AI science 10(10): 3247–3254. Meaney, M. J., D. H. Aitken, C. van Berkel, S. Bhatnagar, and R. M. Sapolsky. (1988). Effect of neonatal handling on age-related See INTRODUCTION: COMPUTATIONAL INTELLIGENCE; AI impairments associated with the hippocampus. Science 239: AND EDUCATION; COGNITIVE MODELING, SYMBOLIC 766–768. Peters, A., D. Leahy, M. B. Moss, and K. J. McNally. (1994). The effects of aging on area 46 of the frontal cortex of the rhesus AI and Education monkey. Cerebral Cortex 4(6): 621–635. Peters, A., N. J. Nigro, and K. J. McNally. (1997). A further evalu- ation of the effect of age on striate cortex of the rhesus monkey. Perhaps computers could educate our children as well as the Neurobiology of Aging 18: 29–36. best human tutors. This dream has inspired decades of work Peters, A., D. L. Rosene, M. B. Moss, T. L. Kemper, C. R. Abra- in cognitive science. The first generation of computer tutor- ham, J. Tigges, and M. S. Albert. (1996). Neurobiological bases ing systems (called Computer Aided Instruction or Com- of age-related cognitive decline in the rhesus monkey. Journal puter Based Instruction) were essentially hypertext. They of Neuropathology and Experimental Neurology 55: 861–874. 10 AI and Education mostly just presented material, asked multiple-choice ques- tutoring systems, many intelligent environments have tions, and branched to further presentations depending on been built and used for real educational and training the student’s answer (Dick and Carey 1990). needs. The next generation of tutoring systems (called Intelli- Other applications of AI in education include (1) using gent CAI or Intelligent Tutoring Systems) were based on AI planning technology to design instruction; (2) using building knowledge of the subject matter into the com- student modeling techniques to assess students’ knowl- puter. There were two types. 
One coached students as edge on the basis of their performance on complex tasks, a they worked complex, multiminute problems, such as welcome alternative to the ubiquitous multiple-choice test; troubleshooting an electronic circuit or writing a com- and (3) using AI techniques to construct interesting simu- puter program. The other type attempted to carry on a lated worlds (often called “microworlds”) that allow stu- Socratic dialog with students. The latter proved to be very dents to discover important domain principles. difficult, in part due to the problem of understanding Cognitive studies are particularly important in develop- unconstrained natural language (see NATURAL LANGUAGE ing AI applications to education. Developing the expert PROCESSING). Few Socratic tutors have been built. module of a tutoring system requires studying experts as Coached practice systems, however, have enjoyed a long they solve problems in order to understand and formalize and productive history. their knowledge (see KNOWLEDGE ACQUISITION). Develop- A coached practice system usually contains four basic ing an effective pedagogical module requires understanding components: how students learn so that the tutor’s comments will prompt students to construct their own understanding of the subject 1. An environment in which the student works on complex matter. An overly critical or didactic tutor may do more tasks. For instance, it might be a simulated piece of harm than good. A good first step in developing an applica- electronic equipment that the student tries to trouble- tion is to study the behavior of expert human tutors in order shoot. to see how they increase the motivation and learning of stu- 2. An expert system that can solve the tasks that the student dents. works on (see KNOWLEDGE-BASED SYSTEMS). However, AI applications often repay their debt to empir- 3. A student modeling module that compares the student’s ical cognitive science by contributing results of their own. 
It behavior to the expert system’s behavior in order to both is becoming common to conduct rigorous evaluations of the recognize the student’s current plan for solving the prob- lem and determine what pieces of knowledge the student educational effectiveness of AI-based applications. The is probably using. evaluations sometimes contrast two or more versions of the 4. A pedagogical module that suggests tasks to be solved, same system. Such controlled experiments often shed light responds to the students’ requests for help and points out on important cognitive issues. mistakes. Such responses and suggestions are based on At this writing, there are no current textbooks on AI and the tutoring system’s model of the student’s knowledge education. Wenger (1987) and Polson and Richardson and plans. (1988) cover the fundamental concepts and the early sys- tems. Recent work generally appears first in the proceedings Any of these components may utilize AI technology. For of the AI and Education conference (e.g., Greer 1995) or the instance, the environment might contain a sophisticated Intelligent Tutoring Systems conference (e.g., Frasson, simulation or an intelligent agent (see INTELLIGENT AGENT Gauthier, and Lesgold 1996). Popular journals for this work ARCHITECTURE), such as a simulated student (called co- include The International Journal of AI and Education learners) or a wily opponent. The student modeling mod- (http://cbl.leeds.ac.uk/ijaied/), The Journal of the Learning ule’s job includes such classic AI problems as plan recog- Sciences (Erlbaum) and Interactive Learning Environments nition and uncertain reasoning (see UNCERTAINTY). The (Ablex). pedagogical module’s job includes monitoring an instruc- See also EDUCATION; HUMAN-COMPUTER INTERACTION; tional plan and adapting it as new information about the READING student’s competence is observed. 
Despite the immense potential complexity, many intelligent tutoring systems —Kurt VanLehn have been built, and some are in regular use in schools, industry, and the military. Although intelligent tutoring systems are perhaps the References most popular use of AI in education, there are other appli- cations as well. A common practice is to build an environ- Dick, W., and S. Carey. (1990). The Systematic Design of Instruc- tion. 3rd ed. New York: Scott-Foresman. ment without the surrounding expert system, student Frasson, C., G. Gauthier, and A. Lesgold, Eds. (1996). Intelligent modeling module, or pedagogical module. The environ- Tutoring Systems: Third International Conference, ITS96. New ment enables student activities that stimulate learning and York: Springer. may be impossible to conduct in the real world. For Greer, J., Ed. (1995). Proceedings of AI-Ed 95. Charlottesville, instance, an environment might allow students to conduct NC: Association for the Advancement of Computing in Educa- simulated physics experiments on worlds where gravity is tion. reduced, absent, or even negative. Such environments are Polson, M. C., and J. J. Richardson. (1988). Foundations of Intelli- called interactive learning environments or microworlds. gent Tutoring Systems. Hillsdale, NJ: Erlbaum. A new trend is to use networking to allow several students Wenger, E. (1987). Artificial Intelligence and Tutoring Systems. to work together in the same environment. Like intelligent San Mateo, CA: Morgan Kaufmann. Algorithm 11 case, at least in theory; indeed these sets can be continuous Algorithm (see Blum, Shub, and Smale 1989). An algorithm describes a process that the tokens participate in. This process (a com- An algorithm is a recipe, method, or technique for doing putation) is either in a certain state or it is not, it may either something. 
The essential feature of an algorithm is that it is go to a certain next state from its current state or it may not, made up of a finite set of rules or operations that are unam- and any transition taken is finite. (One way to relax this def- biguous and simple to follow (computer scientists use tech- inition is to allow the state transitions to be probabilistic, but nical terms for these two properties: definite and effective, that doesn’t affect their finiteness.) respectively). It is obvious from this definition that the And finally, in more technical parlance, an algorithm is notion of an algorithm is somewhat imprecise, a feature it an intensional definition of a special kind of function— shares with all foundational mathematical ideas—for namely a computable function. The intensional definition instance, the idea of a set. This imprecision arises because contrasts with the extensional definition of a computable being unambiguous and simple are relative, context-depen- function, which is just the set of the function’s inputs and dent terms. However, usually algorithms are thought of as outputs. Hence, an algorithm describes how the function is recipes, methods, or techniques for getting computers to do computed, rather than merely what the function is. The con- something, and when restricted to computers, the term nection with computable functions is crucial. A function F “algorithm” becomes more precise, because then “unam- is computable if and only if it is describable as an algorithm. biguous and simple to follow” means “a computer can do The relationship between the extensional definition of F and it.” The connection with computers is not necessary, how- an intensional definition of it is interesting. There is one ever. 
If a person equipped only with pencil and paper can extensional definition of F, but there are an infinite number complete the operations, then the operations constitute an of intensional definitions of it, hence there are an infinite algorithm. number of algorithms for every extensionally-described A famous example of an algorithm (dating back at least computable function F. (Proof: you can always construct a to Euclid) is finding the greatest common divisor (GCD) of new, longer algorithm by adding instructions that essentially two numbers, m and n. do nothing. Of course, usually we seek the canonical algo- rithm—the shortest and most efficient one.) Step 1. Given two positive integers, set m to be the larger of A computable function is a function whose inputs and the two; set n to be the smaller of the two. outputs can be produced by a Turing machine. Church’s the- Step 2. Divide m by n. Save the remainder as r. sis states that all the computable functions can be computed Step 3. If r = 0, then halt; the GCD is n. by a Turing machine (see CHURCH-TURING THESIS). The Step 4. Otherwise, set m to the old value of n, and set n to best way to understand Church’s thesis is to say that Turing the value of r. Then go to step 2. computability exhausts the notion of computability. Impor- A tax form is also a relatively good example of an algo- tantly, not all functions are computable, so not all functions rithm because it is finite and the instructions for complet- are algorithmically describable (this was a profound discov- ing it are mechanically and finitely completable (at least ery, first proved by TURING 1936; Church 1936a; and that is the intention). The recipe for cowboy chocolate cake Kleene 1936; it and related results are among the greatest (with two mugs of coffee) is not really an algorithm achievements of twentieth-century mathematics). 
because its description is not definite enough (how much is Sometimes algorithms are simply equated with Turing a mug of coffee?). Of course, all computer programs are machines. The definition given here is logically prior to the algorithms. notion of a Turing machine. This latter notion is intended to It should also be noted that algorithms are by no means formally capture the former. Gödel (1931; 1934), among his restricted to numbers. For example, alphabetizing a list of other achievements, was the first to do this, to link a formal words is also an algorithm. And, one interpretation of the definition with an intuitive one: he identified a formally computational hypothesis of the mind is that thinking itself defined class of functions, the recursive functions, with the is an algorithm—or perhaps better, the result of many algo- functions that are computable, that is, with the functions for rithms working simultaneously. which algorithms can be written. Now to flesh out the definition. An algorithm is an unam- This completes the definition of “algorithm.” There are a biguous, precise, list of simple operations applied mechani- few loose ends to tie up, and connections to be made. First, cally and systematically to a set of tokens or objects (e.g., mathematicians and computer scientists sometimes sharply configurations of chess pieces, numbers, cake ingredients, restrict the definition of “algorithm.” They take the defini- etc.). The initial state of the tokens is the input; the final tion of “algorithm” given here, and use it to define the state is the output. The operations correspond to state tran- notion of an effective procedure. Then they define an algo- sitions where the states are the configuration of the tokens, rithm as an effective procedure that always halts or termi- which changes as operations are applied to them. 
Almost nates (not all procedures or computer programs do everything in sight is assumed to be finite: the list of opera- terminate—sometimes on purpose, sometimes acciden- tions itself is finite (there might be a larger but still finite set tally). from which the operations are drawn) and each token is Second, in common parlance, an algorithm is a recipe itself finite (or, more generally, a finitely determinable ele- for telling a computer what to do. But this definition does ment in the set). Usually, the input, output, and intermediate more harm than good because it obliterates the crucial sets of tokens are also finite, but this does not have to be the notion of a virtual machine, while it subtly reinforces the 12 A-Life idea that a homunculus of some sort is doing all the work, Further Readings and this in turn can reinforce the idea of a “ghost in the Church, A. (1936b). An unsolvable problem of elementary number machine” in the COMPUTATIONAL THEORY OF MIND. So this theory. American Journal of Mathematics 58: 345–363. folk definition should be avoided when precision is needed. Dietrich, E. (1994). Thinking computers and the problem of inten- Another problem with the folk definition is that it does not tionality. In E. Dietrich, Ed., Thinking Computers and Virtual do justice the profundity of the notion of an algorithm as a Persons: Essays on the Intentionality of Machines. San Diego: description of a process. It is fair to regard algorithms as Academic Press, pp. 3–34. being as crucial to mathematics as sets. A set is a collection Fields, C. (1989). Consequences of nonclassical measurement for of objects. An intentional definition of a set describes all the algorithmic description of continuous dynamical systems. and only the objects in the set. An algorithm describes a Journal of Experimental and Theoretical Artificial Intelligence 1: 171–189. collection of objects that does something. It would be Knuth, D. (1973). The Art of Computer Programming, vol. 
1: Fun- impossible to overstate the importance of this move from damental Algorithms. Reading, MA: Addison-Wesley. statics to dynamics. Rogers, H. (1967). The Theory of Recursive Functions and Effec- Third, the connection between algorithms and COMPU- tive Computability. New York: McGraw-Hill. TATION is quite tight. Indeed, some mathematicians regard algorithms as abstract descriptions of computing A-Life devices. When implemented on a standard computer, such descriptions cease to be abstract and become real comput- ing devices known as virtual machines (virtual does not See ARTIFICIAL LIFE; EVOLUTIONARY COMPUTATION mean not real, here). A virtual machine is the machine that does what the algorithm specifies. A virtual machine Altruism exists at some level higher than the machine on which the algorithm is implemented. For example: a word processor In biology, altruism has a purely descriptive economic is a virtual machine that exists on top of the hardware meaning: the active donation of resources to one or more machine on which it is implemented. The notion of a vir- individuals at cost to the donor. Moral values or conscious tual machine is very important to cognitive science motivations are not implied, and the ideas are as applicable because it allows us to partition the study of the mind into to plants as to animals. Four evolutionary causes of altruism levels, with neurochemistry at the bottom (or near the bot- will be considered here: kin selection, reciprocation, manip- tom) and cognitive psychology near the top. At each level, ulation, and group selection. Each implies demonstrably different methods and technical vocabularies are used. different patterns for what individuals donate what One of the crucial facts about virtual machines is that no resources to whom and under what circumstances and may one machine is more important than the rest. 
So it is with suggest different motivational and emotional experiences by the brain and the rest of the nervous system that make up both donor and recipient. thinking things. Theories at all levels are going to be It may seem that Darwinian EVOLUTION, directed by nat- needed if we are to completely and truly understand the ural selection, could never favor altruism. Any avoidable mind. activity that imposes a cost, always measured as reduced See also COMPUTATION AND THE BRAIN; FORMAL SYS- reproductive fitness, would be eliminated in evolution. This TEMS, PROPERTIES OF view is too simple. Natural selection should minimize costs —Eric Dietrich whenever possible, but successful reproduction always requires donation of resources to offspring, at least by References females putting nutrients into eggs. A closer look shows that offspring are important to natural selection only because Blum, L., M. Shub, and S. Smale. (1989). On a theory of computa- they bear their parents’ genes, but this is true of all relatives. tion and complexity over the real numbers: NP-completeness, From the perspective of genetics and natural selection, the recursive functions, and universal machines. Bulletin of the survival and reproduction of any relative are partly equiva- American Mathematical Society 21 (1): 1–46. lent to one’s own survival and reproduction. So there must Church, A. (1936a). A note on the entscheidungsproblem. Journal be an evolutionary force of kin selection that favors altruism of Symbolic Logic 1: 40–41, 101–102. between associated relatives. Gödel, K. (1931). On formally undecidable propositions of Prin- Kin selection, first clearly formulated by Hamilton cipia Mathematica and related systems I. Monatschefte für Mathematick und Physik 38: 173–198. (Reprinted in J. Heije- (1964), can be defined as selection among individuals for noort, Ed., (1967), From Frege to Gödel. Cambridge, MA: Har- the adaptive use of cues indicative of kinship. 
The products vard University Press, pp. 592–617.) of mitotic cell division are exactly similar genetically, and Gödel, K. (1934/1965). On undecidable propositions of formal their special physical contact is reliable evidence of full kin- mathematical systems. In M. Davis, Ed., The Undecidable. ship. This accounts for the subservience of somatic cells in a New York: Raven Press, pp. 41–71. multicellular organism to the reproductive interests of the Kleene, S. C. (1936). General recursive functions of natural num- germ cells. Kin selection also accounts for the generally bers. Mathematishe Annelen 112: 727–742. benign relations among young animals in the same nest. Turing, A. (1936). On computable numbers with an application to Such early proximity is often a cue indicative of close kin- the entscheidungsproblem. Proceedings of the London Mathe- ship. Nestmates are often full sibs, with a genetic relation- matical Society series 2, 42: 230–265 and 43: 544–546. Altruism 13 tion can favor by operating at the level of competing groups ship of 0.50. They could also be half sibs if the mother rather than their competing members. A group of individu- mated with more than one male. They may not be related at als that aid each other may prevail over a more individually all if one or more eggs were deposited by females other than selfish group. A difficulty here is that if selfishness is advan- the apparent mother. Such nest parasites are often of the tageous within a group, that group is expected to evolve a same species in birds, but some species, such as the Euro- higher level of individual selfishness, no matter what the pean cuckoo and the American cowbird, reproduce exclu- effect on group survival. The original concept of group sively by parasitizing other species. Their young’s selection focused on separate populations within a species competition with nest mates has not been tempered by kin (Wynne-Edwards 1962; Wade 1996). 
This idea has few selection, and this accounts for their lethal eviction of the adherents, because of the paucity of apparent population- offspring of the parasitized pair. Many sorts of cues other level adaptations (Williams 1996: 51–53), because altruistic than early proximity can be used to assess kinship, such as populations are readily subverted by the immigration of the odors used by mammals and insects to recognize rela- selfish individuals, and because the low rate of proliferation tives and make genetically appropriate adjustments in altru- and extinction of populations, compared to the reproduction ism. The classic work on mechanisms of kin recognition is and death of individuals, would make selection among pop- Fletcher and Michener (1987); see Slater (1994) for a criti- ulations a relatively weak force. cal updating. More recently attention has been given to selection In the insect order Hymenoptera (ants, bees, and among temporary social groupings or trait groups (Wilson wasps), a male has only one chromosome set, from an 1980), such as fish schools or flocks of birds. Trait groups unfertilized egg of his mother, and his sperm are all exactly with more benign and cooperative members may feed more the same genetically. So his offspring by a given female efficiently and avoid predators more effectively. The more have a relationship of 0.75 to one another. This factor has selfish individuals still thrive best within each group, and been used to explain the multiple independent instances, in the evolutionary result reflects the relative strengths of this insect order, of the evolution of sterile worker castes selection within and between groups. In human history, that are entirely female. 
These workers derive greater groups with more cooperative relations among members genetic success by helping their mothers produce sisters must often have prevailed in conflicts with groups of more than they would by producing their own offspring, which consistently self-seeking individuals (Wilson and Sober would have only a 0.50 genetic similarity. These special 1994). The resulting greater prevalence of human altruism relationships are not found in termites (order Isoptera) and, would be more likely to result from culturally transmitted as expected, both males and females form the termite than genetic differences. It should be noted that any form of worker castes. group selection can only produce modifications that benefit Reciprocity is another evolutionary factor that can favor the sorts of groups among which selection takes place. It altruism. The basic theory was introduced by Trivers (1971) need not produce benefits for whole species or more inclu- and refined by Axelrod and Hamilton (1980). One organism sive groups. has a net gain by helping another if the other reciprocates A given instance of altruistic behavior may, of course, with benefits (simultaneous or delayed) that balance the result from more than one of these four evolutionary causes. donor’s cost. Cleaning symbiosis between a large fish and a Genealogical relatives are especially likely to indulge in small one of a different species may provide simultaneous both reciprocation and manipulation. If reproductive pro- reciprocal benefits: the large fish gets rid of parasites; the cesses result in stable associations of relatives, these kin- small one gets food. This reciprocation implies that the groups are inevitably subject to natural selection. The most small fish is more valuable as a cleaner to the large fish than extreme examples of altruism, those of social insects, proba- it would be as food. 
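The 0.75 and 0.50 figures quoted for Hymenoptera follow from simple bookkeeping, sketched below under stated assumptions. The function name and its two parameters are illustrative, not from the source. The sketch assumes each daughter receives half of her genes from each parent, that a gene inherited from a diploid parent is shared with a sibling with probability 1/2, and that a gene inherited from a haploid father is always shared because his sperm are genetically identical.

```python
from fractions import Fraction


def sibling_relatedness(share_via_mother, share_via_father):
    """Expected fraction of genes shared by two full siblings.

    Each argument is the probability that a gene received from that
    parent is also carried by the sibling (illustrative parameters).
    """
    half = Fraction(1, 2)  # half of the genome comes from each parent
    return half * share_via_mother + half * share_via_father


# Ordinary diploid species: both parents pass on a random half of their genes.
diploid_full_sibs = sibling_relatedness(Fraction(1, 2), Fraction(1, 2))

# Hymenoptera: the father has one chromosome set, so sisters share all
# of the genes they received from him.
hymenopteran_sisters = sibling_relatedness(Fraction(1, 2), Fraction(1))

# A mother passes half of her genes to each of her own offspring.
mother_to_offspring = Fraction(1, 2)

print(diploid_full_sibs)      # 1/2
print(hymenopteran_sisters)   # 3/4
print(hymenopteran_sisters > mother_to_offspring)  # True
```

On these assumptions a female worker is more closely related to a full sister (3/4) than she would be to her own offspring (1/2), which is the comparison the text draws.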
Reciprocity is a pervasive factor in the bly resulted from the operation of all the factors discussed socioeconomic lives of many species, especially our own. It here, and social insect colonies may aptly be termed super- requires safeguards, often in the form of evolved adapta- organisms (Seeley 1989). Excellent detailed discussions of tions for the detection of cheating (Wright 1994). altruism in the animal kingdom and in human evolution, and Manipulation is another source of altruism. The donation of the history of thought on these topics, are available (Rid- results from actual or implied threat or deception by the ley 1996; Wright 1994). recipient. In any social hierarchy, individuals of lower rank will often yield to the higher by abandoning a food item or See also ADAPTATION AND ADAPTATIONISM; CULTURAL possible mate, thereby donating the coveted resource to the EVOLUTION; DARWIN dominant individual. Deception often works between spe- —George C. Williams cies: a snapper may donate its body to an anglerfish that tempts it with its lure; some orchids have flowers that References resemble females of an insect species, so that deceived males donate time and energy transporting pollen with no Axelrod, R., and W. D. Hamilton. (1980). The evolution of cooper- payoff to themselves. The nest parasitism discussed above is ation. Science 211: 1390–1396. another example. Our own donations of money or labor or Fletcher, D. J. C., and C. D. Michener. (1987). Kin Recognition in blood to public appeals can be considered manipulation of Animals. New York: Wiley-Interscience. donors by those who make the appeals. Hamilton, W. D. (1964). The genetical theory of social behaviour, Group selection is another possibility. Individuals may parts 1 and 2. Journal of Theoretical Biology 7:1–52. donate resources as a group-level adaptation, which evolu- Ridley, M. (1996). The Origins of Virtue. New York: Viking Press. 
similar ambiguities to argue for the necessity of abstract Seeley, T. D. (1989). The honey bee as a superorganism. American Scientist 77: 546–553. syntactic structure. Slater, P. J. B. (1994). Kinship and altruism. In P. J. B. Slater and T. The different underlying relationships in ambiguous sen- R. Halliday, Eds., Behavior and Evolution. Cambridge Univer- tences can frequently be observed directly by manipulating sity Press. the form of the ambiguous sentence. When the order of the Trivers, R. L. (1971). The evolution of reciprocal altruism. Quar- string smart women and men is reversed to men and smart terly Review of Biology 46: 35–57. women, the sentence can only be understood as involving Wade, M. J. (1996). Adaptation in subdivided populations: kin modification of women but not men. When the adverb wildly selection and interdemic selection. In M. R. Rose and G. V. is inserted before with the knife, only the reading in which Lauder (Eds.), Adaptation. San Diego: Academic Press. the burglar is using the knife remains possible. Williams, G. C. (1996). Plan and Purpose in Nature. London: In its most technical sense, the term ambiguity is used to Weidenfeld and Nicolson. Wilson, D. S. (1980). Natural Selection of Populations and Com- describe only those situations in which a surface linguistic munities. Boston: Benjamin/Cummings. form corresponds to more than one linguistic representation. Wilson, D. S., and E. Sober. (1994). Re-introducing group selec- In lexical ambiguities, one surface phonetic form has multi- tion to the human behavioral sciences. Behavioral and Brain ple independent lexical representations. For syntactic ambi- Sciences 17: 585–654. guities, one surface string has different underlying syntactic Wright, R. (1994). The Moral Animal: Why We Are the Way We structures. A subtler and more controversial example is the Are. Vintage Books. phenomenon of scope ambiguity, exemplified in: Wynne-Edwards, V. C. (1962).
Animal Dispersion in Relation to Social Behavior. London: Oliver and Boyd. (2) a. Some woman tolerates every man. b. John doesn’t think the King of France is bald. Ambiguity In (2a), the sentence can be understood as referring to a sin- gle woman who tolerates each and every man, or alterna- A linguistic unit is said to be ambiguous when it is associ- tively, it can mean that every man is tolerated by at least one ated with more than one MEANING. The term is normally woman (not necessarily the same one). Sentence (2b) can reserved for cases where the same linguistic form has mean either that John believes that the King of France is not clearly differentiated meanings that can be associated with bald, or that John does not hold the particular belief that the distinct linguistic representations. Ambiguity is thus distin- King of France is bald. It is difficult to find clear syntactic guished from general indeterminacy or lack of specificity. tests for scope ambiguities that would demonstrate different Ambiguity has played an important role in developing underlying structures. For instance, reversing the order of theories of syntactic and semantic structure, and it has been some woman and every man does not eliminate the ambigu- the primary empirical testbed for developing and evaluating ity (although it may affect the bias towards one reading): models of real-time language processing. Within artificial (3) Every man is tolerated by some woman. intelligence and COMPUTATIONAL LINGUISTICS, ambiguity is considered one of the central problems to be solved in May (1977) argues that sentences such as (2) reflect differ- developing language understanding systems (Allen 1995). ent underlying structures at a level of linguistic representa- Lexical ambiguity occurs when a word has multiple tion corresponding to the LOGICAL FORM of a sentence. independent meanings. 
“Bank” in the sentence “Jeremy Subsequently, much of the linguistic literature has consid- went to the bank” could denote a riverbank or a financial ered scope ambiguities as genuine ambiguities. institution. Ambiguous words may differ in syntactic cate- A broader notion of ambiguity includes a pervasive gory as well as meaning (e.g., “rose,” “watch,” and ambiguity type that involves not multiple possible struc- “patient”). True lexical ambiguity is typically distinguished tures, but rather, multiple associations between linguistic from polysemy (e.g., “the N.Y. Times” as in this morning’s expressions and specific entities in the world. The sentence edition of the newspaper versus the company that publishes in (4) is an example of referential ambiguity: the newspaper) or from vagueness (e.g., “cut” as in “cut the lawn” or “cut the cloth”), though the boundaries can be (4) Mark told Christopher that he had passed the exam. fuzzy. The ambiguity resides in the understanding of the pronoun Syntactic ambiguities arise when a sequence of unambig- he, which could refer to either Mark, Christopher, or some uous words reflects more than one possible syntactic rela- other salient entity under discussion. tionship underlying the words in the sentence, as in: Language processing necessarily involves ambiguity res- (1) a. The company hires smart women and men. olution because even unambiguous words and sentences are b. The burglar threatened the student with the knife. briefly ambiguous as linguistic input is presented to the pro- cessing system. Local ambiguities arise in spoken language In (1a), the ambiguity lies in whether the adjective smart because speech unfolds over time and in written language modifies (provides information about) both women and men, because text is processed in successive eye fixations (cf. resulting in a practice not to hire unintelligent people of Tanenhaus and Trueswell 1995). either sex, or whether smart modifies women only.
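The two readings of the scope-ambiguous sentence (2a), "Some woman tolerates every man," can be made concrete by checking each against a small model. The sketch below is illustrative only: the individuals and the `tolerates` relation are invented for the example, and the function names are hypothetical labels for the two quantifier orderings.

```python
# Hypothetical toy model (not from the source): who tolerates whom.
women = {"ann", "bea"}
men = {"carl", "dan"}
tolerates = {("ann", "carl"), ("bea", "dan")}


def some_woman_wide_scope(women, men, tolerates):
    """Reading 1: there is a single woman who tolerates every man."""
    return any(all((w, m) in tolerates for m in men) for w in women)


def every_man_wide_scope(women, men, tolerates):
    """Reading 2: every man is tolerated by at least one woman."""
    return all(any((w, m) in tolerates for w in women) for m in men)


print(some_woman_wide_scope(women, men, tolerates))  # False
print(every_man_wide_scope(women, men, tolerates))   # True
```

In this model the second reading is true and the first is false, so the two readings have different truth conditions. Any model that satisfies the first reading also satisfies the second, but not the other way round.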
In (1b), The sentence in (5) illustrates how globally unambiguous the phrase with the knife could be used to describe the man- sentences may contain local ambiguities. ner in which the burglar threatened the student, or to indi- cate which student was threatened. Chomsky (1957) used (5) The pupil spotted by the proctor was expelled. The pupil is the object of a relative clause. However, the Bever, T. (1970). The cognitive basis for linguistic structures. In J. R. Hayes, Ed., Cognition and the Development of Language. underlined sequence is also consistent with the pupil being New York: Wiley. the subject of a main clause, as in “The pupil spotted the Chomsky, N. (1957). Syntactic Structures. The Hague: Mouton. proctor.” The ambiguity arises because the morphological Frazier, L. (1987). Sentence processing: a tutorial review. In M. form “-ed” is used for both the simple past and for the pas- Coltheart, Ed., Attention and Performance XII: The Psychology sive participle, illustrating the interdependence of ambiguity of Reading. London: Erlbaum. at multiple levels. MacDonald, M., N. Pearlmutter, and M. Seidenberg. (1994). Lexi- Laboratory studies have established that multiple senses cal nature of syntactic ambiguity resolution. Psychological of words typically become activated in memory with rapid Review 101: 676–703. resolution based on frequency and context. For example, Marslen-Wilson, W. D. (1987). Functional parallelism in spoken when “pupil” is heard or read, both the “eye part” and the word recognition. Cognition 25:71–102. May, R. (1977). The Grammar of Quantification. Ph.D. diss., MIT. “student” senses become briefly active (Simpson 1984). Distributed by the Indiana University Linguistics Club, Bloom- Similarly, “elevator,” “elegant,” and “eloquent” are briefly ington. activated as the unambiguous word “elephant” is heard Pritchett, B. (1992).
Grammatical Competence and Parsing Per- because they are consistent with the initial phonemic formance. Chicago: University of Chicago Press. sequence “eluh” (Marslen-Wilson 1987). Simpson, G. (1984). Lexical ambiguity and its role in models of Syntactic ambiguities exhibit consistent preferences. word recognition. Psychological Bulletin 96: 316–340. Readers and listeners experience processing difficulty and Small, S., G. Cottrell, and M. Tanenhaus, Eds. (1988). Lexical sometimes a conscious feeling of confusion when the sen- Ambiguity Resolution: Perspectives from Psycholinguistics, tence becomes inconsistent with the preferred structure. The Neuropsychology, and Artificial Intelligence. San Mateo, CA: example in (6), from Bever (1970), is a classic example of a Morgan Kaufmann Publishers. Tanenhaus, M., and J. Trueswell. (1995). Sentence comprehension. so-called garden-path, illustrating that the main clause is the In J. Miller and P. Eimas, Eds., Handbook of Cognition and preferred structure for the main clause/relative clause ambi- Perception. New York: Academic Press. guity. Zwicky, A., and J. Sadock. (1975). Ambiguity tests and how to fail (6) The raft floated down the river sank. them. In J. Kimball, Ed., Syntax and Semantics, vol. 4. New York: Academic Press. (7) The land mine buried in the sand exploded. In (5), resolution in favor of the relative clause does not cause Amygdala, Primate conscious confusion. Nonetheless, processing difficulty at by the proctor can be observed using sensitive measures, for instance the duration of eye-fixations in reading. Theoretical For more than one century there has been evidence that the explanations for syntactic preferences can be roughly divided amygdala is involved in emotional behavior. Experimental into structural and constraint-based approaches. 
In structural lesion studies in monkeys demonstrated that large temporal theories, principles defined over syntactic configurations lobe lesions that included the amygdala resulted in dramatic determine an initial structure, which is then evaluated, and if postoperative changes in behavior, including flattened necessary, revised. Different principles may apply for differ- affect, visual agnosia, hyperorality, and hypersexuality ent classes of ambiguities (e.g., Frazier 1987). In constraint- (Brown and Schaefer 1888; Klüver and Bucy 1938). Similar based theories, preferences arise because a conspiracy of behaviors have also been observed in humans with large probabilistic constraints, many of them lexically based, tem- temporal lobe lesions that include the amygdala (Terzian porarily make the ultimately incorrect interpretation the more and Dalle Ore 1955). The amygdala was more formally likely one (MacDonald, Pearlmutter, and Seidenberg 1994; linked to emotional behavior in 1949, when Paul MacLean Tanenhaus and Trueswell 1995). The example in (7) illus- expanded Papez’s notion of the LIMBIC SYSTEM to include trates a sentence with the same structure as (6) in which the this region, based on neuroanatomical criteria. Over the past probabilistic constraints initially favor the relative clause several decades, converging results of neuroanatomical, because “buried” is typically used as a passive participle with- behavioral, and physiological studies in macaque monkeys out an agent, and land mines are more typically themes than along with neuropsychological and neuroimaging studies in agents of burying events. Whether or not constraint-based sys- humans have firmly established a role for the amygdala in tems can provide a unified account of ambiguity resolution in emotional processing. 
However, a number of important language using general principles that hold across other per- questions remain regarding the nature and specificity of this ceptual domains remains a central unresolved issue. role. In addition, the amygdala has also been linked to social See also FIGURATIVE LANGUAGE; LEXICON; NATURAL behavior but these data will not be reviewed here. In this LANGUAGE PROCESSING; PSYCHOLINGUISTICS; SPOKEN article, a brief neuroanatomical review will be followed by a WORD RECOGNITION; SYNTAX survey of two areas of particularly active research involving the primate amygdala: expression of emotional behavior —Michael K. Tanenhaus and Julie C. Sedivy and the recognition of facial emotion. The unique neuroanatomical profile of the amygdala References illustrates why this structure has been referred to as the “sensory gateway to the emotions” (Aggleton and Mishkin Allen, J. (1995). Natural Language Understanding. Redwood City, 1986). The amygdala comprises at least thirteen distinct CA: Benjamin/Cummings. nuclei, each with a rich pattern of intrinsic connections. As the recognition of fearful versus happy faces (Morris et al. to extrinsic connections, the amygdala is directly intercon- 1996). However, the results from other studies are not con- nected with unimodal and polymodal sensory cortical areas sistent with this notion. First, recognition of facial emotion, as well as with subcortical structures such as the basal fore- including fear, may occur even in the absence of the brain, THALAMUS, hypothalamus, striatum, and brainstem amygdala (Hamann et al. 1996). Second, single neurons in areas. Thus, the amygdala is centrally positioned to receive the human amygdala respond to particular facial expres- convergent cortical and thalamic sensory information and sions but do not respond exclusively to fearful expressions subsequently to direct the appropriate survival-oriented (Fried, MacDonald, and Wilson 1997).
In monkeys, there is response via its brainstem and hypothalamic projections. a population of amygdala neurons that respond selectively Moreover, the significant neuroanatomical connections to faces (Leonard et al. 1985), but there is little or no evi- between the amygdala and nearby medial temporal lobe dence to support the idea that these neurons respond selec- regions involved in MEMORY may provide the substrate for tively to fearful expressions. Finally, functional magnetic the enhancing effect of emotional arousal on memory, as resonance imaging (fMRI) has revealed increased activation demonstrated by a significant body of animal and human in the amygdala in response to both happy and fearful faces studies (McGaugh et al. 1995; Cahill et al. 1995). (Breiter et al. 1996). Experimental lesions that target the amygdala have pro- To conclude, studies in monkeys and humans over the duced many of the behaviors that were originally described past several decades have reinforced the long-held view, after the large lesions of Klüver and Bucy and Brown and supported by findings in other species (see EMOTION AND Schaefer (Weiskrantz 1956; Aggleton, Burton, and Passing- THE ANIMAL BRAIN), that the amygdala has an important ham 1980; Zola-Morgan et al. 1991). However, due to meth- role in emotional function. New experimental approaches, odological difficulties, these lesions typically have included such as functional neuroimaging and the use of highly inadvertent cortical and/or fiber damage; thus, interpreta- selective lesion techniques, hold promise for an exciting tions of amygdala function based on these classic studies new era of progress that builds on this work. alone must be made with caution.
Highly circumscribed See also EMOTIONS; FACE RECOGNITION; OBJECT RECOG- amygdala lesions can now be produced using the neurotoxic NITION, HUMAN NEUROPSYCHOLOGY lesion technique, and results from these studies indicate a —Lisa Stefanacci role for the amygdala in temperament and oral exploration (Amaral et al. 1997), food preferences (Murray, Gaffan, and References Flint 1996), and in the devaluation of a food reward after selective satiation (Malkova, Gaffan, and Murray 1997). Adolphs, R., D. Tranel, H. Damasio, and A. Damasio. (1994). Impaired recognition of emotion in facial expressions follow- Interestingly, the behavioral changes observed after neu- ing bilateral damage to the human amygdala. Nature 372: 669– rotoxic amygdala lesions are much less profound than those 672. that were reported using more traditional (i.e., less discrete) Aggleton, J. P., and M. Mishkin. (1986). The amygdala: sensory lesion techniques. The perirhinal cortex, known to be gateway to the emotions. In R. Plutchik and H. Kellerman, important for memory (Zola-Morgan et al. 1989), lies adja- Eds., Emotion: Theory, Research, and Experience. New York: cent to the amygdala, and these two regions are strongly Academic Press, Inc. pp. 281–298. neuroanatomically interconnected (Stefanacci, Suzuki, and Aggleton, J. P. (1985). A description of intra-amygdaloid connec- Amaral 1996). It has been suggested that the amygdala and tions in old world monkeys. Exp. Brain Res 57: 390–399. the perirhinal cortex may have some shared roles in emo- Aggleton, J. P., M. J. Burton, and R. E. Passingham. (1980). Corti- tional behavior (Iwai et al. 1990; Stefanacci, Suzuki, and cal and subcortical afferents to the amygdala of the rhesus mon- key (Macaca Mulatta). Brain Res 190: 347–368. Amaral 1996). Thus, it is possible that dramatic emotional Amaral, D., J. P. Capitanio, C. J. Machado, W. A. Mason, and S. P. changes may occur only after lesions that include both of Mendoza. (1997). 
The role of the amygdaloid complex in these regions. rhesus monkey social behavior. Soc. Neurosci. Abstr 23: 570. Recent studies in humans have explored the role of the Breiter, H. C., N. L. Etcoff, P. J. Whalen, W. A. Kennedy, S. L. amygdala in the recognition of facial emotion and of the Rauch, R. L. Buckner, M. M. Strauss, S. E. Human, and B. R. recognition of fear in particular (see EMOTION AND THE Rosen. (1996). Response and habituation of the human HUMAN BRAIN). Taken together, the evidence is not entirely amygdala during visual processing of facial expression. Neuron supportive. On the positive side, one study reported that a 17: 875–887. patient with relatively discrete, bilateral amygdala damage Brown, S., and E. A. Schaefer. (1888). An investigation into the as determined by MAGNETIC RESONANCE IMAGING (MRI) functions of the occipital and temporal lobes of the monkey’s brain. Philos. Trans. R. Soc. Lond. B 179: 303–327. was impaired at recognizing fear in facial expressions Cahill, L., R. Babinsky, H. J. Markowitsch, and J. L. McGaugh. (Adolphs et al. 1994). A second patient who has partial (1995). The amygdala and emotional memory. Nature 377: bilateral amygdala damage, determined by MRI, was simi- 295–296. larly impaired (Calder et al. 1996) and was also impaired in Calder, A. J., A. W. Young, D. Rowland, D. I. Perrett, J. R. recognizing fear and anger in the auditory domain (Scott et Hodges, and N. L. Etcoff. (1996). Facial emotion recognition al. 1997). However, this patient also has more general facial after bilateral amygdala damage: differentially severe impair- and auditory processing impairments (Young et al. 1996; ment of fear. Cognitive Neuropsychology 13(5): 699–745. Scott et al. 1997). Fried, I., K. A. MacDonald, and C. L. Wilson. (1997). 
Single neu- Functional neuroimaging data provide additional support ron activity in human hippocampus and amygdala during rec- for the notion that the amygdala is preferentially involved in ognition of faces and objects. Neuron 18: 753–765. Hamann, S. B., L. Stefanacci, L. R. Squire, R. Adolphs, D. Tranel, Gallagher, M., and A. A. Chiba. (1996). The amygdala and emo- H. Damasio, and A. Damasio. (1996). Recognizing facial emo- tion. Curr. Opin. Neurobio 6: 221–227. tion. Nature 379: 497. Kling, A. S., and L. A. Brothers. (1992). The amygdala and social Iwai, E., M. Yukie, J. Watanabe, K. Hikosaka, H. Suyama, and S. behavior. In J. Aggleton, Ed., The Amygdala: Neurobiological Ishikawa. (1990). A role of amygdala in visual perception and Aspects of Emotion, Memory, and Mental Dysfunction. New cognition in macaque monkeys (Macaca fuscata and Macaca York: Wiley-Liss, pp. 353–377. mulatta). Tohoku J. Exp. Med 161: 95–120. Analogy Klüver, H., and P. C. Bucy. (1938). An analysis of certain effects of bilateral temporal lobectomy in the rhesus monkey, with special reference to “psychic blindness.” J. Psych 5: 33–54. Analogy is (1) similarity in which the same relations hold Leonard, C. M., E. T. Rolls, F. A. W. Wilson, and G. C. Baylis. between different domains or systems; (2) inference that if (1985). Neurons in the amygdala of the monkey with responses selective for faces. Behavioral Brain Research 15: 159–176. two things agree in certain respects then they probably agree Malkova, L., D. Gaffan, and E. A. Murray. (1997). Excitotoxic in others. These two senses are related, as discussed below. lesions of the amygdala fail to produce impairment in visual Analogy is important in cognitive science for several learning for auditory secondary reinforcement but interfere with reasons. It is central in the study of LEARNING and discov- reinforcer devaluation effects in rhesus monkeys. J. Neuro- ery.
Analogies permit transfer across different CONCEPTS, science 17: 6011–6020. situations, or domains and are used to explain new topics. McGaugh, J. L., L. Cahill, M. B. Parent, M. H. Mesches, K. Once learned, they can serve as MENTAL MODELS for Coleman-Mesches, and J. A. Salinas. (1995). Involvement of understanding a new domain (Halford 1993). For exam- the amygdala in the regulation of memory storage. In J. L. ple, people often use analogies with water flow when rea- McGaugh, F. Bermudez-Rattoni, and R. A. Prado Alcala, soning about electricity (Gentner and Gentner 1983). Eds., Plasticity in the Central Nervous System—Learning and Memory. Hillsdale, NJ: Erlbaum. Analogies are often used in PROBLEM SOLVING and induc- Morris, J. S., C. D. Frith, D. I. Perrett, D. Rowland, A. W. Young, tive reasoning because they can capture significant paral- A. J. Calder, and R. J. Dolan. (1996). A differential neural lels across different situations. Beyond these mundane response in the human amygdala to fearful and happy facial uses, analogy is a key mechanism in CREATIVITY and sci- expressions. Nature 812–815. entific discovery. For example, Johannes Kepler used an Murray, E. A., E. A. Gaffan, and R. W. Flint, Jr. (1996). Anterior analogy with light to hypothesize that the planets are rhinal cortex and the amygdala: dissociation of their contribu- moved by an invisible force from the sun. In studies of tions to memory and food preference in rhesus monkeys. microbiology laboratories, Dunbar (1995) found that Behav. Neurosci 110: 30–42. analogies are both frequent and important in the discovery Scott, S. K., A. W. Young, A. J. Calder, D. J. Hellawell, M. P. process. Aggleton, and M. Johnson. (1997). Impaired auditory recogni- tion of fear and anger following bilateral amygdala lesions. Analogy is also used in communication and persuasion. Nature 385: 254–257. For example, President Bush analogized the Persian Gulf Stefanacci, L., W. A. Suzuki, and D. G. Amaral. (1996).
Organiza- crisis to the events preceding World War II, comparing Saddam Hussein to Hitler (Spellman and Holyoak 1992). The tion of connections between the amygdaloid complex and the invited inference was that the United States should defend perirhinal and parahippocampal cortices: an anterograde and Kuwait and Saudi Arabia against Iraq, just as the Allies retrograde tracing study in the monkey. J. Comp. Neurol 375: defended Europe against Nazi Germany. On a larger scale, 552–582. conceptual metaphors such as “weighing the evidence” and Terzian, H., and G. Dalle Ore. (1955). Syndrome of Kluver and “balancing the pros and cons” can be viewed as large-scale Bucy, reproduced in man by bilateral removal of the temporal conventionalized analogies (see COGNITIVE LINGUISTICS). lobes. Neurology 5: 373–380. Weiskrantz, L. (1956). Behavioral changes associated with abla- tion of the amygdaloid complex in monkeys. J. Comp. Physio. Finally, analogy and its relative, SIMILARITY, are important Psych 49: 381–391. because they participate in many other cognitive processes. Young, A. W., D. J. Hellawell, C. Van de Wal, and M. Johnson. For example, exemplar-based theories of conceptual struc- (1996). Facial expression processing after amygdalotomy. Neu- ture and CASE-BASED REASONING models in artificial intelli- ropsychologia 34(1): 31–39. gence assume that much of human categorization and Zola-Morgan, S., L. Squire, P. Alvarez-Royo, and R. P. Clower. reasoning is based on analogies between the current situa- (1991). Independence of memory functions and emotional tion and prior situations (cf. JUDGMENT HEURISTICS). behavior: separate contributions of the hippocampal formation The central focus of analogy research is on the mapping and the amygdala. Hippocampus 1: 207–220. process by which people understand one situation in terms Zola-Morgan, S., L. R. Squire, D. G. Amaral, and W. A. Suzuki. of another. Current accounts distinguish the following sub- (1989).
Lesions of perirhinal and parahippocampal cortex that spare the amygdala and hippocampal formation produce severe processes: mapping, that is, aligning the representational memory impairment. J. Neurosci 9: 4355–4370. structures of the two cases and projecting inferences; and evaluation of the analogy and its inferences. These first Further Readings two are signature phenomena of analogy. Two further pro- cesses that can occur are adaptation or rerepresentation of Amaral, D. G., J. L. Price, A. Pitkanen, and S. T. Carmichael. one or both analogs to improve the match and abstraction (1992). Anatomical organization of the primate amygdaloid of the structure common to both analogs. We first discuss complex. In J. Aggleton, Ed., The Amygdala: Neurobiological these core processes, roughly in the order in which they Aspects of Emotion, Memory, and Mental Dysfunction. New occur during normal processing. Then we will take up the York: Wiley-Liss, pp. 1–66. simulation occurs fall into two classes: projection-first issue of analogical retrieval, the processes by which peo- models, in which the schema is derived from the base and ple are spontaneously reminded of past similar or analo- mapped to the target; and alignment-first models, in which gous examples from long-term memory. the abstract schema is assumed to arise out of the analogical In analogical mapping, a familiar situation—the base or mapping process. Most current cognitive simulations take source analog—is used as a model for making inferences the latter approach. For example, the structure-mapping about an unfamiliar situation—the target analog.
Accord- engine (SME) of Falkenhainer, Forbus, and Gentner (1989), ing to Gentner’s structure-mapping theory (1983), the when given two potential analogs, proceeds at first rather mapping process includes a structural alignment between blindly, finding all possible local matches between elements two represented situations and the projection of inferences of the base and target. Next it combines these into structur- from one to the other. The alignment must be structurally ally consistent kernels, and finally it combines the kernels consistent, that is, there must be a one-to-one correspon- into the two or three largest and deepest matches of con- dence between the mapped elements in the base and target, nected systems, which represent possible interpretations of and the arguments of corresponding predicates must also the analogy. Based on this alignment, it projects candidate correspond (parallel connectivity). Given this alignment, inferences—by hypothesizing that other propositions con- candidate inferences are drawn from the base to the target nected to the common system in the base may also hold in via a kind of structural completion. A further assumption the target. The analogical constraint-mapping engine is the systematicity principle: a system of relations con- (ACME) of Holyoak and Thagard (1989) uses a similar nected by higher-order constraining relations such as local-to-global algorithm, but differs in that it is a multicon- causal relations is more salient in analogy than an equal straint, winner-take-all connectionist system, with soft con- number of independent matches. Systematicity links the straints of structural consistency, semantic similarity, and two classic senses of analogy, for if analogical similarity is pragmatic bindings. 
Although the multiconstraint system modeled as common relational structure, then a base permits a highly flexible mapping process, it often arrives at domain that possesses a richly linked system of connected structurally inconsistent mappings, whose candidate infer- relations will yield candidate inferences by completing the ences are indeterminate. Markman (1997) found that this connected structure in the target (Bowdle and Gentner kind of indeterminacy was rarely experienced by people 1997). solving analogies. Other variants of the local-to-global Another important psychological approach to analogical algorithm are Hofstadter and Mitchell’s Copycat system mapping is offered by Holyoak (1985), who emphasized the (1994) for perceptual analogies and Keane’s incremental role of pragmatics in problem solving by analogy—how analogy machine (IAM; 1990), which adds matches incre- current goals and context guide the interpretation of an anal- mentally in order to model effects of processing order. In ogy. Holyoak defined analogy as similarity with respect to a contrast to alignment-first models, in which inferences are goal, and suggested that mapping processes are oriented made after the two representations are aligned, projection- toward attainment of goal states. Holyoak and Thagard first models find or derive an abstraction in the base and (1989) combined this pragmatic focus with the assumption then project it to the target (e.g., Greiner 1988). Although of structural consistency and developed a multiconstraint alignment-first models are more suitable for modeling the approach to analogy in which similarity, structural parallel- generation of new abstractions, projection-first models may ism, and pragmatic factors interact to produce an interpreta- be apt for modeling conventional analogy and metaphor. tion. Finally, analogy has proved challenging to subsymbolic Through rerepresentation or adaptation, the representa- connectionist approaches. 
A strong case can be made that tion of one or both analogs is altered to improve the match. analogical processing requires structured representations Although central to conceptual change, this aspect of anal- and structure-sensitive processing algorithms. An interest- ogy remains relatively unexplored. And through schema ing recent “symbolic connectionist” model, Hummel and abstraction, which retains the common system representing Holyoak’s LISA (1997), combines such structured symbolic the interpretation of an analogy for later use, analogy can techniques with distributed concept representations. promote the formation of new relational categories and Thus far, our focus has been on how analogy is pro- abstract rules. cessed once it is present. But to model the use of analogy Evaluation is the process by which we judge the accept- and similarity in real-life learning and reasoning we must ability of an analogy. At least three criteria seem to be also understand how people think of analogies; that is, involved: structural soundness—whether the alignment and how they retrieve potential analogs from long-term mem- the projected inferences are structurally consistent; factual ory. There is considerable evidence that similarity-based validity of the candidate inferences—because analogy is not retrieval is driven more by surface similarity and less by a deductive mechanism, this is not guaranteed and must be structural similarity than is the mapping process. For checked separately; and finally, in problem-solving situa- example, Gick and Holyoak (1980; 1983) showed that tions, goal-relevance—the reasoner must ask whether the people often fail to access potentially useful analogs. Peo- analogical inferences are also relevant to current goals. 
A ple who saw an analogous story prior to being given a very lively arena of current research centers on exactly how and difficult thought problem were three times as likely to when these criteria are invoked in the analogical mapping solve the problem as those who did not (30 percent vs. 10 process. percent). Impressive as this is, the majority of subjects As discussed above, processing an analogy typically nonetheless failed to benefit from the analogy. However, results in a common schema. Accounts of how cognitive Analogy 19 when the nonsolvers were given the hint to think back to References the prior story, the solution rate again tripled, to about 80– Bassok, M., L. Wu, and K. L. Olseth. (1995). Judging a book by its 90 percent. Because no new information was given about cover: Interpretative effects of content on problem solving the story, we can infer that subjects had retained its mean- transfer. Memory and Cognition 23: 354–367. ing, but failed to think of it when reading the problem. The Bowdle, B., and D. Gentner. (1997). Informativity and asymmetry similarity match between the story and the problem, in comparisons. Cognitive Psychology 34: 244–286. though sufficient to carry out the mapping once both ana- Dunbar, K. (1995). How scientists really reason: scientific reason- logs were present in working memory, did not lead to ing in real-world laboratories. In R. J. Sternberg and J. E. spontaneous retrieval. This is an example of the inert Davidson, Eds., The Nature of Insight. Cambridge, MA: MIT knowledge problem in transfer, a central concern in EDU- Press, pp. 365–395. Falkenhainer, B., K. D. Forbus, and D. Gentner. (1989). The CATION. structure-mapping engine: An algorithm and examples. Artifi- Not only do people fail to retrieve analogies, but they are cial Intelligence 41: 1–63. often reminded of prior surface-similar cases, even when Forbus, K. D., D. Gentner, and K. Law. (1995). 
MAC/FAC: A they know that these matches are of little use in reasoning model of similarity-based retrieval. Cognitive Science 19: 141– (Gentner, Rattermann, and Forbus 1993). This relative lack 205. of spontaneous analogical transfer and predominance of sur- Gentner, D. (1983). Structure-mapping: A theoretical framework face remindings is seen in problem solving (Ross 1987) and for analogy. Cognitive Science 7: 155–170. may result in part from overly concrete representations Gentner, D., and D. R. Gentner. (1983). Flowing waters or teeming (Bassok, Wu, and Olseth 1995). crowds: Mental models of electricity. In D. Gentner and A. L. Computational models of similarity-based retrieval have Stevens, Eds., Mental Models. Hillsdale, NJ: Erlbaum, pp. 99– 129. taken two main approaches. One class of models aims to Gentner, D., and A. B. Markman. (1997). Structure-mapping in capture the phenomena of human memory retrieval, includ- analogy and similarity. American Psychologist 52: 45–56. ing both strengths and weaknesses. For example, analog Gentner, D., M. J. Rattermann, and K. D. Forbus. (1993). The roles retrieval by constraint satisfaction (ARCS; Thagard et al. of similarity in transfer: Separating retrievability from inferen- 1990) and Many are called/but few are chosen (MAC/FAC; tial soundness. Cognitive Psychology 25: 524–575. Forbus, Gentner, and Law 1995) both assume that retrieval Gick, M. L., and K. J. Holyoak. (1980). Analogical problem solv- is strongly influenced by surface similarity and by structural ing. Cognitive Psychology 12: 306–355. similarity, goal relevance, or both. In contrast, most case- Gick, M. L., and K. J. Holyoak. (1983). Schema induction and ana- based reasoning (CBR) models aim for optimality, focusing logical transfer. Cognitive Psychology 15: 1–38. on how to organize memory such that relevant cases are Greiner, R. (1988). Learning by understanding analogies. Artificial Intelligence 35: 81–125. retrieved when needed. Halford, G. S. (1993). 
Children’s Understanding: The Develop- Theories of analogy have been extended to other kinds of ment of Mental Models. Hillsdale, NJ: Erlbaum. similarity, such as METAPHOR and mundane literal similar- Hesse, M. B. (1966). Models and Analogies in Science. Notre ity. There is evidence that computing a literal similarity Dame, IN. University of Notre Dame Press. match involves the same process of structural alignment as Hofstadter, D. R., and M. Mitchell. (1994). The Copycat project: A does analogy (Gentner and Markman 1997). Current com- model of mental fluidity and analogy-making. In K. J. Holyoak putational models like ACME and SME use the same pro- and J. A. Barnden, Eds., Advances in Connectionist and Neural cessing algorithms for similarity as for analogy. Computation Theory, vol. 2, Analogical Connections. Nor- The investigation of analogy has been characterized by wood, NJ: Ablex, pp. 31–112. unusually fruitful interdisciplinary convergence. Important Holyoak, K. J. (1985). The pragmatics of analogical transfer. In G. H. Bower, Ed., The Psychology of Learning and Motivation, contributions have come from philosophy, notably Hesse’s vol. 19. New York: Academic Press, pp. 59–87. analysis (1966) of analogical models in science, and from Holyoak, K. J., and P. R. Thagard. (1989). Analogical mapping by artificial intelligence (AI), beginning with Winston’s constraint satisfaction. Cognitive Science 13: 295–355. research (1982), which laid out computational strategies Hummel, J. E., and K. J. Holyoak. (1997). Distributed representa- applicable to human processing. Recent research that com- tions of structure: A theory of analogical access and mapping. bines psychological investigations and computational mod- Psychological Review 104: 427–466. eling has advanced our knowledge of how people align Keane, M. T. (1990). Incremental analogising: Theory and model. representational structures and compute further inferences In K. J. Gilhooly, M. T. G. Keane, R. H. 
Logie, and G. Erdos, over them. Theories of analogy and structural similarity Eds., Lines of Thinking, vol. 1. Chichester, England: Wiley. have been successfully applied to areas such as CATEGORI- Markman, A. B. (1997). Constraints on analogical inference. Cog- nitive Science 21(4): 373–418. ZATION, DECISION MAKING, and children’s learning. At the Ross, B. H. (1987). This is like that: The use of earlier problems same time, cross-species comparisons have suggested that and the separation of similarity effects. Journal of Experimental analogy may be especially well developed in human beings. Psychology: Learning, Memory, and Cognition 13: 629–639. These results have broadened our view of the role of struc- Spellman, B. A., and K. J. Holyoak. (1992). If Saddam is Hitler tural similarity in human thought. then who is George Bush? Analogical mapping between sys- See also CONSTRAINT SATISFACTION; FIGURATIVE LAN- tems of social roles. Journal of Personality and Social Psychol- GUAGE; LANGUAGE AND COMMUNICATION; METAPHOR AND ogy 62: 913–933. CULTURE; SCHEMATA Thagard, P., K. J. Holyoak, G. Nelson, and D. Gochfeld. (1990). Analog retrieval by constraint satisfaction. Artificial Intelli- gence 46: 259–310. —Dedre Gentner 20 Anaphora further serve as antecedents of anaphoric expressions (Heim Winston, P. H. (1982). Learning new principles from precedents and exercises. Artificial Intelligence 19: 321–350. 1982; McCawley 1979; Prince 1981). Suppose (1b) is uttered in the context of (1a). We have stored an entry for Further Readings Lucie, and when the pronoun she is encountered, it can be assigned this value. In theory-neutral terms, this assignment Gentner, D., and A. B. Markman. (1995). Analogy-based reason- is represented in (3b), where Lucie is a discourse entry, and ing in connectionism. In M. A. Arbib, Ed., The Handbook of the pronoun is covalued with this entry. Brain Theory and Neural Networks. 
Cambridge, MA: MIT The actual resolution of anaphora is governed by dis- Press, pp. 91–93. course strategies. Ariel (1990) argues that pronouns look for Gentner, D., and J. Medina. (1998). Similarity and the develop- the most accessible antecedent, and discourse topics are ment of rules. Cognition 65: 263–297. Goswami, U. (1982). Analogical Reasoning in Children. Hillsdale, always the most accessible. For example, (3b) is the most NJ: Erlbaum. likely anaphora resolution for (1b) in the context of (1a), Holyoak, K. J., and P. R. Thagard. (1995). Mental Leaps: Analogy since Lucie is the discourse topic that will make this mini- in Creative Thought. Cambridge, MA: MIT Press. mal context coherent. Keane, M. T. (1988). Analogical Problem Solving. Chichester, Given the two procedures, it turns out that if Lili is iden- England: Ellis Horwood, and New York: Wiley. tified as the antecedent of the pronoun in (1b), the sentence Kolodner, J. L. (1993). Case-Based Reasoning. San Mateo, CA: has, in fact, two anaphora construals. Since Lili is also in the Kaufmann. discourse storage, (1b) can have, along with (3a), the coval- Medin, D. L., R. L. Goldstone, and D. Gentner. (1993). Respects uation construal (4). for similarity. Psychological Review 100: 254–278. Nersessian, N. J. (1992). How do scientists think? Capturing the (4) Lili (λx (x thinks z has got the flu) & z = Lili) dynamics of conceptual change in science. In R. N. Giere, and H. Feigl, Eds., Minnesota Studies in the Philosophy of Science. (5) Lili thinks she has got the flu, and Max does too. Minneapolis: University of Minnesota Press, pp. 3–44. Reeves, L. M., and R. W. Weisberg. (1994). The role of content Though (3a) and (4) are equivalent, it was discovered in the and abstract information in analogical transfer. Psychological 1970s that there are contexts in which these sentences dis- Bulletin 115: 381–400. play a real representational ambiguity (Keenan 1971). For Schank, R. C., A. Kass, and C. K. Riesbeck, Eds. 
(1994). Inside example, assuming that she is Lili, the elliptic second con- Case-Based Explanation. Hillsdale, NJ: Erlbaum. junct of (5) can mean either that Max thinks that Lili has the flu, or that Max himself has it. The first is obtained if the elided predicate is construed as in (4), and the second if it is Anaphora the predicate of (3a). Let us adopt here the technical definitions in (6). ((6a) The term anaphora is used most commonly in theoretical differs from the definition used in the syntactic binding the- linguistics to denote any case where two nominal expres- ory). In (3a), then, Lucie binds the pronoun; in (4), they are sions are assigned the same referential value or range. Dis- covalued. cussion here focuses on noun phrase (NP) anaphora with (6) a. Binding: α binds β iff α is an argument of a λ-predi- pronouns (see BINDING THEORY for an explanation of the cate whose operator binds β. types of expressions commonly designated “anaphors,” e.g., b. Coevaluation: α and β are covalued iff neither binds reflexive pronouns). the other and they are assigned the same value. Pronouns are commonly viewed as variables. Thus, (1b) corresponds to (2), where the predicate contains a free vari- Covaluation is not restricted to referential discourse-entities— able. This means that until the pronoun is assigned a value, a pronoun can be covalued also with a bound variable. Indeed, the predicate is an open property (does not form a set). Heim (1998) showed that covaluation-binding ambiguity can There are two distinct procedures for pronoun resolution: show up also in quantified contexts. In (7a), the variable x binding and covaluation. In binding, the variable gets bound (she) binds the pronoun her. But in (7b) her is covalued with x. by the λ-operator, as in (3a), where the predicate is closed, (7) Every wife thinks that only she respects her husband. denoting the set of individuals who think they have the flu, a. 
Binding: Every wife (λx (x thinks that [only x (λy(y and where the sentence asserts that Lili is in this set. respects y’s husband))])) (1) a. Lucie didn’t show up today. b. Covaluation: Every wife (λx (x thinks that [only x b. Lili thinks she’s got the flu. (λy(y respects x’s husband))])) (2) Lili (λx (x thinks z has got the flu)) In many contexts the two construals will be equivalent, but the presence of only enables their disambiguation here: (7a) (3) a. Binding: Lili (λx (x thinks x has got the flu)) entails that every wife thinks that other wives do not respect b. Covaluation: Lili (λx (x thinks z has got the flu) & their husbands, while (7b) entails that every wife thinks z = Lucie) other wives do not respect her husband. This is so, because In covaluation, the free variable is assigned a value from the the property attributed only to x in (7a) is respecting one’s DISCOURSE storage, as in (3b). An assumption standard own husband, while in (7b) it is respecting x’s husband. since the 1980s is that, while processing sentences in con- The binding interpretation of pronouns is restricted by text, we build an inventory of discourse entities, which can syntactic properties of the derivation (see BINDING THEORY). Anaphora 21 A question that has been debated is whether there are also tactic configuration allows, in principle, variable binding, syntactic restrictions on their covaluation interpretation. On obtaining an equivalent anaphora-interpretation through the factual side, under certain syntactic configurations, covaluation is excluded. Given a structure like (9), variable covaluation is not allowed. For example, in (9), binding is binding could be derived, with a different placement of independently excluded. The NP Lucie is not in a configura- Lucie and her, as in Lucie said we should invite her. The tion to bind the pronoun (since it is not the argument of a λ- result would be equivalent to the covaluation construal (10) predicate containing the pronoun). 
Suppose, however, that (for (9)). Hence, (10) is excluded. In (11a), no placement of (9) is uttered in the context of (8), so that Lucie is in the dis- he and Max could enable variable binding, so the covalua- course storage. The question is what prevents the covalua- tion in (11b) is the only option for anaphora. When a vari- tion construal in (10) for (9) (# marks an excluded able binding alternative exists, but it is not equivalent to interpretation). It cannot be just the fact that the pronoun covaluation, covaluation is permitted, as in (12)–(13). precedes the antecedent. For example, in (11), the preceding A relevant question is why variable binding is more effi- pronoun can be covalued with Max. cient than covaluation. One answer, developed in Levinson (1987), is purely pragmatic and derives this from the (8) Can we go to the bar without Lucie? Gricean maxims of quantity and manner. The other, devel- oped in Fox (1998), is based on the notion of semantic pro- (9) She said we should invite Lucie. cessing: variable binding is less costly since it enables (10) #She (λx x said we should invite Lucie) & she = immediate closure of open properties, while covaluation Lucie) requires that the property is stored open until we find an (11) a. The woman next to him kissed Max. antecedent for the variable. b. The woman next to him (λx (x kissed Max) & The optimality account for the covaluation restriction him = Max) entails a much greater computational complexity than the syntactic approach (condition C), since it requires construct- In the 1970s, it was assumed that there is a syntactic restric- ing and comparing two interpretations for one derivation. tion blocking such an interpretation (Langacker 1966; Lasnik This is among the reasons why covaluation is still a matter 1976). Reinhart (1976) formulated it as the requirement that of theoretical debate. 
Nevertheless, evidence that such com- a pronoun cannot be covalued with a full NP it c-commands, plexity is indeed involved in computing sentences like (10) which became known as Chomsky’s “condition C” (1981). comes from the acquisition of anaphora. Many studies (e.g., (In (11), the pronoun does not c-command Max.) Another Wexler and Chien 1991) report that children have much formulation in logical syntax terms was proposed by Keenan greater difficulties in ruling out illicit covaluation than in (1974): The reference of an argument must be determinable violations of the syntactic restrictions on variable binding. independently of its predicate. Grodzinsky and Reinhart (1993) argue that this is because The empirical problem with these restrictions is that, as their working memory is not yet sufficiently developed to shown in Evans (1980), there are systematic contexts in carry such complex computation. which they can be violated. Reinhart (1983) argued that this See also PRAGMATICS; SEMANTICS; SENTENCE PROCESS- is possible whenever covaluation is not equivalent to binding. ING; SYNTAX-SEMANTICS INTERFACE (12) [Who is the man with the gray hat?] He is Ralph —Tanya Reinhart Smith. a. He (λx (x is Ralph Smith) & he = Ralph Smith) References b. He (λx (x is x) & he = Ralph Smith) Ariel, M. (1990). Accessing Noun Phrase Antecedents. London (13) Only he (himself) still thinks that Max is a genius. and New York: Routledge. a. Only he (λx (x thinks Max is a genius) & Chomsky, N. (1981). Lectures on Government and Binding. Dor- he = Max) drecht: Foris. b. Only Max (λx (x thinks x is a genius) Evans, G. (1980). Pronouns. Linguistic Inquiry 11: 337–362. Fox, D. (1998). Locality in variable binding. In P. Barbosa et al., In (12), it is not easy to imagine a construal of the truth con- Eds., Is the Best Good Enough? Cambridge, MA: MIT Press ditions that would not include covaluation of the pronoun and MITWPL. with Ralph Smith. 
But this covaluation violates condition C, Grodzinsky, Y., and T. Reinhart. (1993). The innateness of binding as does (13). In both cases, however, the covaluation reading and coreference. Linguistic Inquiry. (a) is clearly distinct from the bound reading (b). (12b) is a Heim, I. (1982). File change semantics and the familiarity theory tautology, whereas (12a) is not. (13a) attributes a different of definiteness. In R. Bauerle et al., Eds., Meaning, Use and the Interpretation of Language. Berlin and New York: de Gruyter, property only to Max from what (13b) does. Believing one- pp. 164–189. self to be a genius may be true of many people, but what (13) Heim, I. (1998). Anaphora and semantic interpretation: A reinter- attributes only to Max is believing Max to be a genius (13a). pretation of Reinhart’s approach. In U. Sauerland and O. Per- The alternative (proposed by Reinhart 1983) is that cus, Eds., The Interpretative Tract, MIT Working Papers in covaluation is not governed by syntax, but by a discourse Linguistics, vol. 25. Cambridge, MA: MITWPL. strategy that takes into account the options open for the syn- Keenan, E. (1971). Names, quantifiers and a solution to the sloppy tax in generating the given derivation. The underlying identity problem. Papers in Linguistics 4(2). assumption is that variable binding is a more efficient way Keenan, E. (1974). The functional principle: Generalizing the to obtain anaphora than covaluation. So whenever the syn- notion “subject of.” In M. LaGaly, R. Fox, and A. Bruck, Eds., 22 Animal Cognition places a premium on recognizing honest signals. Zahavi Papers from the Tenth Regional Meeting of the Chicago Lin- guistic Society. Chicago: Chicago Linguistic Society. (1975) suggested a mechanism for this, using the following Langacker, R. (1966). On pronominalization and the chain of com- recipe: signals are honest, if and only if they are costly to pro- mand. In W. Reibel and S. 
Schane, Eds., Modern Studies in duce relative to the signaler’s current condition and if the English. Englewood Cliffs, NJ: Prentice Hall. capacity to produce honest signals is heritable. Consider the Lasnik, H. (1976). Remarks on coreference. Linguistic Analysis anti predator stotting displays of ungulates—an energetically 2(1). expensive rigid-legged leap. In Thompson’s gazelle, only Levinson, S. C. (1987). Pragmatics and the grammar of anaphora. males in good physical condition stot, and stotting males are Journal of Linguistics 23: 379–434. more likely to escape cheetah attacks than those who do not. McCawley, J. (1979). Presuppositions and discourse structure. In Departing slightly from Zahavi, behavioral ecologists C. K. Oh and D. A. Dinneen, Eds., Presuppositions. Syntax and Krebs and Dawkins (1984) proposed that signals are Semantics, vol. 11. New York: Academic Press. Prince, E. (1981). Towards a taxonomy of given-new information. designed not to inform but to manipulate. In response to In P. Cole, Ed., Radical Pragmatics. New York: Academic such manipulation, selection favors skeptical receivers Press, pp. 233–255. determined to discriminate truths from falsehoods. Such Reinhart, T. (1976). The Syntactic Domain of Anaphora. Ph.D. manipulative signaling evolves in situations of resource diss., MIT. competition, including access to mates, parental care, and Reinhart, T. (1983). Anaphora and Semantic Interpretation. limited food supplies. In cases where sender and receiver Croom-Helm and Chicago University Press. must cooperate to achieve a common goal, however, selec- Wexler, K., and Y. C. Chien. (1991). Children’s knowledge of tion favors signals that facilitate the flow of information locality conditions on binding as evidence for the modularity of among cooperators. Thus signals designed to manipulate syntax and pragmatics. Language Acquisition 1: 225–295. 
tend to be loud and costly to produce (yelling, crying with tears), whereas signals designed for cooperation tend to be Animal Cognition quiet, subtle, and cheap (whispers). Turning to ecological constraints, early workers sug- gested that signal structure was conventional and arbitrary. See ANIMAL NAVIGATION; COMPARATIVE PSYCHOLOGY; PRI- More in-depth analyses, however, revealed that the physical MATE COGNITION; SOCIAL PLAY BEHAVIOR structure of many signals is closely related to the functions served (Green and Marler 1979; Marler 1955). Thus, several Animal Communication avian and mammalian species use calls for mobbing preda- tors that are loud, short, repetitive, and broad band. Such sounds attract attention and facilitate sound localization. In Fireflies flash, moths spray pheromones, bees dance, fish contrast, alarm calls used to warn companions of an emit electric pulses, lizards drop dewlaps, frogs croak, birds approaching hawk are soft, high-pitched whistles, covering sing, bats chirp, lions roar, monkeys grunt, apes grimace, a narrow frequency range, only audible at close range and and humans speak. These systems of communication, irre- hard to locate (Marler 1955; Klump and Shalter 1984). The spective of sensory modality, are designed to mediate a flow species-typical environment places additional constraints on of information between sender and receiver (Hauser 1996). the detectability of signals and the efficiency of transmis- Early ethologists argued that signals are designed to sion in long-distance communication, selecting for the opti- proffer information to receptive companions, usually of mal time of day and sound frequency window (Marten, their own species (Tinbergen 1951; Hinde 1981; Smith Quine, and Marler 1977; Morton 1975; Wiley and Richards 1969). When a bird or a monkey gives a “hawk call,” for 1978). To coordinate the movements of groups who are out example, this conveys information about a kind of danger. 
of sight, elephants and whales use very low frequency And when a redwing blackbird reveals its red epaulette dur- sounds that circumvent obstacles and carry over long dis- ing territorial disputes, it is conveying information about tances. In contrast, sounds with high frequency and short aggressive intent. Analyses of aggressive interactions, how- wavelengths, such as some alarm calls and the biosonar sig- ever, revealed only weak correlations between performance nals used by bats and dolphins for obstacle avoidance and of certain displays and the probability of attack as opposed prey capture, attenuate rapidly. to retreat, leaving the outcome relatively unpredictable The design of some signals reflects a conflict between nat- (Caryl 1979). Thus, while information transfer is basic to all ural and sexual selection pressures (Endler 1993). An elegant communication, it is unclear how best to characterize the example is the advertisement call of the male Tungara frog information exchange, particularly because animals do not (Ryan and Rand 1993). In its most complete form, one or always tell the truth. more introductory whines are followed by chucks. Because In contradistinction to the ethologists, a new breed of ani- females are attracted to the chucks, males who produce these mal behaviorist—the behavioral ecologists—proposed an sounds have higher mating success. But because frog-eating alternative approach based on an economic cost-benefit anal- bats can localize chucks more readily than whines, frogs pro- ysis. The general argument was made in two moves: (1) ducing chucks are more likely to be eaten. They compromise selection favors behavioral adaptations that maximize gene by giving more whines than chucks until a female comes by. 
propagation; and (2) information exchange cannot be the There are many such cases in which signal design is closely entire function of communication because it would be easy related to function, reflecting a tightly stitched tapestry of fac- for a mutant strategy to invade by providing dishonest infor- tors that include the sender’s production capabilities, habitat mation about the probability of subsequent actions. This Animal Communication 23 ence and nature of a social audience (e.g., allies or ene- structure, climate, time of day, competitors for signal space, mies). Vocalizing animals will, for example, withhold alarm the spatiotemporal distribution of intended recipients, and the calling in response to a predator if no audience is present, pressures of predation and mate choice. and in other cases, will use vocalizations to actively falsify When signals are produced or perceived, complex pro- information (Cheney and Seyfarth 1990; Evans and Marler cessing by the sense organs and the central nervous system 1991; Marler, Karakashian, and Gyger 1991; reviewed in is engaged. Songbirds have a set of interconnected forebrain Hauser 1996). While there is no evidence that animal sig- nuclei specialized for song learning and production. Nuclei nals are guided by awareness of beliefs, desires, and inten- vary widely in size between the sexes, between species and tions, essential to human linguistic behavior (see PRIMATE even between individuals of the same sex, though there are significant exceptions to these generalizations. Variations COGNITION), there is a clear need for researchers in call appear to correlate, not only with the commitment to sing- semantics and cognition to work closely together to eluci- ing behavior, but also with the size of the song repertoire date the mental states of animals while communicating. (Arnold 1992; Nottebohm 1989). 
Some aspects of song learning are analogous to those documented for human speech, including involvement of particular brain areas, local dialects, categorical perception, innate learning preferences, and a motor theory-like system for coordinating articulatory production and feature perception (Nelson and Marler 1989).

For most animals, the acoustic morphology of the vocal repertoire appears to be innately specified, with experience playing little to no role in altering call structure during development. In contrast, the ontogeny of call usage and comprehension is strongly influenced by experience in several nonhuman primates, and possibly in some birds (Cheney and Seyfarth 1990; Hauser 1996; Marler 1991), with benefits accruing to individuals that can learn to use call types and subtypes in new ways. Generally speaking, however, the number of discrete signals in animal repertoires seems to be limited (Green and Marler 1979; Moynihan 1970), although reliable repertoire estimates are hard to come by, especially when signals intergrade extensively.

Explosive expansion of the repertoire becomes possible if elements of the repertoire can be recombined into new, meaningful utterances, as they are in human speech. Empirical studies documenting the decomposability of speech into smaller units, themselves meaningless, prepared the groundwork for the Chomskyan revolution in linguistics. The human brain takes our repertoire of phonemes and recombines them into an infinite variety of utterances with distinct meanings. There is no known case of animals using this combinatorial mechanism. Some birds create large learned song repertoires by recombination, but like human music, birdsongs are primarily affective signals, lacking the kind of referential meaning that has been attributed to primate vocalizations and chicken calls. Thus it appears that the songbirds’ repertoire expansion serves more to alleviate habituation than to enrich meaning. The same is almost certainly true of animals with innate repertoires that engage in a more limited degree of recombination, although some researchers have reported evidence of syntactical organization (Hailman et al. 1987). More detailed analyses of the production and perception of vocal signals are required before we can reach any comprehensive conclusions on the developmental plasticity of animal communication systems.

At least one bird (the domestic chicken) and a few primates (ring-tailed lemurs, rhesus and diana monkeys, vervets) produce vocalizations that are functionally referential, telling others about specific objects and events (food, predators). Use of such calls is often contingent on the pres-

See also COMPARATIVE PSYCHOLOGY; DISTINCTIVE FEATURES; ETHOLOGY; LANGUAGE AND COMMUNICATION; PHONOLOGY; PRIMATE LANGUAGE; SOCIAL COGNITION IN ANIMALS; SOCIAL PLAY BEHAVIOR

— Marc Hauser and Peter Marler

References

Arnold, A. P. (1992). Developmental plasticity in neural circuits controlling birdsong: Sexual differentiation and the neural basis of learning. Journal of Neurobiology 23: 1506–1528.
Caryl, P. G. (1979). Communication by agonistic displays: What can games theory contribute to ethology? Behaviour 68: 136–169.
Cheney, D. L., and R. M. Seyfarth. (1990). How Monkeys See the World: Inside the Mind of Another Species. Chicago: Chicago University Press.
Endler, J. (1993). Some general comments on the evolution and design of animal communication systems. Proceedings of the Royal Society, London 340: 215–225.
Evans, C. S., and P. Marler. (1991). On the use of video images as social stimuli in birds: Audience effects on alarm calling. Animal Behaviour 41: 17–26.
Green, S., and P. Marler. (1979). The analysis of animal communication. In P. Marler and J. Vandenbergh, Eds., Handbook of Behavioral Neurobiology, vol. 3, Social Behavior and Communication. New York: Plenum Press, pp. 73–158.
Hailman, J. P., M. S. Ficken, and R. W. Ficken. (1987). Constraints on the structure of combinatorial “chick-a-dee” calls. Ethology 75: 62–80.
Hauser, M. D. (1996). The Evolution of Communication. Cambridge, MA: MIT Press.
Hinde, R. A. (1981). Animal signals: Ethological and games-theory approaches are not incompatible. Animal Behaviour 29: 535–542.
Klump, G. M., and M. D. Shalter. (1984). Acoustic behaviour of birds and mammals in the predator context: 1. Factors affecting the structure of alarm signals. 2. The functional significance and evolution of alarm signals. Zeitschrift für Tierpsychologie 66: 189–226.
Krebs, J. R., and R. Dawkins. (1984). Animal signals: Mind-reading and manipulation. In J. R. Krebs and N. B. Davies, Eds., Behavioural Ecology: an Evolutionary Approach. Sunderland, MA: Sinauer Associates Inc., pp. 380–402.
Marler, P. (1955). Characteristics of some animal calls. Nature 176: 6–7.
Marler, P. (1991). Differences in behavioural development in closely related species: Birdsong. In P. Bateson, Ed., The Development and Integration of Behaviour. Cambridge: Cambridge University Press, pp. 41–70.
Marler, P., S. Karakashian, and M. Gyger. (1991). Do animals have the option of withholding signals when communication is inappropriate? The audience effect. In C. Ristau, Ed., Cognitive Ethology: The Minds of Other Animals. Hillsdale, NJ: Erlbaum, pp. 135–186.
Marten, K., D. B. Quine, and P. Marler. (1977). Sound transmission and its significance for animal vocalization. 2. Tropical habitats. Behavioral Ecology and Sociobiology 2: 291–302.
Morton, E. S. (1975). Ecological sources of selection on avian sounds. American Naturalist 109: 17–34.
Moynihan, M. (1970). The control, suppression, decay, disappearance, and replacement of displays. Journal of Theoretical Biology 29: 85–112.
Nelson, D. A., and P. Marler. (1989). Categorical perception of a natural stimulus continuum: Birdsong. Science 244: 976–978.
Nottebohm, F. (1989). From bird song to neurogenesis. Scientific American 260(2): 74–79.
Ryan, M. J., and A. S. Rand. (1993). Phylogenetic patterns of behavioral mate recognition systems in the Physalaemus pustulosus species group (Anura: Leptodactylidae): The role of ancestral and derived characters and sensory exploitation. In D. Lees and D. Edwards, Eds., Evolutionary Patterns and Processes. London: Academic Press, pp. 251–267.
Smith, W. J. (1969). Messages of vertebrate communication. Science 165: 145–150.
Tinbergen, N. (1952). Derived activities: Their causation, biological significance, origin and emancipation during evolution. Quarterly Review of Biology 27: 1–32.
Wiley, R. H., and D. G. Richards. (1978). Physical constraints on acoustic communication in the atmosphere: Implications for the evolution of animal vocalizations. Behavioral Ecology and Sociobiology 3: 69–94.
Zahavi, A. (1975). Mate selection: a selection for a handicap. Journal of Theoretical Biology 53: 205–214.

Animal Navigation

Animal navigation is similar to conventional (formalized) navigation in at least three basic ways. First, it relies heavily on dead reckoning, the continual updating of position by summing successive small displacements (changes in position). In the limit (as the differences between successive positions are made arbitrarily small), this process is equivalent to obtaining the position vector by integrating the velocity vector with respect to time, which is why the process is also called path integration (see Gallistel 1990, chap.
4, for review of literature).

Second, it only occasionally takes a fix, that is, establishes position and heading (orientation) on a map using perceived distances and directions from mapped features of the terrain. While it is necessary from time to time to correct the inevitable cumulative error in dead reckoning, animals, like human navigators, often invert the process in the intervals between fixes, using their reckoned position and heading on their map to estimate their relation to surrounding features. In doing so, they appear to ignore the current testimony of their senses. Thus, when the dimensions of a familiar maze are altered, a rat moving rapidly through it collides with walls that come too soon, and turns into a wall where it expects an opening (Carr and Watson 1908). Indeed, it runs right over the pile of food that is its goal when it encounters that pile much sooner than expected (Stoltz and Lott 1964). Bats threading their way through obstacle courses bring their wings together over their head to squeeze through a gap that is no longer so narrow as to require this maneuver (Neuweiler and Möhres 1967).

And third, it places relatively minor reliance on beacon navigation, the following of sensory cues from a goal or its immediate surroundings. Animals of widely diverse species locate a goal by the goal’s position relative to the general framework provided by the mapped terrain (see Gallistel 1990, chap. 5, for review), not by the sensory characteristics of the goal or its immediate surroundings. Indeed, when the two are placed in conflict, the animal goes to the place having the correct position in the larger framework, not to the place having the correct sensory characteristics. For example, if a chimpanzee sees food hidden in one of two differently colored containers whose positions are then surreptitiously interchanged, the chimpanzee searches for the food in the container at the correct location rather than in the one with the correct color (Tinkelpaugh 1932). In human toddlers, position also takes precedence over the characteristics of the container. When a toddler is misoriented within a rectangular room, it ignores the container in which it saw a toy hidden, but which it now mistakenly takes to be in the wrong location. It looks instead in an altogether different container, which it takes to be in the correct location, even though the child demonstrably remembers the appearance of the container in which it saw the toy hidden (Hermer and Spelke 1996).

The sensory cues from the goal itself appear to control approach behavior only in the final approach to the goal, and even then, only when the goal is in approximately the correct location. The same is true for the nearby landmarks with respect to which an animal more precisely locates its goal. It uses them as aids to locating its goal only if they occupy approximately the correct place in the larger framework. Bees readily learn to search for food on one side of a landmark in one location but on the opposite side of the identical landmark placed in a different location (Collett and Kelber 1988). In effect, location confers identity, rather than vice versa. A container or landmark with the wrong properties in the right place is taken to be the correct container or landmark, while a container or landmark with the correct properties in the wrong place is taken to be a different container or different landmark.

From the above, it is clear that COGNITIVE MAPS are a critical component in animal navigation, just as conventional maps are in conventional navigation. A cognitive map is a representation of the layout of the environment. Its properties and the process by which it is constructed are questions of basic importance. These cognitive maps are known to be Euclidean, sense-preserving representations of the environment, that is, they encode the metric relations (distances and angles) and the sense relation (right versus left). This is most readily demonstrated by showing an animal the location of a hidden goal within a featureless rectangular room, then inertially disorienting the animal (by slow rotation in the dark) before allowing it to search for the goal. Animals of diverse species search principally at the correct location and its rotational equivalent, for example, in the correct corner and the corner diagonally opposite to it (Cheng 1986; Hermer and Spelke 1996; Margules and Gallistel 1988). They rarely look elsewhere, for example, in the other two corners. To distinguish between the diagonals of a rectangle, however, they must record both metric relations and sense relations on their map of the rectangular room. The short wall is on the left for one diagonal, regardless of which way one faces along that diagonal, but on the right for the other diagonal. If an animal could not distinguish walls based on their length (a metric relationship), or on which was to the left and which to the right (a sense relationship), then it could not distinguish between the diagonals of a rectangle.

Until recently, there were no suggestions about how animals might construct a Euclidean map of their environment. They only perceive small portions of it at any one time, so how can they perceive the relations between these portions? Recently, it has been suggested that the dead reckoning process is the key to the construction of the map (Gallistel 1990; Gallistel and Cramer 1996; McNaughton et al. 1996) because it specifies the Euclidean relationship between different points of view within a geocentric framework, a system of coordinates anchored to the earth, as opposed to an egocentric framework, a system of coordinates anchored to the animal’s body (egocentric position vector in figure). Rotating the egocentric position vector by the animal’s heading (its orientation in the geocentric framework) and adding the rotated vector to the geocentric position vector provided by the dead reckoning process carries the representation of the perceived portion of the environment into a common geocentric positional framework (landmark’s geocentric position vector in figure). This method of map construction, in which the animal’s dead reckoning automatically represents its position on its cognitive map, tells us much about how the animal perceives the shape of its environment.

Figure 1. Conversion of egocentric position coordinates to a common geocentric framework. The egocentric position vector of the landmark, with angle β (bearing of landmark) and magnitude d1, is first rotated so that it has angle η + β (heading plus bearing), then added to the geocentric position vector of the animal (with compass angle γa and magnitude da), producing the geocentric position vector for the landmark (with compass angle γl and magnitude dl). (Slightly modified from figure 1 in Gallistel and Cramer 1996. Used by permission of the authors and publisher.)

See also ANIMAL NAVIGATION, NEURAL NETWORKS; COGNITIVE ARTIFACTS; HUMAN NAVIGATION

—C. Randy Gallistel

References

Carr, H., and J. B. Watson. (1908). Orientation of the white rat. Journal of Comparative Neurology and Psychology 18: 27–44.
Cheng, K. (1986). A purely geometric module in the rat’s spatial representation. Cognition 23: 149–178.
Collett, T. S., and A. Kelber. (1988). The retrieval of visuo-spatial memories by honeybees. Journal of Comparative Physiology, series A 163: 145–150.
Gallistel, C. R. (1990). The Organization of Learning. Cambridge, MA: MIT Press.
Gallistel, C. R., and A. E. Cramer. (1996). Computations on metric maps in mammals: Getting oriented and choosing a multi-destination route. Journal of Experimental Biology 199: 211–217.
Hermer, L., and E. Spelke. (1996). Modularity and development: The case of spatial reorientation. Cognition 61: 195–232.
Margules, J., and C. R. Gallistel. (1988). Heading in the rat: Determination by environmental shape. Animal Learning and Behavior 16: 404–410.
McNaughton, B. L., C. A. Barnes, J. L. Gerrard, K. Gothard, M. W. Jung, J. J. Knierim, H. Kudrimoti, Y. Qin, W. E. Skaggs, M. Suster, and K. L. Weaver. (1996). Deciphering the hippocampal polyglot: The hippocampus as a path integration system. Journal of Experimental Biology 199: 173–185.
Neuweiler, G., and F. P. Möhres. (1967). Die Rolle des Ortgedächtnisses bei der Orientierung der Grossblatt-Fledermaus Megaderma lyra. Zeitschrift für vergleichende Physiologie 57: 147–171.
Stoltz, S. P., and D. F. Lott. (1964). Establishment in rats of a persistent response producing net loss of reinforcement. Journal of Comparative and Physiological Psychology 57: 147–149.
Tinkelpaugh, O. L. (1932). Multiple delayed reaction with chimpanzee and monkeys. Journal of Comparative Psychology 13: 207–243.

Animal Navigation, Neural Networks

Animals show a remarkable ability to navigate through their environment. For example, many animals must cover large regions of the local terrain in search of a goal (food, mates, etc.), and then must be able to return immediately and safely to their nesting spot. From one occasion to the next, animals seem to use varied, novel trajectories to make these searches, and may enter entirely new territory during part of the search. Nonetheless, on obtaining the goal, they are typically able to calculate a direct route to return to their home base.

For many species, this ANIMAL NAVIGATION is thought to be based on two general abilities. The first, called “dead reckoning” (or “path integration”), uses information about the animal’s own movements through space to keep track of current position and directional heading, in relation to an abstract representation of the overall environment. The second, landmark-based orientation, uses familiar environmental landmarks to establish current position, relative to familiar terrain.

Over the last few decades, some insight has been gained about how NEURAL NETWORKS in the mammalian brain might work to provide the basis for these abilities. In particular, two specialized types of cells have been observed to possess relevant spatial signals in the brains of navigating rats.

“Place cells,” originally discovered in the rat HIPPOCAMPUS (O’Keefe and Dostrovsky 1971), fire whenever the animal is in one specific part of the environment. Each place cell has its own, unique region of firing. As the animal travels through the environment, these cells seem to form a maplike representation, with each location represented by neural activity in a specific set of place cells.

Remarkably, these cells also seem to use the two general abilities mentioned above: dead reckoning and landmark-based orientation. Evidence that place cells use landmarks to establish the place-specific firing patterns came from early studies in which familiar landmarks were moved (e.g., O’Keefe and Conway 1978; Muller and Kubie 1987). For example, in one experiment (Muller and Kubie 1987), rats foraged in a large, cylindrical apparatus, equipped with a single, white cue card on its otherwise uniformly gray wall. When this single orienting landmark was moved to a different location on the wall, this caused an equal rotation of the place cell firing fields. Evidence that the cells can also use dead reckoning, however, was obtained from studies in which the landmarks were removed entirely (e.g., Muller and Kubie 1987; O’Keefe and Speakman 1987). In this case, the place cells were often able to maintain their firing patterns, so that they continued to fire in the same location, as the animal made repeated, winding trajectories through the environment. It was reasoned that this ability must be based on a dead reckoning process because in the absence of orienting landmarks, the only ongoing information about current position would have to be based on the animal’s own movements through space. Further support for the dead reckoning (path integration) process came from a study in which artificial movement-related information was given directly to the animal while it navigated (Sharp et al. 1995). Animals foraged in a cylinder with black and white stripes, of uniform width. Either activation of the vestibular system (indicating that the animal had moved through space) or rotation of the vertical stripes (as would happen due to the animal’s own movement in relation to the stripes) could sometimes “update” the place cell firing fields, so that they shifted the location of their fields, as though the animal had actually moved in the way suggested by the vestibular or optic flow input. Thus movement-related inputs directly influence the positional setting of the hippocampal place cells.

The second type of navigation-related cells, known as “head direction cells” (Taube, Muller, and Ranck 1990a), complement the place cells by signaling the animal’s current directional heading, regardless of its location. Each head direction cell fires whenever the animal is facing one particular direction (over an approximately 90 degree range). Each has its own, unique, directional preference, so that each direction the animal faces is represented by activity in a particular subset of head direction cells. These cells were initially discovered in the postsubiculum (a brain region closely related to the hippocampus), and have since been discovered in several other anatomically related brain regions.

Although it might be thought that these directional cells derive a constant, earth-based orientation from geomagnetic cues, this seems not to be the case. Rather, like place cells (and the animal’s own navigational behavior), head direction cells seem to use both landmark orientation and dead reckoning (Taube, Muller, and Ranck 1990b). For example, in a familiar environment, these cells will rotate the preferred direction when familiar landmarks are rotated, using the landmarks to get a “fix” on the animal’s current directional heading. When, however, all familiar landmarks are removed, the cells retain the animal’s previously established directional preference. And, again like place cells, head direction cells use both vestibular and optic flow information to update their locational setting (Blair and Sharp 1996).

Theoretical models have been developed to simulate the spatial firing properties of these cells (e.g., Blair 1996; McNaughton et al. 1995; Skaggs et al. 1995; Redish, Elga, and Touretzky 1997; Samsonovich and McNaughton 1997). Most of these models begin with the idea that the place and head direction cells are respectively linked together to form stable attractor networks that can stabilize into a unitary representation of any one of the possible places or directions. For example, in the head direction cell system, cells representing similar directional headings are linked together through predominantly excitatory connections, while cells representing different directional headings are linked through inhibitory connections. This reflects the basic phenomenon that, at any one time, cells within one particular portion of the directional range (e.g., 0 to 90 degrees) will be active, while all other head direction cells will be silent. Thus the stable attractor network, left on its own, will always settle into a representation of one particular direction (place). To reflect the finding that place and head direction cells can be “set” by environmental landmarks, most models equip the cells with sensory inputs that can influence which particular place or direction is represented. To reflect the finding that the navigation system can also be updated by movement-related information, most models also incorporate an additional layer of cells (in some other, as yet unidentified brain region) that combines place or head direction information, along with movement-related cues, to feed back onto the place or head direction cells, permitting them to choose a new locational or directional setting in response to movement.

While these complementary place and directional representations are thought to guide the animal’s overall navigational behavior (see O’Keefe and Nadel 1978), the mechanism is not yet clear.

See also COGNITIVE MAPS; COMPUTATION AND THE BRAIN; HUMAN NAVIGATION; SPATIAL PERCEPTION

—Patricia E. Sharp

References

Blair, H. T. (1996). A thalamocortical circuit for computing directional heading in the rat. In D. S. Touretzky, M. C. Mozer, and M. E. Hasselmo, Eds., Advances in Neural Information Processing, vol. 8, Cambridge, MA: MIT Press.
Blair, H. T., and P. E. Sharp. (1996). Visual and vestibular influences on head direction cells in the anterior thalamus of the rat. Behavioral Neuroscience 110: 1–18.
McNaughton, B. L., C. A. Barnes, J. L. Gerrard, K. Gothard, M. W. Jung, J. J. Knierim, H. Kudrimoti, Y. Quin, W. E. Skaggs, M. Suster, and K. L. Weaver. (1995). Deciphering the hippocampal polyglot: The hippocampus as a path integration system. Journal of Experimental Biology 199: 173–185.
Muller, R. U., and J. L. Kubie. (1987). The effects of changes in the environment on the spatial firing of hippocampal complex-spike cells. Journal of Neuroscience 7: 1951–1968.
O’Keefe, J., and D. H. Conway. (1978). Hippocampal place units in the freely moving rat: Why they fire where they fire. Experimental Brain Research 31: 573–590.
O’Keefe, J., and J. Dostrovsky. (1971). The hippocampus as a spatial map: Preliminary evidence from unit activity in the freely moving rat. Brain Research 34: 171–175.
O’Keefe, J., and L. Nadel. (1978). The Hippocampus as a Cognitive Map. New York: Oxford.
O’Keefe, J., and A. Speakman. (1987). Single unit activity in the rat hippocampus during a spatial task. Experimental Brain Research 68: 1–27.
Redish, A. D., A. N. Elga, and D. S. Touretzky. (1997). A coupled attractor model of the rodent head direction system. Network 7: 671–686.
Samsonovich, A., and B. L. McNaughton. (1997). Path integration and cognitive mapping in a continuous attractor neural network model. Journal of Neuroscience 17: 5900–5920.
Sharp, P. E., H. T. Blair, D. Etkin, and D. B. Tzanetos. (1995). Influences of vestibular and visual motion information on the spatial firing patterns of hippocampal place cells. Journal of Neuroscience 15: 173–189.
Skaggs, W. E., J. J. Knierim, H. S. Kudrimoti, and B. L. McNaughton. (1995). A model of the neural basis of the rat’s sense of direction. In G. Tesauro, D. S. Touretzky, and T. K. Lean, Eds., Advances in Neural Information Processing Systems, vol. 7, Cambridge, MA: MIT Press.
Taube, J. S., R. U. Muller, and J. B. Ranck, Jr. (1990a). Head direction cells recorded from the postsubiculum in freely moving rats: 1. Description and quantitative analysis. Journal of Neuroscience 10: 420–435.
Taube, J. S., R. U. Muller, and J. B. Ranck, Jr. (1990b). Head direction cells recorded from the postsubiculum in freely moving rats: 2. Effects of environmental manipulations. Journal of Neuroscience 10: 436–447.
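As a rough illustration of the dead reckoning and coordinate conversion described in the two Animal Navigation entries above, the following Python sketch sums successive displacements to track an animal's geocentric position, then rotates an egocentric landmark vector by the animal's heading and adds it to that position (the operation depicted in Figure 1 of the first entry). The function names, the use of radians, and the convention that angles are measured counterclockwise from the x-axis are assumptions made for this example; they are not taken from Gallistel (1990) or from any of the models cited.

```python
import math


def dead_reckon(start, steps):
    """Path integration: sum successive small displacements.

    start: (x, y) geocentric starting position.
    steps: iterable of (heading, distance) pairs; heading is in radians,
           counterclockwise from the x-axis (an assumed convention).
    """
    x, y = start
    for heading, distance in steps:
        x += distance * math.cos(heading)
        y += distance * math.sin(heading)
    return x, y


def landmark_geocentric(animal_pos, heading, bearing, distance):
    """Convert an egocentric landmark vector to geocentric coordinates.

    The egocentric vector (bearing, distance) is rotated by the animal's
    heading, giving the angle heading + bearing, and is then added to the
    animal's geocentric position vector.
    """
    angle = heading + bearing
    return (animal_pos[0] + distance * math.cos(angle),
            animal_pos[1] + distance * math.sin(angle))


if __name__ == "__main__":
    # Hypothetical outing: one unit along the x-axis, then one unit along the y-axis.
    position = dead_reckon((0.0, 0.0), [(0.0, 1.0), (math.pi / 2, 1.0)])
    # The animal now heads along the y-axis and perceives a landmark two
    # units away, a quarter turn clockwise from its heading.
    landmark = landmark_geocentric(position, math.pi / 2, -math.pi / 2, 2.0)
    print(position, landmark)
```

Under these assumptions the outing ends near (1, 1) and the landmark is placed near (3, 1), up to floating-point rounding. Because each step's error is added to the running total, a fuller model would also need the occasional fix from landmarks to correct accumulated drift.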
Animism

Animism means labeling inanimate objects as living, attributing characteristics of animate objects (typically humans) to inanimate objects, and making predictions or explanations about inanimate objects based on knowledge about animate objects (again usually represented by human beings). Anthropomorphism or personification means the extension of human attributes and behaviors to any nonhumans. Thus animistic reasoning can be regarded as personification of an inanimate object. In both cases, assigning mental states (desires, beliefs, and consciousness) to inanimate objects, including extraterrestrial entities (e.g., the sun) and geographical parts (e.g., a mountain), provides the most impressive example (“The sun is hot because it wants to keep people warm”).

The term animism was introduced by English anthropologists to describe mentalities of indigenous people living in small, self-sufficient communities. Although such usage was severely criticized by Lévy-Bruhl (1910), the term became popular among behavioral scientists, as PIAGET (1926) used it to characterize young children’s thinking. Piaget and his followers (e.g., Laurendeau and Pinard 1962) took animistic and personifying tendencies as signs of immaturity, as reflecting the fact that young children have not yet learned to differentiate between animate and inanimate objects or between humans and nonhumans. Chiefly because of methodological differences, a large number of studies on child animism inspired by Piaget, conducted in the 1950s and early 1960s, obtained conflicting results as to the frequency of animistic responses (Richards and Siegler 1984), but the results were discussed only within the Piagetian framework.

Since the 1980s, studies of young children’s biological understanding or naive biology have shed new light on child animism. A number of investigators have shown that even young children possess the knowledge needed to differentiate between humans, typical nonhuman animate objects, and inanimate ones (see COGNITIVE DEVELOPMENT). For example, Gelman, Spelke, and Meck (1983) found that even three-year-olds can almost always correctly attribute the presence or absence of animal properties to familiar animals and nonliving things. Simons and Keil (1995) demonstrated that young children can distinguish between natural and artificial constituent parts of their bodies even when they do not know specifics about them. Young children may even assume that each animal and plant has its underlying essential nature (Gelman, Coley, and Gottfried 1994; see also ESSENTIALISM).

Then why do young children, even when they are intellectually serious, make animistic or personifying remarks fairly often, although not so often as Piaget claimed? What functions does the mode of reasoning behind animistic or personifying errors have? Both Carey (1985) and Inagaki and Hatano (1987) propose that, though young children are able to classify entities into ontological categories, when they have to infer an object’s unknown attributes or reactions, the children apply their knowledge about human beings to other animate objects or even to inanimate objects. This is probably because they do not have rich categorical knowledge, and thus have to rely on ANALOGY in inferences. Because they are intimately familiar with humans, although necessarily novices in most other domains, they can most profitably use their knowledge about humans as a source analogue for making analogies.

Inagaki and Hatano (1987) propose that animistic or personifying tendencies of young children are products of their active minds and basically adaptive natures. Young children’s personification or person analogies may lead them to accurate predictions for animate objects phylogenetically similar to humans. It can also provide justification for a variety of experiences, sometimes even with phylogenetically less similar objects, such as trees or flowers. Young children may have learned these heuristic values through their prior contacts with a variety of animate objects. The analogies young children make may involve structurally inaccurate mapping (e.g., mapping the relation between humans and food to that between plants and water), and induce biased reasoning (neglect of the roles of nutrients in the soil and photosynthesis).

spread, but they are more about the metaphysical or imaginative universe than about the real world (Atran 1990). Even contemporary Japanese culture, outside the science classroom, does not consider it a silly idea that large, old inanimate entities (e.g., giant rocks, mountains) have CONSCIOUSNESS.

See also CONCEPTUAL CHANGE; CULTURAL EVOLUTION; CULTURAL SYMBOLISM; CULTURAL VARIATION; MAGIC AND SUPERSTITION; NATIVISM

—Giyoo Hatano

References

Atran, S. (1990). Cognitive Foundations of Natural History. Cambridge: Cambridge University Press.
Atran, S. (Forthcoming). Folk biology and the anthropology of science: Cognitive universals and cultural particulars. Brain and Behavioral Sciences.
Barrett, J. L., and F. C. Keil. (1996). Conceptualizing a nonnatural entity: Anthropomorphism in God concepts. Cognitive Psychology 31: 219–247.
Carey, S. (1985). Conceptual Change in Childhood. Cambridge, MA: MIT Press.
Gelman, R., E. Spelke, and E. Meck. (1983). What preschoolers know about animate and inanimate objects. In D. Rogers and J. A. Sloboda, Eds., The Acquisition of Symbolic Skills. New York: Plenum Press, pp. 297–326.
Gelman, S. A., J. Coley, and G. M. Gottfried. (1994). Essentialist beliefs in children: The acquisition of concepts and theories. In L. A. Hirschfeld and S. A. Gelman, Eds., Mapping the Mind: Domain Specificity in Cognition and Culture. Hillsdale, NJ: Erlbaum, pp. 341–365.
Holyoak, K. J., and P. Thagard. (1995). Mental Leaps. Cambridge,
Although young children may carry anal- MA: MIT Press. ogy beyond its proper limits, and produce false inferences, Inagaki, K., and G. Hatano. (1987). Young children’s spontaneous they can generate “educated guesses” by analogies, relying personification as analogy. Child Development 58: 1013–1020. on their only familiar source analogue of a person (Holyoak Laurendeau, M., and A. Pinard. (1962). Causal Thinking in the and Thagard 1995). Animistic errors and overattribution of Child: A Genetic and Experimental Approach. New York: human characteristics to nonhuman animate objects should International Universities Press. therefore be regarded as accidental by-products of this rea- Lévy-Bruhl, L. (1910). How Natives Think. Translated by Princeton: soning process. Because their personification is subject to a Princeton University Press, 1985. Originally published as Les fonctions mentales dans les sociétés inférieures. Paris: Alcan. variety of constraints, such as checking the plausibility of the Mead, M. (1932). An investigation of the thought of primitive chil- inference against what is known about the target, it does not dren with special reference to animism. Journal of the Royal produce many personifying errors, except for assigning men- Anthropological Institute 62: 173–190. tal states to nonhumans. Piaget, J. (1926). The Child’s Conception of the World. Translated How can we explain animistic thinking among indigenous by Totowa, NJ: Rowman and Allanheld, 1960. Originally pub- adults? According to Atran (forthcoming), in cultures lished as La représentation du monde chez l’enfant. Paris: throughout the world it is common to classify all entities into Presses Universitaires de France. four ontological categories (humans, nonhuman animals, Richards, D. D., and R. S. Siegler. (1984). The effects of task plants, and nonliving things, including artifacts), and to requirements on children’s life judgments. 
Child Development arrange animals and plants hierarchically and more or less 55: 1687–1696. Simons, D. J., and F. C. Keil. (1995). An abstract to concrete shift accurately because such taxonomies are products of the in the development of biological thought: The inside story. human mind’s natural classification scheme (see also FOLK Cognition 56: 129–163. BIOLOGY). Because indigenous people generally possess rich knowledge about major animals and plants in their ecological Further Readings niche, their animistic and personifying remarks cannot result from having to rely on the person analogy, except for poorly Bullock, M. (1985). Animism in childhood thinking: A new look at understood nonnatural entities like God (Barrett and Keil an old question. Developmental Psychology 21: 217–225. 1996). Such remarks seem to be products of cultural beliefs, Dennis, W. (1953). Animistic thinking among college and univer- acquired through discourse about a specific class of entities. sity students. Scientific Monthly 76: 247–249. Mead’s early observation (1932) that children in the Manus Dolgin, K. G., and D. A. Behrend. (1984). Children’s knowledge tribes were less animistic than adults lends support to this about animates and inanimates. Child Development 55: 1646– conjecture. Animistic or personifying explanations are wide- 1650. 30 Anomalous Monism (The term “anomalous monism” and the argument were Inagaki, K. (1989). Developmental shift in biological inference processes: From similarity-based to category-based attribution. introduced in Davidson 1970.) Human Development 32: 79–87. The three premises are not equally plausible. (1) is obvi- Looft, W. R., and W. H. Bartz. (1969). Animism revived. Psycho- ous. (2) has seemed true to many philosophers; HUME and logical Bulletin 71: 1–19. KANT are examples, though their reasons for holding it were Massey, C. M., and R. Gelman. (1988). Preschoolers’ ability to very different (Davidson 1995). 
It has been questioned by decide whether a photographed unfamiliar object can move others (Anscombe 1971; Cartwright 1983). A defense of (2) itself. Developmental Psychology 24: 307–317. would begin by observing that physics is defined by the aim of discovering or devising a vocabulary (which among other Anomalous Monism things determines what counts as an event) which allows the formulation of a closed system of laws. The chief argument for the nomological and definitional irreducibility of mental Anomalous monism is the thesis that mental entities concepts to physical is that mental concepts, insofar as they (objects and events) are identical with physical entities, involve the propositional attitudes, are normative, while the but under their mental descriptions mental entities are nei- concepts of a developed physics are not. This is because ther definitionally nor nomologically reducible to the propositions are logically related to one another, which vocabulary of physics. If we think of views of the relation places a normative constraint on the correct attribution of between the mental and the physical as distinguished, first, attitudes: since an attitude is in part identified by its logical by whether or not mental entities are identical with physi- relations, the pattern of attitudes in an individual must cal entities, and, second, divided by whether or not there exhibit a large degree of coherence. This does not mean that are strict psychophysical laws, we get a fourfold classifica- people may not be irrational, but the possibility of irratio- tion: (1) nomological monism, which says there are strict nality depends on a background of rationality (Davidson correlating laws, and that the correlated entities are identi- 1991). 
cal (this is often called materialism); (2) nomological (3) rules out two forms of REDUCTIONISM: reduction of dualism (interactionism, parallelism, epiphenomenalism); the mental to the physical by explicit definition of mental (3) anomalous dualism, which holds there are no laws cor- predicates in physical terms (some forms of behaviorism relating the mental and the physical, and the substances suggest such a program), and reduction by way of strict are discrete (Cartesianism); and (4) anomalous monism, bridging laws—laws that connect mental with physical which allows only one class of entities, but denies the pos- properties. (1)–(3) do, however, entail ontological reduction, sibility of definitional and nomological reduction. It is because they imply that mental entities do not add to the claimed that anomalous monism is the answer to the MIND- physical furniture of the world. The result is ontological BODY PROBLEM, and that it follows from certain premises, monism coupled with conceptual dualism. (Compare the main ones being: Spinoza’s metaphysics.) Anomalous monism is consistent 1. All mental events are causally related to physical events. with the thesis that psychological properties or predicates For example, changes in PROPOSITIONAL ATTITUDES are supervenient on physical properties or predicates, in this such as beliefs and desires cause agents to act, and sense of SUPERVENIENCE: a property M is supervenient on a actions cause changes in the physical world. Events in set of properties P if and only if M distinguishes no entities the physical world often cause us to alter our beliefs, not distinguishable by the properties in P (there are other intentions and desires. definitions of supervenience). 2. If two events are related as cause and effect, there is a A widely accepted criticism of anomalous monism is that strict law under which they may be subsumed. 
This it makes MENTAL CAUSATION irrelevant because it is the means that cause and effect have descriptions that instan- physical properties of events that do the causing (Kim tiate a strict law. A strict law is one that makes no use of 1993). The short reply is that it is events, not properties, that open-ended escape clauses such as “other things being equal.” Such laws must belong to a closed system: what- are causes and effects (Davidson 1993). If events described ever can affect the system must be included in it. in physical terms are effective, and they are identical with 3. There are no strict psychophysical laws (laws connecting those same events described in psychological terms, then or identifying mental events under their mental descrip- the latter must also be causally effective. The vocabularies tions with physical events under their physical descrip- of physics and of psychology are irreducibly different ways tions). From this premise and the fact that events of describing and explaining events, but one does not rule described in psychological terms do not belong to a out, or supercede, the other. closed system, it follows that there are no strict PSYCHO- See also AUTONOMY OF PSYCHOLOGY; ELIMINATIVE LOGICAL LAWS; psychological laws, if carefully stated, MATERIALISM; PHYSICALISM; RADICAL INTERPRETATION must always contain ceteris paribus clauses. —Donald Davidso Take an arbitrary mental event M. By (1), it is causally connected with some physical event P. By (2), there must be References a strict law connecting M and P; but by (3), that law cannot be a psychophysical law. Because only physics aims to pro- Anscombe, G. E. M. (1971). Causality and Determination. Cam- vide a closed system governed by strict laws, the law con- bridge: Cambridge University Press. necting M and P must be a physical law. But then M must Cartwright, N. (1983). How the Laws of Physics Lie. Oxford: have a physical description—it must be a physical event. Oxford University Press. 
Aphasia 31 comprehension dissociation, with impaired writing but Davidson, D. (1980). Mental events. In D. Davidson, Essays on Actions and Events. Oxford: Oxford University Press. often less severe disturbance to reading. Because Broca’s Davidson, D. (1991). Three varieties of knowledge. In A. Phillips area lies next to motor areas for muscular control of speech Griffiths, Ed., A. J. Ayer: Memorial Essays. Royal Institute of (lips, palate, vocal chords, jaw), early assumptions were that Philosophy Supplement, vol. 30. Cambridge: Cambridge Uni- Broca’s area was a center for the encoding of articulated versity Press. speech. Davidson, D. (1993). Thinking causes. In J. Heil and A. Mele, Wernicke’s aphasia, by contrast, results from damage to Eds., Mental Causation. Oxford: Oxford University Press. the posterior region of the left hemisphere, specifically in Davidson, D. (1995). Laws and cause. Dialectica 49: 263–279. the areas adjacent to the primary auditory cortex on the pos- Kim, J. (1993). Can supervenience and “non-strict laws” save terior portion of the superior left temporal gyrus. Patients anomalous monism? In J. Heil and A. Mele, Eds., Mental Cau- with Wernicke’s aphasia produce speech that is fluent, sation. Oxford: Oxford University Press. effortless, and rapid (hence the term fluent aphasia). The Anthropology content of their productions, however, is remarkably “empty” and filled with inappropriate word use (verbal paraphasias). Importantly, patients with Wernicke’s aphasia See INTRODUCTION: CULTURE, COGNITION, AND EVOLU- demonstrate a profound comprehension deficit—often even TION; COGNITIVE ANTHROPOLOGY; CULTURAL RELATIVISM; at the single word level. Both writing and (particularly) ETHNOPSYCHOLOGY READING are standardly highly impaired. 
The discovery of a link between these two distinct types Antirealism of language disruption and two distinct brain areas led to neuroanatomical-connectionist models of brain organization for language (Wernicke 1874; Lichtheim 1884), which, in See REALISM AND ANTIREALISM one form or another, have been pervasive through to the later twentieth century (e.g., GESCHWIND 1979). These mod- Aphasia els attempted to capture and predict the wide variety of lan- guage deficits that had been reported throughout the Aphasia (acquired aphasia) is a disorder of communication literature in terms of “disconnection” syndromes. Thus, for caused by brain damage. The acquired aphasias constitute a example, the early Wernicke-Lichtheim connectionist family of disruptions to comprehension and production of model easily represented the fact that damage to the arcuate language in both oral and written form. Much of the history fasciculus (which roughly connects Wernicke’s to Broca’s of aphasia has been (and continues to be) concerned with area) leads to the inability to repeat language, a syndrome attempts to characterize the natural organization of language that was termed conduction aphasia. (For a complete review as revealed by the selective manner in which language of aphasic syndromes, see Goodglass 1993.) breaks down under focal brain damage. Early versions of such models were modality-based, The history of the field has precursors in the very earliest viewing Broca’s and Wernicke’s areas as essentially motor recordings of medicine, but largely achieved modern form and sensory language areas, respectively. Broca’s area was with the work of Paul BROCA (1861) and Carl Wernicke considered primarily responsible for the encoding of articu- (1874). 
From this clinical work, two generalizations con- latory form for production (speaking), and Wernicke’s area cerning the brain-language relationship were derived that was considered primarily responsible for the organization of have become canonical in the field. First, it was documented language perception (listening/understanding). that lesions to areas in the left, but not right, cerebral hemi- However, these connectionist/associationist approaches sphere standardly result in language disruption (leading to were criticized nearly from their inception as oversimplifi- the concept of unilateral cerebral dominance for language; cations that did not capture the cognitive and conceptual e.g., Broca 1865). Second, within the left hemisphere, complexity of the behavioral disruptions found in even the lesions to different areas result in reliably different patterns “classic” (Broca’s and Wernicke’s) aphasias (e.g., Jackson of language loss (e.g., Wernicke 1874). 1878; Head 1926; Pick 1931; Goldstein 1948; Luria 1966). Thus, damage to what has become known as Broca’s Such criticisms led to changes in the postulated nature of area, in the lower portion of the left frontal lobe (more par- the “nodes” underlying anatomical-connectionist models (or ticularly, the opercular and triangular parts of the inferior to nonconnectionist characterizations entirely), with move- frontal gyrus, including the foot of the third frontal convolu- ment toward more linguistically and cognitively relevant tion, and extending into subcortical white matter), produces characterizations. clinical observations of difficulty in articulation and produc- Zurif, Caramazza, and Myerson (1972) were major modern tion of speech with relative (but not complete) sparing of proponents of this movement, with empirical demonstrations comprehension, resulting in what has come to be called of an “overarching agrammatism” underlying the deficit in Broca’s aphasia. 
Patients with damage to this area produce many instances of Broca’s aphasia. They demonstrated that little (or at least labored) speech, which is poorly articulated not only was production in these patients “agrammatic,” but and telegraphic, involving omission of so-called function or that comprehension also suffered from a disruption to the closed-class words (articles, auxiliaries, etc.). Their speech comprehension of structural relationships, particularly when relies heavily on nouns, and (to a far smaller degree) verbs. closed-class function words were critical to interpretation or Their written communication follows this same production- when disambiguating semantic information was unavailable. 32 Archaeology Similarly, a modality-overarching difficulty in semantical Broca, P. (1961). Perte de la parole. Ramollissement chronique et destruction partielle du lobe anterieur gauche du cerveau. Bul- interpretation was claimed for patients with damage to Wer- letin de la Societe d’Anthropologie 2: 235. nicke’s area. In the early versions of this “linguistic-relevance” Broca, P. (1865). Sur la faculte du langage articule. Bulletin de la approach to aphasia, the loci of damage were described in Societe d’Anthropologie 6: 337–393. terms of “loss of knowledge” (e.g., loss of syntactic rules). Geschwind, N. (1979). Specializations of the human brain. Scien- However the claim of knowledge-loss proved empirically dif- tific American September: 180–201. ficult to sustain, whereas descriptions in terms of disruptions Goodglass, H. (1993). Understanding Aphasia. San Diego: Aca- to the processing (access, integration) of linguistically relevant demic Press. representations (words, SYNTAX, SEMANTICS) was empirically Head, H. (1926). Aphasia and Kindred Disorders of Speech. New demonstrable. In support of such modality-independent York: Macmillan. descriptions of aphasia, this same distribution of deficits has Jackson, J. H. (1878). 
On affections of speech from disease of the brain. Brain 1: 304–330. been shown in languages that do not rely on the auditory/oral Lichtheim, O. (1984). On aphasia. Brain 7: 443–484. modality. Studies of SIGN LANGUAGES (a visuospatial, nonau- Luria, A. R. (1966). Higher Cortical Functions in Man. New York: ditory language) in deaf signers have demonstrated that left- Basic Books. hemisphere damage results in marked impairment to sign Swinney, D., E. B. Zurif, P. Prather, and T. Love. (1996). Neuro- language abilities, but right hemisphere damage does not logical distribution of processing resources underlying lan- (despite the fact that such damage disrupts non-language spa- guage comprehension. Journal of Cognitive Neuroscience 8(2): tial and cognitive abilities). Further, syntactic versus semantic 174–184. sign-language disruptions have been shown to pattern neu- Wernicke, C. (1874). Der aphasiche Symptomenkomplex. Breslau: roanatomically with the language problems accompanying Cohn und Weigert. Republished as: The aphasia symptom com- damage to Broca’s and Wernicke’s areas, respectively (Bel- plex: A psychological study on an anatomical basis. In Wer- nicke’s Works on Aphasia. The Hague: Mouton. lugi, Poizner, and Klima 1989). Zurif, E., A. Caramazza, and R. Myerson. (1972). Grammatical judg- In all, much work has demonstrated that characteriza- ments of agrammatic aphasics. Neuropsychologia 10: 405–417 tions of the functional commitment of brain architecture to language as revealed via the aphasias requires explicit con- Further Readings sideration of the abstract, modality-neutral functional archi- tecture (syntax, etc.) of language. Bellugi, U., and G. Hickok. (1994). Clues to the neurobiology of The use of behavioral techniques that examine language language. In Broadwell, Ed., Neuroscience, Memory and Lan- processing as it takes place in real time (online techniques; guage. Washington: Library of Congress. e.g., Swinney et al. 
1996) have recently served to further Goldstein, K. (1948). Language and Language Disturbances. New York: Grune and Stratton. detail the brain-language relationships seen in aphasia. Goodglass, H., and E. Kaplan. (1972). The Assessment of Aphasia This work has demonstrated disruptions to functional sys- and Related Disorders. Philadelphia: Lea and Febiger. tems underlying language at finely detailed levels of lin- Jackson, J. H. (1915). Reprints of some Hughlings Jacksons papers guistic processing/analysis, even providing a basis for the on affections of speech. Brain 38: 28–190. argument that some disruptions underlying “classic” syn- Jakobson, R. (1956). Two aspects of language and two types of dromes may represent, at least partially, disruptions to ele- aphasic disturbances. In R. Jakobson and M. Halle, Eds., Fun- mental processing resources that are recruited by the damentals of Language. The Hague: Mouton. language system (MEMORY, ATTENTION, access, etc.). With Marie, P. (1906). Revision de la question de l’aphasie: La troisieme the details provided by these temporally fine-grained circonvolution frontale gauche ne jose aucun role special dans examinations of aphasias and by modern brain imaging, la fonction du langage. Semaine Medicale 26: 241. Pick, A. (1931). Aphasie. In O. Bumke and O. Foerster, Eds., the apparent lack of homogeneity of the language disrup- Handbuch der normalen und pathologischen Physiologie, vol. tions found in aphasic syndromes (including the many 15, Berlin: Springer, pp. 1416–1524. putative aphasic syndromes not associated with Broca’s or Sarno, M. T., Ed. (1981). Acquired Aphasia. New York: Academic Wernicke’s areas) appears on course to being better under- Press. stood. It has led, on one hand, to increasing examination of Schwartz, M., M. Linebarger, and E. Saffran. (1985). The status of individual cases of aphasia for determination of “new” the syntactic deficit theory of agrammatism. In M. L. 
Kean, aspects of the brain-language relationship (and, to more Ed., Agrammatism. Orlando: Academic Press. cautious claims about group/syndrome patterns), and on Swinney, D., E. B. Zurif, and J. Nicol. (1989). The effects of focal the other hand, to new models of language, based increas- brain damage on sentence processing: An examination of the ingly on verifiable language behaviors as revealed by neurological organization of a mental module. Journal of Cog- nitive Neuroscience 1: 25–37. “anomalous” aphasic cases. Zurif, E. B., and D. Swinney. (1994). The neuropsychology of lan- See also GRAMMAR, NEURAL BASIS OF; HEMISPHERIC guage. In Handbook of Psycholinguistics. San Diego: Aca- SPECIALIZATION; LANGUAGE IMPAIRMENT, DEVELOPMEN- demic Press. TAL; LANGUAGE, NEURAL BASIS OF; SENTENCE PROCESSING —David A. Swinney Archaeology References SeeARTIFACTS AND CIVILIZATION; COGNITIVE ARCHAEOL- Bellugi, U., H. Poizner, and E. Klima. (1989). Language, modality and the brain. Trends in Neurosciences 12(10): 380–388. OGY; TECHNOLOGY AND HUMAN EVOLUTION Articulation 33 for the forthcoming rounded vowel. This phenomenon is Architecture called coarticulation and has been one of the most challeng- ing issues in speech production theory. Coarticulation is not See COGNITIVE ARCHITECTURE restricted to immediately adjacent sounds and may, in fact, extend over several segments and even cross syllable and Art word boundaries. The complex overlapping of articulatory movements has been the subject of considerable research, as summarized by Fowler and Saltzman (1993) and by Kent See PICTORIAL ART AND VISION and Minifie (1977). Coarticulation is an obstacle to segmen- tation, or the demarcation of speech behavior into discrete Articulation units such as phonemes and words. With the exception of SIGN LANGUAGES used by people Articulation means movement. In speech, articulation is the who are deaf, speech is the primary means of communica- process by which speech sounds are formed. 
The articulators tion in all human communities. Speech is therefore closely are the movable speech organs, including the tongue, lips, jaw, related to language (and to the auditory perception of lan- velum, and pharynx. These organs, together with related guage) and is often the only means by which a particular tissues, comprise the vocal tract, or the resonating cavities of language can be studied, because the majority of the world’s speech production that extend from the larynx (voice box) to languages do not have a written form. Speech appears to be the lips or nostrils. Human speech production is accomplished unique to humans (see ANIMAL COMMUNICATION). Because by the coordination of muscular actions in the respiratory, speech is harnessed to language, it is difficult or impossible laryngeal, and vocal tract systems. Typically, the word to gain a deep understanding of speech apart from its lin- articulation refers to the functions of the vocal tract, but guistic service. As Fujimura (1990) observed, “While speech production requires the action of all three systems. A speech signals convey information other than linguistic full account of speech production would go beyond codes, and the boundary between linguistic and extra- or articulation to include such topics as intonation and emotional paralinguistic issues may not be clearcut, there is no ques- expression. Essentially, articulation is the means by which tion that the primary goal of speech research is to under- speech is formed to express language (see LANGUAGE stand the relation of the units and organization of linguistic PRODUCTION and LANGUAGE AND COMMUNICATION). forms to the properties of speech signals uttered and per- Articulation is a suitable topic for cognitive science for ceived under varying circumstances” (p. 244). 
several reasons, but especially because it is (1) arguably the most precisely performed of human movements, (2) a serial behavior of exceptional complexity, (3) the most natural means of language expression in all communities except people with impairments of hearing, and (4) a uniquely human behavior linked to a variety of other accomplishments.

Ordinary conversational speech is produced at rates of five to ten syllables per second, or about twenty to thirty phonemes (sound units that distinguish words) per second. Individual speech sounds therefore have an average duration of approximately fifty milliseconds. This rapid rate has been emphasized in studies of speech perception because no other sound sequence can be perceived at comparable rates of presentation (Liberman et al. 1967). The rapid rate is impressive also from the perspective of production and the motor control processes it entails. Each sound must be uttered in the correct sequence, and each, in turn, requires the precise timing of the movements that distinguish it from other sounds. Although a given sound can be prototypically defined by its associated movements (e.g., closure of the lips and laryngeal vibrations for the b in boy), the actual pattern of movements varies with other sounds in the sequence to be produced (the phonetic context). Generally, articulatory movements overlap one another and can be mutually adjusted. At any one instant, the articulators may appear to be simultaneously adjusted to the requirements of two or more sounds. For example, the s sound in the word stew is typically produced with lip rounding, but the s sound in the word stay is not. The reason for this difference is that the s sound in the word stew anticipates the lip rounding required

The output of the phonological component of the grammar has often been assumed as the input to the system that regulates speech production (see PHONETICS and PHONOLOGY).

Because the speech signal is perishable, expression and perception of its serial order are essential to communication by speech. In his classic paper, LASHLEY (1951) considered speech as exemplary of the problem of serial order in human behavior. He proposed three mechanisms for the control of seriation: determining tendency (the idea to be expressed), activation of the selected units (meaning that they are primed for use but not yet serially ordered), and the schema of order (or the syntax of the act that finally yields a serial ordering of the intended utterance). Lashley’s insights illuminate some of the major cognitive dimensions of articulation, and Lashley’s ideas resonate in contemporary studies of speech. One area in particular is the study of sequencing errors (e.g., substitutions, deletions, and exchanges of segments) in both normal and pathological speech. These errors have attracted careful study because of the belief that the mistakes in the motor output of speech can reveal the underlying organization of speech behavior. Large corpora of speech errors have been collected and analyzed in attempts to discover the structures of speech organization (Fromkin 1980). But this is only part of the problem of serial order in speech. It is also necessary to understand how individual movements are coordinated to meet the needs of intelligibility while being energetically efficient (Kelso, Saltzman, and Tuller 1986; MacNeilage 1970).

A number of laboratory techniques have been developed to study speech production. The two major methodologies are physiologic and acoustic. Physiological methods are diverse because no single method is suited to study the different structures and motor systems involved in speech. Among the methods used are electromyography, aerodynamics, various kinds of movement transduction, X-ray, and photoelectrical techniques (Stone 1997). Of these, X-ray techniques have provided the most direct information, but, to avoid the hazards of X-ray exposure, investigators are using alternative methods such as the use of miniature magnetometers. Acoustic studies offer the advantages of economy, convenience, and a focus on the physical signal that mediates between speaker and listener. Acoustic methods are limited to some degree because of uncertainties in inferring articulatory actions from the acoustic patterns of speech (Fant 1970), but acoustic analysis has been a primary source of information on articulation and its relation to speech perception (Fujimura and Erickson 1997, Stevens 1997).

Among the most influential theories or models of articulation have been stage models, dynamic systems, and connectionist networks. In stage models, information is successively processed in serially or hierarchically structured components (Meyer and Gordon 1985). Dynamic systems theories seek solutions in terms of task-dependent biomechanical properties (Kelso, Saltzman, and Tuller 1986). Connectionist networks employ massively parallel architectures that are trained with various kinds of input information (Jordan 1991). Significant progress has been made in the computer simulation of articulation, beginning in the 1960s (Henke 1966) and extending to contemporary efforts that combine various knowledge structures and control strategies (Saltzman and Munhall 1989, Guenther 1995, Wilhelms-Tricarico 1996). This work is relevant both to the understanding of how humans produce speech and to the development of articulatory speech synthesizers (see SPEECH SYNTHESIS).

A major construct of recent theorizing about speech articulation is the gesture, defined as an abstract characterization of an individual movement (e.g., closure of the lips). It has been proposed that gestures for individual articulators are combined in a motor score that specifies the movements for a particular phonetic sequence. A particularly appealing property of the gesture is its potential as a construct in phonology (Browman and Goldstein 1986), speech production (Saltzman and Munhall 1989), SPEECH PERCEPTION (Fowler 1986), and speech development in children (Goodell and Studdert-Kennedy 1993).

See also PHONOLOGY, ACQUISITION OF; PHONOLOGY, NEURAL BASIS OF; SPEECH RECOGNITION IN MACHINES

—Raymond D. Kent

References

Browman, C., and L. Goldstein. (1986). Towards an articulatory phonology. In C. Ewan and J. Anderson, Eds., Phonology Yearbook 3, pp. 219–252. Cambridge: Cambridge University Press.
Fowler, C. A. (1986). An event approach to the study of speech perception from a direct-realist perspective. Journal of Phonetics 14: 3–28.
Fowler, C. A., and E. Saltzman. (1993). Coordination and coarticulation in speech production. Language and Speech 36: 171–195.
Fromkin, V. A., Ed. (1980). Errors in Linguistic Performance: Slips of the Tongue, Ear, Pen and Hand. New York: Academic Press.
Fujimura, O., and D. Erickson. (1997). Acoustic phonetics. In W. J. Hardcastle and J. Laver, Eds., Handbook of Phonetic Sciences, pp. 65–115. Cambridge, MA: Blackwell.
Goodell, E. W., and M. Studdert-Kennedy. (1993). Acoustic evidence for the development of gestural coordination in the speech of 2-year-olds: a longitudinal study. Journal of Speech and Hearing Research 36: 707–727.
Guenther, F. H. (1995). Speech sound acquisition, coarticulation and rate effects in a neural network model of speech production. Psychological Review 102: 594–621.
Henke, W. L. (1966). Dynamic Articulatory Model of Speech Production Using Computer Simulation. Ph.D. diss., MIT.
Jordan, M. I. (1991). Serial order: a parallel distributed processing approach. In J. L. Elman and D. E. Rumelhart, Eds., Advances in Connectionist Theory: Speech, pp. 214–249. Hillsdale, NJ: Erlbaum.
Kelso, J. A. S., E. L. Saltzman, and B. Tuller. (1986). The dynamical perspective on speech production: data and theory. Journal of Phonetics 14: 29–59.
Kent, R. D., and F. D. Minifie. (1977). Coarticulation in recent speech production models. Journal of Phonetics 5: 115–133.
Lashley, K. (1951). The problem of serial order in behavior. In L. A. Jeffress, Ed., Cerebral Mechanisms in Behavior, pp. 506–528. New York: Wiley.
Liberman, A. M., F. S. Cooper, D. P. Shankweiler, and M. Studdert-Kennedy. (1967). Perception of the speech code. Psychological Review 74: 431–461.
MacNeilage, P. (1970). Motor control of serial ordering of speech. Psychological Review 77: 182–196.
Meyer, D. E., and P. C. Gordon. (1985). Speech production: motor programming of phonetic features. Journal of Memory and Language 24: 3–26.
Saltzman, E. L., and K. G. Munhall. (1989). A dynamical approach to gestural patterning in speech production. Ecological Psychology 1: 333–382.
Shattuck-Hufnagel, S. (1983). Sublexical units and suprasegmental structure in speech production planning. In P. MacNeilage, Ed., The Production of Speech, pp. 109–136. New York: Springer.
Stevens, K. N. (1997). Articulatory-acoustic-auditory relationships. In W. J. Hardcastle and J. Laver, Eds., Handbook of Phonetic Sciences, pp. 462–506. Cambridge, MA: Blackwell.
Stone, M. (1997). Laboratory techniques for investigating speech articulation. In W. J. Hardcastle and J. Laver, Eds., Handbook of Phonetic Sciences, pp. 11–32. Cambridge, MA: Blackwell.
Wilhelms-Tricarico, R. (1996). A biomechanical and physiologically-based vocal tract model and its control. Journal of Phonetics 24: 23–38.

Further Readings

Fant, G. (1980). The relations between area functions and the acoustic signal. Phonetica 37: 55–86.
Fujimura, O. (1990). Articulatory perspectives of speech organization. In W. J. Hardcastle and J. Laver, Eds., Speech Production and Speech Modelling, pp. 323–342. Dordrecht: Kluwer Academic Press.
Kent, R. D., S. G. Adams, and G. S. Turner. (1996). Models of speech production. In N. J. Lass, Ed., Principles of Experimental Phonetics, pp. 3–45. St. Louis: Mosby.
Kent, R. D., B. S. Atal, and J. L. Miller, Eds. (1991). Papers in Speech Communication: Speech Production. Woodbury, NY: Acoustical Society of America.
Levelt, W. J. M. (1989). Speaking: From Intention to Articulation. Cambridge, MA: MIT Press.
Lindblom, B. E. F., and J. E. F. Sundberg. (1971). Acoustical consequences of lip, tongue, jaw and larynx movement. Journal of the Acoustical Society of America 50: 1166–1179.
Lofqvist, A. (1997). Theories and models of speech production. In W. J. Hardcastle and J. Laver, Eds., Handbook of Phonetic Sciences, pp. 405–426. Cambridge, MA: Blackwell.
Mermelstein, P. (1973). Articulatory model of speech production. Journal of the Acoustical Society of America 53: 1070–1082.
Ohman, S. E. G. (1967). Numerical model of coarticulation. Journal of the Acoustical Society of America 41: 310–320.
Perkell, J. S. (1997). Articulatory processes. In W. J. Hardcastle and J. Laver, Eds., Handbook of Phonetic Sciences, pp. 333–370. Cambridge, MA: Blackwell.
Smith, A. (1992). The control of orofacial movements in speech. Critical Reviews in Oral Biology and Medicine 3: 233–267.
Stevens, K. N. (1989). On the quantal nature of speech. Journal of Phonetics 17: 3–46.

Artifacts

See ARTIFACTS AND CIVILIZATION; COGNITIVE ARTIFACTS; HUMAN NAVIGATION; TECHNOLOGY AND HUMAN EVOLUTION

Artifacts and Civilization

The development of civilization meant more than social and technological innovations. It required the acquisition of complex cognitive processes. In particular, the ability to manipulate data abstractly was key to the development of urban society. This is illustrated by the evolution of artifacts for counting and accounting associated with the rise of the very first civilization in Sumer, Mesopotamia, about 3300–3100 B.C. Here, the development of a system of clay tokens and its final transmutation into writing on clay tablets document the importance for an administration to process large amounts of information in ever greater abstraction (Schmandt-Besserat 1992). Tokens and economic clay tablets made it possible to levy taxes and impose forced labor; in other words, they gave the temple institutional control over manpower and the production of real goods. The accumulation of wealth in the hands of a ruling priesthood bolstered the development of monumental architecture and the arts. More importantly, the temple economy fostered long distance trade and critical industries such as metallurgy (Moorey 1985). In turn, metal tools revolutionized crafts. For example, metal saws could cut wooden planks into circular shapes, allowing such inventions as the wheel (Littauer and Crauwel 1979). Metal weapons transformed warfare, leading to conquests far and near that could be administered with tokens and economic tablets (Algaze 1993). Finally, tokens and writing fostered new cognitive skills and thereby transformed the way people thought (Schmandt-Besserat 1996).

Starting with the beginning of agriculture ca. 8000 B.C., clay tokens of multiple shapes were used for counting and accounting goods. They were the first code or system for storing/communicating information: each token shape represented one unit of merchandise, for example, a cone and a sphere stood, respectively, for a small and a large measure of grain, and an ovoid for a jar of oil (figure 1). The tokens represent a first stage of abstraction. They translated daily-life commodities into miniature, mostly geometric counters, removing the data from its context and the knowledge from the knower. However, the clay counters represented plurality concretely, in one-to-one correspondence. Three jars of oil were shown by three ovoids, as in reality.

About 3500–3300 B.C., envelopes in the form of hollow clay balls were invented to keep in temple archives the tokens representing unfinished transactions (tax debts?). For convenience, the accountants indicated the tokens hidden inside the envelopes by impressing them on the outside. The two-dimensional markings standing for three-dimensional tokens represented a second step of abstraction. A third level of abstraction was reached ca. 3200–3100 B.C., when solid clay balls (tablets) did away with the actual tokens, only displaying the impressions. The impressed markings still represented numbers of units of goods concretely, in one-to-one correspondence. Three jars of oil were shown by three impressions of the ovoid token. Each impressed marking therefore continued to fuse together the concept of the item counted (jar of oil) with that of number (one), without the possibility of dissociating them.

Figure 1. Tokens held in an envelope from Susa, present-day Iran, ca. 3300 B.C. The large and small cones stood for large and small measures of grain and each of the lenticular disks represented a flock of animals (10?). The markings impressed on the outside of the envelope correspond to the tokens inside. Both tokens and impressed markings represented units of goods concretely, in one-to-one correspondence. Published in Schmandt-Besserat, D. (1992), Before Writing, vol. 1. Austin: University of Texas Press, p. 126, fig. 73. Courtesy Musée du Louvre, Département des Antiquités Orientales.

The fourth step, the abstraction of numbers, ca. 3100 B.C., coincided with pictography: signs in the shape of tokens but traced with a stylus, rather than impressed. These incised signs were never repeated in one-to-one correspondence. Numbers of jars of oil were shown by the sign for jar of oil preceded by numerals (figure 2). The symbols to express abstract numbers were not new.
They were the former units of grain: the impression of a cone token that formerly was a small measure of grain stood for 1 and that of a sphere representing a large measure of grain was 10.

Figure 2. Economic clay tablet showing an account of thirty-three jars of oil, from Godin Tepe, present-day Iran, ca. 3100 B.C. The units of oil are no longer repeated in one-to-one correspondence, but the sign for a measure of oil is preceded by numerals. The circular sign stood for 10 and the wedge for 1. Published in Schmandt-Besserat, D. (1992), Before Writing, vol. 1. Austin: University of Texas Press, p. 192, fig. 115. Courtesy T. Cuyler Young, Jr.

When, finally, the concept of number was abstracted from that of the item counted, numerals and writing could evolve in separate ways. Abstract numerals grew to unprecedentedly large numbers, paving the way for mathematics and thereby providing a new grasp on reality (Justus 1996). Pictographs assumed a phonetic value in order to satisfy new administrative demands, namely, the personal names of recipients or donors of the stipulated goods. The syllables or words composing an individual’s name were rendered in a rebus fashion. That is to say, a new type of pictograph no longer stood for the objects it pictured, but rather for the sound of the word it evoked. This fifth step in abstraction marked the final departure from the previous token system. The phonetic signs featured any possible items, such as the head of a man standing for the sound “lu” or that of a man’s mouth that was read “ka.” This further abstraction also marked the true takeoff of writing. The resulting syllabary was no longer restricted to economic record keeping but opened ca. 2900 B.C. to other fields of human endeavor.

In sum, in Mesopotamia, the earliest civilization corresponded with the transmutation of an archaic token system of accounting into a script written on clay tablets. The metamorphosis meant far more than the reduction from a three- to a two-dimensional recording device. It signified step-by-step acquisition of new cognitive skills for processing data in greater abstraction.

Artifacts for counting and writing were also part and parcel of the rise of all subsequent Near Eastern civilizations. However, the cultures following in the wake of Sumer were spared some of the hurdles to abstract data manipulation. Elam, in present-day western Iran, the nearest neighbor to Mesopotamia, is the only exception where the stages from tokens to impressed markings on envelopes and tablets took place synchronically with Mesopotamia, no doubt because of the Sumerian domination of Elam ca. 3300–3200 B.C. But, when the Proto-Elamites created their own script ca. 3000 B.C., they borrowed simultaneously abstract numerals and phonetic signs (Hoyrup 1994). About 2500 B.C., the Indus Valley civilizations emulated their Sumerian trade partners by devising a script that had no links with the Mesopotamian-like tokens recovered in pre-Harappan sites (Possehl 1996). Crete probably adopted first the idea of tokens and then that of writing. This is suggested by the fact that Minoan clay counters in the shape of miniature vessels seem unrelated to the following hieroglyphic, Linear A or B scripts used in the Aegean between 2200–1300 B.C. (Poursat 1994). Farther afield, Egypt, where the use of tokens is not clearly attested, produced ca. 3000 B.C. a full-blown system of writing based on the rebus principle, visibly imitating Sumer (Ray 1986). These examples imply that the multiple cognitive steps from concrete to abstract data manipulation occurred only once in the Near East. The Mesopotamian concrete tokens and abstract tablets loomed large over the process of civilization in the Old World. In fact, abstract accounting is probably a universal prerequisite for civilization. But this has to remain a hypothesis as long as the precursors of writing in China and the New World remain elusive.

See also COGNITIVE ARCHAEOLOGY; COGNITIVE ARTIFACTS; TECHNOLOGY AND HUMAN EVOLUTION

—Denise Schmandt-Besserat

References

Algaze, G. (1993). The Uruk World System. Chicago: University of Chicago Press.
Hoyrup, J. (1994). In Measure, Number, and Weight. Albany: State University of New York Press.
Justus, C. (1996). Numeracy and the Germanic upper decades. Journal of Indo-European Studies 23: 45–80.
Littauer, M. A., and J. H. Crauwel. (1979). Wheeled vehicles and ridden animals in the ancient Near East. In Handbuch der Orientalistik, vol. 7. Leiden: E. J. Brill.
Moorey, P. R. S. (1985). Materials and Manufacture in Ancient Mesopotamia. Oxford: BAR International Series.
Possehl, G. L. (1996). Indus Age: The Writing System. Philadelphia: University of Pennsylvania Press.
Poursat, J.-C. (1994). Les systèmes primitifs de comptabilité en Crète minoenne. In P. Ferioli, E. Fiandra, G. G. Fissore, and M. Frangipane, Eds., Archives Before Writing, pp. 247–252. Rome: Ministero per I beni Culturali e Ambientali Ufficio Centrale per I beni Archivisti.
Ray, J. D. (1986). The emergence of writing in Egypt. World Archaeology 17(3): 307–315.
Schmandt-Besserat, D. (1992). Before Writing. 2 vols. Austin: University of Texas Press.
Schmandt-Besserat, D. (1996). How Writing Came About. Austin: University of Texas Press.

Further Readings

Avrin, L. (1991). Scribes, Script and Books. Chicago: American Library Association.
Bowman, A. K., and G. Woolf. (1994). Literacy and Power in the Ancient World. Cambridge: Cambridge University Press.
Englund, R. K. (1994). Archaic administrative texts from Uruk. Archaische Texte aus Uruk 5. Berlin: Gebr. Mann Verlag.
Ferioli, P., E. Fiandra, and G. G. Fissore, Eds. (1996). Administration in Ancient Societies. Rome: Ministero per I beni Culturali e Ambientali Ufficio Centrale per I beni Archivisti.
Ferioli, P., E. Fiandra, G. G. Fissore, and M. Frangipane, Eds. (1994). Archives Before Writing. Rome: Ministero per I beni Culturali e Ambientali Ufficio Centrale per I beni Archivisti.
Goody, J. (1977). The Domestication of the Savage Mind. Cambridge: Cambridge University Press.
Goody, J. (1986). The Logic of Writing and the Organization of Society. Cambridge: Cambridge University Press.
Goody, J. (1987). The Interface Between the Written and the Oral. Cambridge: Cambridge University Press.
Günther, H., Ed. (1994). Schrift und Schriftlichkeit. Berlin: Walter de Gruyter.
Heyer, P. (1988). Communications and History, Theories of Media, Knowledge and Civilization. New York: Greenwood Press.
Nissen, H. J., P. Damerow, and R. K. Englund. (1993). Archaic Bookkeeping. Chicago: University of Chicago Press.
Olson, D. R. (1994). The World on Paper. Cambridge: Cambridge University Press.
Ong, W. J. (1982). Orality and Literacy. New York: Methuen.
Rafoth, B. A., and D. L. Robin, Eds. (1988). The Social Construction of Written Communication. Norwood: Ablex.
Schousboe, K., and M. T. Larsen, Eds. (1989). Literacy and Society. Copenhagen: Akademisk Forlag.
Street, B. V. (1993). Cross-Cultural Approaches to Literacy. Cambridge: Cambridge University Press.
Wagner, D. A. (1994). Literacy, Culture and Development. Cambridge: Cambridge University Press.
Watt, W. C., Ed. (1994). Writing Systems and Cognition. Dordrecht: Kluwer Academic Publishers.

Artificial Intelligence

See INTRODUCTION: AI AND EDUCATION; COGNITIVE MODELING, SYMBOLIC; COMPUTATIONAL INTELLIGENCE

Artificial Life

Artificial life (A-Life) uses informational concepts and computer modeling to study life in general, and terrestrial life in particular. It aims to explain particular vital phenomena, ranging from the origin of biochemical metabolisms to the coevolution of behavioral strategies, and also the abstract properties of life as such (“life as it could be”). It is thus a form of mathematical biology, albeit of a highly interdisciplinary type. Besides their presence in biology, especially ETHOLOGY and evolutionary theory, A-Life’s research topics are studied also (for instance) in artificial intelligence, computational psychology, mathematics, physics, biochemistry, immunology, economics, philosophy, and anthropology.

A-Life was named by Christopher Langton in 1986 (Langton 1986 and 1989). Langton’s term suggests (deliberately) that the aim of A-Life is to build new living things. However, not all A-Life scientists share this goal. Even fewer believe this could be done without providing some physical body and metabolism. Accordingly, some A-Life workers favor less philosophically provocative terms, such as “adaptive systems” or “animats” (real or simulated robots based on animals) (Meyer and Wilson 1991).

The claim that even virtual creatures in cyberspace could be genuinely alive is called strong A-Life, in analogy to strong AI. Most A-Lifers reject it (but see Langton 1989 and Ray 1994). Or rather, most reject the view that such creatures can be alive in just the same sense that biological organisms are, but allow that they are, or could be, alive to a lesser degree. Whether life does require material embodiment, and whether it is a matter of degree, are philosophically controversial questions. Proponents of autopoiesis (the continual self-production of an autonomous entity), for example, answer “Yes” to the first and “No” to the second (Maturana and Varela 1980). Others also answer the first question with a “Yes,” but for different reasons (Harnad 1994). However, these philosophical questions do not need to be definitively answered for A-Life to progress, or be scientifically illuminating. Using artifacts to study life, even “life as it could be,” is not the same as aiming to instantiate life artificially.

The theoretical focus of A-Life is the central feature of living things: self-organization. This involves the spontaneous EMERGENCE, and maintenance, of order out of an origin that is ordered to a lesser degree. (The lower level may, though need not, include random “noise.”) Self-organization is not mere superficial change, but fundamental structural development. This development is spontaneous, or autonomous. That is, it results from the intrinsic character of the system (often in interaction with the environment), rather than being imposed on it by some external force or designer.

In SELF-ORGANIZING SYSTEMS, higher-level properties result from interactions between simpler ones. In living organisms, the relevant interactions include chemical diffusion, perception and communication, and processes of variation and natural selection. One core problem is the way in which self-organization and natural selection interact to produce biological order over time. Some work in A-Life suggests that whereas self-organization generates the fundamental order, natural selection (following on variation) weeds out the forms that are least well adapted to (least fit for) the environment in question (Kauffman 1993).

The higher-level properties in living organisms are very varied. They include universal characteristics of life (e.g., autonomy and evolution); distinct lifestyles (e.g., parasitism and symbiosis); particular behaviors (e.g., flocking, hunting, or evasion); widespread developmental processes (e.g., cell differentiation); and bodily morphology (e.g., branching patterns in plants, and the anatomy of sense organs or control mechanisms in animals).
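
As an illustrative sketch (not drawn from the entry's sources), the way higher-level order can result from interactions between simpler elements may be seen in a small cellular automaton. The Python code below assumes the rules of Conway's game “Life”: a cell is born with exactly three live neighbors and survives with two or three. The names `step`, `run`, and `glider` are hypothetical choices made for this example. Each cell consults only its eight immediate neighbors, yet the five-cell “glider” pattern reappears one cell further along the diagonal every four generations, a regularity that the local rule itself never mentions.

```python
from collections import Counter


def step(live):
    """Advance one generation; `live` is a set of (row, col) live cells."""
    # Count, for every cell next to at least one live cell,
    # how many of its eight neighbours are alive.
    counts = Counter(
        (r + dr, c + dc)
        for (r, c) in live
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    # Purely local rule: birth on exactly 3 neighbours,
    # survival on 2 or 3, death otherwise.
    return {
        cell
        for cell, n in counts.items()
        if n == 3 or (n == 2 and cell in live)
    }


def run(live, generations):
    """Apply `step` repeatedly and return the final set of live cells."""
    for _ in range(generations):
        live = step(live)
    return live


# Assumed starting pattern (the "glider"):
#   . O .
#   . . O
#   O O O
glider = {(0, 1), (1, 2), (2, 0), (2, 1), (2, 2)}

after_four = run(glider, 4)
moved = {(r + 1, c + 1) for (r, c) in glider}
print(after_four == moved)  # the same shape, shifted one cell diagonally
```

Storing only the live cells as a set keeps the grid unbounded; a fixed array with wrap-around edges would be an equally reasonable design choice.
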

A-Life studies all these biological phenomena on all these levels. A-Life simulations vary in their degree of abstractness or idealization. Some model specific behaviors or morphologies of particular living things, whereas others study very general questions, such as how different rates of mutation affect coevolution (Ray 1992). They vary also in their mode of modeling: some A-Life work concentrates on programs, displaying its creatures (if any) only as images on the VDU, while some builds (and/or evolves) physical robots. The wide range of A-Life research is exemplified in the journals Artificial Life and Adaptive Behavior, and in international (including European) conference proceedings of the same names. Brief overviews include Langton (1989) and Boden (1996, intro.). For popular introductions, see Emmeche (1994) and Levy (1992).

A-Life is closely related to (indeed, it forms part of) cognitive science in respect of its history, its methodology, and its philosophy.

Historically, it was pioneered (around the mid-twentieth century) by the founders of AI: Alan TURING and John VON NEUMANN. They both developed theoretical accounts of self-organization, showing how simple underlying processes could generate complex systems involving emergent order. Turing (1952) showed that interacting chemical diffusion gradients could produce higher-level (including periodic) structures from initially homogeneous tissue. Von Neumann, before the discovery of DNA or the genetic code, identified the abstract requirements for self-replication (Burks 1966). He even defined a universal replicator: a cellular automaton (CA) capable of copying any system, including itself. A CA is a computational “space” made up of many discrete cells; each cell can be in one of several states, and changes (or retains) its state according to specific (typically localistic) rules. Von Neumann also pointed out that copy errors could enable evolution, an idea that later led to the development of EVOLUTIONARY COMPUTATION (evolutionary programming, evolution strategies, genetic algorithms, etc.).

Even in relatively simple CAs, (some) high-level order may emerge only after many iterations of the relevant lower-level rules. Such cases require high-performance computing. Consequently, Turing’s and von Neumann’s A-Life ideas could be explored in depth only long after their deaths. Admittedly, CAs were studied by von Neumann’s colleague Arthur Burks (1970) and his student John Holland, who pioneered genetic algorithms soon after CAs were defined (Holland 1975); and more people, among them John Conway (Gardner 1970), Steve Wolfram (1983 and 1986), Stuart Kauffman (1969 and 1971), and Langton (1984), became interested in them soon afterward. But these early studies focused on theory rather than implementation. Moreover, they were unknown to most researchers in cognitive science. The field of A-Life achieved visibility in the early 1990s, largely thanks to Langton’s initiative in organizing the first workshop on A-Life (in Los Alamos) in 1987.

Methodologically, A-Life shares its reliance on computer modeling with computational psychology and AI, especially connectionism, situated robotics, and genetic algorithms (evolutionary programming). These three AI approaches may be integrated in virtual or physical systems. For instance, some A-Life robots are controlled by evolved NEURAL NETWORKS, whose (initially random) connections specify “reflex” responses to specific environmental cues (e.g., Cliff, Harvey, and Husbands 1993).

A-Life’s methodology differs from classical (symbolic) AI in many ways. It relies on bottom-up (not top-down) processing, local (not global) control, simple (not complex) rules, and emergent (not preprogrammed) behavior. Often, it models evolving or coevolving populations involving many thousands of individuals. It commonly attempts to model an entire creature, rather than some isolated module such as vision or problem-solving (e.g., Beer 1990). And it claims to avoid methods involving KNOWLEDGE REPRESENTATION and PLANNING, which play a crucial role in classical AI (Brooks 1991). The behavior of A-Life robots is the result of automatic responses to the contingencies of the environment, not preprogrammed sequences or internal plans. Each response typically involves only one body part (e.g., the third leg on the right), but their interaction generates “wholistic” behavior: the robot climbs the step, or follows the wall.

Philosophically, A-Life and AI are closely related. Indeed, if intelligence can emerge only in living things, then AI is in principle a subarea of A-Life. Nevertheless, some philosophical assumptions typical of classical AI are queried, even rejected, by most workers in A-Life. All the philosophical issues listed below are discussed in Boden 1996, especially the chapters by Bedau, Boden, Clark, Godfrey-Smith, Hendriks-Jansen, Langton, Pattee, Sober, and Wheeler; see also Clark (1997).

Much as AI highlights the problematic concept of intelligence, A-Life highlights the concept of life, for which no universally agreed definition exists. It also raises questions of “simulation versus realization” similar to those concerning strong AI. Problems in A-Life that are relevant also to the adequacy of FUNCTIONALISM as a philosophy for AI and cognitive science include the role of embodiment and/or environmental embeddedness in grounding cognition and INTENTIONALITY.

A-Life in general favors explanations in terms of emergence, whereas AI tends to favor explanation by functional decomposition. Moreover, many A-Life researchers seek explanations in terms of closely coupled dynamical systems, described by phase-space trajectories and differential equations rather than computation over representations.

Although A-Life does avoid the detailed, “objective,” world-modeling typical of classical AI, whether it manages to avoid internal representations entirely is disputed. Also in dispute is whether the “autonomy” of environmentally embedded A-Life systems can capture the hierarchical order and self-reflexiveness found in some human action (and partly modeled by classical AI). Many philosophers of A-Life justify their rejection of representations by criticizing the broadly Cartesian assumptions typical of classical, and most connectionist, AI. They draw instead on philosophical insights from Continental philosophy, or phenomenology, sometimes using the concept of autopoiesis.

Besides its theoretical interest, A-Life has many technological applications. These include evolutionary computation for commercial problem solving, environmentally embedded robots for practical use, and computer animation for movies and computer games. The “Creatures” computer environment, for example, employs A-Life techniques to evolve individual creatures capable of interacting, and of learning from their “world” and the human user’s “teaching.”

See also ADAPTATION AND ADAPTATIONISM; EVOLUTION; DYNAMIC APPROACHES TO COGNITION; SITUATED COGNITION AND LEARNING; SITUATEDNESS/EMBEDDEDNESS

—Margaret A. Boden

References

Beer, R. D. (1990). Intelligence as Adaptive Behavior: An Experiment in Computational Neuroethology. New York: Academic Press.
Boden, M. A., Ed. (1996). The Philosophy of Artificial Life. Oxford: Oxford University Press.
Brooks, R. A. (1991). Intelligence without representation. Artificial Intelligence 47: 139–159.
Burks, A. W. (1966). Theory of Self-Reproducing Automata. Urbana: University of Illinois Press.
Burks, A. W. (1970). Essays on Cellular Automata. Urbana: University of Illinois Press.
Clark, A. J. (1997). Being There: Putting Brain, Body, and World Together Again. Cambridge, MA: MIT Press.
Cliff, D., I. Harvey, and P. Husbands. (1993). Explorations in evolutionary robotics. Adaptive Behavior 2: 71–108.
Emmeche, C. (1994). The Garden in the Machine: The Emerging Science of Artificial Life. Princeton: Princeton University Press.
Gardner, M. (1970). The fantastic combinations of John Conway’s new solitaire game “Life.” Scientific American 223(4): 120–123.
Harnad, S. (1994). Levels of functional equivalence in reverse bioengineering. Artificial Life 1: 293–301.
Holland, J. H. (1975). Adaptation in Natural and Artificial Systems. Ann Arbor: University of Michigan Press.
Kauffman, S. A. (1969). Metabolic stability and epigenesis in randomly connected nets. Journal of Theoretical Biology 22: 437–467.
Kauffman, S. A. (1971). Cellular homeostasis, epigenesis, and replication in randomly aggregated macro-molecular systems. Journal of Cybernetics 1: 71–96.
Kauffman, S. A. (1993). The Origins of Order: Self-Organization and Selection in Evolution. Oxford: Oxford University Press.
Langton, C. G. (1984). Self-reproduction in cellular automata. Physica D 10: 135–144.
Langton, C. G. (1986). Studying artificial life with cellular automata. Physica D 22: 120–149.
Langton, C. G. (1989). Artificial life. In C. G. Langton, Ed., Artificial Life: The Proceedings of an Interdisciplinary Workshop on the Synthesis and Simulation of Living Systems (held September 1987). Redwood City, CA: Addison-Wesley, pp. 1–47. (Reprinted, with revisions, in M. A. Boden, Ed., The Philosophy of Artificial Life. Oxford: Oxford University Press, pp. 39–94.)
Levy, S. (1992). Artificial Life: The Quest for a New Creation. New York: Pantheon.
Maturana, H. R., and F. J. Varela. (1980). Autopoiesis and Cognition: The Realization of the Living. London: Reidel.
Meyer, J.-A., and S. W. Wilson, Eds. (1991). From Animals to Animats: Proceedings of the First International Conference on Simulation of Adaptive Behavior. Cambridge, MA: MIT Press.
Ray, T. S. (1992). An approach to the synthesis of life. In C. G. Langton, C. Taylor, J. D. Farmer, and S. Rasmussen, Eds., Artificial Life II. Redwood City, CA: Addison-Wesley, pp. 371–408. (Reprinted in M. A. Boden, Ed., The Philosophy of Artificial Life. Oxford: Oxford University Press, pp. 111–145.)
Ray, T. S. (1994). An evolutionary approach to synthetic biology: Zen and the art of creating life. Artificial Life 1: 179–210.
Turing, A. M. (1952). The chemical basis of morphogenesis. Philosophical Transactions of the Royal Society: B 237: 37–72.
Wolfram, S. (1983). Statistical mechanics of cellular automata. Reviews of Modern Physics 55: 601–644.
Wolfram, S. (1986). Theory and Applications of Cellular Automata. Singapore: World Scientific.

Aspect

See TENSE AND ASPECT

Attention

William JAMES once wrote, “Every one knows what attention is. It is the taking possession by the mind, in clear and vivid form, of one out of what seem several simultaneously possible objects or trains of thought. Focalization, concentration, of consciousness are of its essence. It implies withdrawal from some things in order to deal effectively with others” (James 1890: 403–404). The study of selectivity in information processing operations, of “withdrawal from some things in order to deal effectively with others,” has its modern origins in the 1950s with BROADBENT’s Perception and Communication (1958). Loosely speaking, much of this work may be taken as research into “attention,” though problems of selective information processing of course need not be identical to James’s first-person view of selective consciousness or awareness.

In fact many aspects of selectivity must contribute to the organization of any focused line of activity. At any given time, the person is actively pursuing some goals rather than others. Some actions rather than others bring those goals closer. Some parts rather than others of the sensory input are relevant and must be examined or monitored. The general state of alertness or drowsiness is also often considered an aspect of “attention.” In Perception and Communication, a single selective device was used to handle many different phenomena: selective listening, loss of performance with long work periods, impairments by loud noise, and so forth. When the theory was updated in Decision and Stress (1971), data from many experimental tasks already required a substantially more complex approach, with a variety of distinct selective mechanisms. Modern work seeks to understand both separate aspects of processing selectivity and their relations.

Selective perception in a variety of modalities has been particularly well investigated. Experiments on selective listening in the 1950s dealt with people listening to two simultaneous speech messages. First, these experiments showed limited capacity: People were often unable to identify both messages at once. Second, they showed conditions for effective selectivity: People could identify one message and ignore the other provided the messages differed in simple physical characteristics such as location, loudness, or voice, but not when they differed only in content. Third, these experiments showed the striking consequences of efficient selection: A person listening to one message and ignoring another would subsequently be able to report only the crudest characteristics of the ignored message, for example, that it had changed from speech to a tone, but not whether it had been in a familiar or an unfamiliar language.

All three points have received much subsequent study.

together, these “visual areas” cover roughly the posterior third of the cerebral hemispheres. Recordings from single cells in several visual areas of the monkey show weak or suppressed responses to stimuli that the animal is set to ignore (Moran and Desimone 1985). Measurements of gross electrical activity in the human brain, and associated changes in local cerebral bloodflow, similarly suggest greater responses to attended than to unattended stimuli (Heinze et al. 1994). Damage to one side of the brain weak-
As ens the representation of stimuli on the opposite side of an example of the many things learned regarding limited visual space. Such stimuli may be seen when they are pre- capacity, experiments with mixed visual and auditory stimuli sented alone, but pass undetected when there is concurrent show that the major limit on simultaneous perception is input on the unimpaired side (Bender 1952). All these modality-specific: One visual and one auditory stimulus can results suggest that concurrent visual inputs compete for be identified together much better than two stimuli in the representation in the network of visual areas (Desimone and same modality (Treisman and Davies 1973). Regarding the Duncan 1995). Attended stimuli are strongly represented, control of stimulus selection, experiments show the joint while responses to unwanted stimuli are suppressed. influence of top-down (task-driven) and bottom-up (stimu- Complementary to selective perception is the selective lus-driven) considerations. Top-down influences are impor- activation of goals or components of an action plan. Here, tant when a person is specifically instructed to pay attention too, errors reflect limited capacity, or difficulty organizing just to objects in a certain region of a visual display (selec- two lines of thought or action simultaneously. Everyday slips tion by location), objects having a certain color or other of action, such as driving to work instead of the store, or stir- property (selection by object feature), or objects of a certain ring coffee into the teapot, are especially likely when a per- category (e.g., letters rather than digits) (von Wright 1968). son is preoccupied with other thoughts (Reason and Selection by location (spatial attention) has been particularly Mycielska 1982). Practice is again a key consideration. well studied (Posner 1978). 
Irrespective of the task or Although it may be impossible to organize two unfamiliar instruction, however, stimulus factors such as intensity activities at once, familiar behavior seems to occur automati- (Broadbent 1958) or sudden onset (Jonides and Yantis 1988) cally, leaving attention (in this sense) free for other concerns. also contribute to the choice of which stimulus is processed. Indeed, familiar actions may tend to occur “involuntarily,” or Long practice in considering a certain stimulus relevant or when they are currently inappropriate. Again everyday important will also favor its selection, as when one’s own action slips provide clear examples: taking a familiar route name attracts attention in a crowded room (Moray 1959). when intending to drive elsewhere, or taking out one’s key Regarding the results of efficient selection, finally, experi- on arrival at a friend’s door (James 1890). A laboratory ver- ments have detailed what differs in the processing of sion is the Stroop effect: Naming the color of a written word attended and ignored stimuli. Often, very little can be explic- suffers substantial interference from a tendency instead to itly remembered of stimuli a person was asked to ignore, read the word itself (Stroop 1935). Such results suggest a even though those stimuli were perfectly audible or visible model in which conflicting action tendencies compete for (Wolford and Morrison 1980). In contrast, indirect measures activation. Practice increases an action’s competitive may suggest a good deal of hidden or unconscious process- strength. ing; for example, an ignored word previously associated with Disorganized behavior and action slips occur commonly shock may produce a galvanic skin response even while sub- after damage to the frontal lobes of the brain (Luria 1966). jects fail to notice its occurrence (Corteen and Dunn 1974). 
Disorganization can take many forms: intrusive actions irrel- The nature and duration of such implicit processing of unat- evant to a current task, perseverative repetition of incorrect tended material remains a topic of active debate, for exam- behavior, choices that seem ill-judged or bizarre. A major ple, in the discussion of IMPLICIT VS. EXPLICIT MEMORY. question is how action selection develops from the joint activity of multiple frontal lobe systems. A more detailed These studies reflect general questions that may be asked treatment is given in ATTENTION AND THE HUMAN BRAIN. of any selective process. One is the question of divided atten- tion, or how much can be done at once. Another is the ques- To some extent, certainly, it is appropriate to consider tion of selective attention, or how efficiently desired stimuli different aspects of “attention” as separate. To take one con- can be processed and unwanted stimuli ignored. Experi- crete example, it has been amply documented that there are ments measuring establishment of a new selective priority many distinct forms of competition or interference between concern attention setting and switching. The complement to one line of activity and another. These include modality- switching is sustained attention, or ability to maintain one specific perceptual competition, effector-specific response fixed processing set over an extended time period. competition, and competition between similar internal rep- The neurobiology of visual attention is a particularly resentations (e.g., two spatial or two verbal representations; active topic of current research. In the primate brain, visual see Baddeley 1986); though there are also very general information is distributed to a network of specialized corti- sources of interference even between very dissimilar tasks cal areas responsible for separate visual functions and deal- (Bourke, Duncan, and Nimmo-Smith 1996). 
Each aspect of ing partially with separate visual dimensions such as shape, competition reflects a distinct way in which the nervous sys- motion and color (Desimone and Ungerleider 1989). Taken tem must select one set of mental operations over another. Attention in the Animal Brain 41 At the same time, selectivities in multiple mental McClelland, J. L., and D. E. Rumelhart. (1981). An interactive acti- vation model of context effects in letter perception: Part 1. An domains must surely be integrated to give coherent, purpo- account of basic findings. Psychological Review 88: 375–407. sive behaviour (Duncan 1996). It has often been proposed Moran, J., and R. Desimone. (1985). Selective attention gates visual that some mental “executive” takes overall responsibility for processing in the extratriate cortex. Science 229: 782–784. coordinating mental activity (e.g., Baddeley 1986); for Moray, N. (1959). Attention in dichotic listening: affective cues example, for ensuring that appropriate goals, actions, and and the influence of instructions. Quarterly Journal of Experi- perceptual inputs are all selected together. At least as attrac- mental Psychology 11: 56–60. tive, perhaps, is an approach through self-organization. By Posner, M. I. (1978). Chronometric Explorations of Mind. Hills- analogy with “relaxation” models of many mental processes dale, NJ: Erlbaum. (McClelland and Rumelhart 1981), selected material in any Reason, J., and K. Mycielska. (1982). Absent-minded? The Psy- one mental domain (e.g., active goals, perceptual inputs, chology of Mental Lapses and Everyday Errors. Englewood Cliffs, NJ: Prentice-Hall. material from memory) may support selection of related Stroop, J. R. (1935). Studies of interference in serial verbal reac- material in other domains. The description of top-down con- tions. Journal of Experimental Psychology 18: 643–662. trol given earlier, for example, implies that goals control Treisman, A. M., and A. Davies. (1973). 
Divided attention to ear perceptual selection; equally, however, active goals can and eye. In S. Kornblum, Ed., Attention and Performance IV. always be overturned by novel perceptual input, as when a London: Academic Press, pp. 101–117. telephone rings or a friend passes by in the street. Which- von Wright, J. M. (1968). Selection in visual immediate memory. ever approach is taken, a central aspect of “attention” is this Quarterly Journal of Experimental Psychology 20: 62–68. question of overall mental coordination. Wolford, G., and F. Morrison. (1980). Processing of unattended See also CONSCIOUSNESS, NEUROBIOLOGY OF; EYE visual information. Memory and Cognition 8: 521–527. MOVEMENTS AND VISUAL ATTENTION; INTROSPECTION; Further Readings MEMORY; NEURAL NETWORKS; SELF-KNOWLEDGE; TOP- DOWN PROCESSING IN VISION Allport, D. A. (1989). Visual attention. In M. I. Posner, Ed., Foun- —John Duncan dations of Cognitive Science. Cambridge, MA: MIT Press, pp. 631–682. Norman, D. A., and T. Shallice. (1986). Attention to action: willed References and automatic control of behavior. In R. J. Davidson, G. E. Baddeley, A. D. (1986). Working Memory. Oxford: Oxford Univer- Schwartz, and D. Shapiro, Eds., Consciousness and self-regula- sity Press. tion. Advances in research and theory, vol. 4. New York: Ple- Bender, M. B. (1952). Disorders in Perception. Springfield, IL: num, pp. 1–18. Charles C. Thomas. Pashler, H. (1997). The Psychology of Attention. Cambridge, MA: Bourke, P. A., J. Duncan, and I. Nimmo-Smith. (1996). A general MIT press. factor involved in dual task performance decrement. Quarterly Posner, M. I., and S. E. Petersen. (1990). The attention system of Journal of Experimental Psychology 49A: 525–545. the human brain. Annual Review of Neuroscience 13: 25–42. Broadbent, D. E. (1958). Perception and Communication. London: Pergamon. Attention in the Animal Brain Broadbent, D. E. (1971). Decision and Stress. London: Academic Press. Corteen, R. S., and D. Dunn. (1974). 
Shock-associated words in a In most contexts ATTENTION refers to our ability to concen- nonattended message: a test for momentary awareness. Journal trate our perceptual experience on a selected portion of the of Experimental Psychology 102: 1143–1144. Desimone, R., and L. G. Ungerleider. (1989). Neural mechanisms available sensory information, and, in doing so, to achieve a of visual processing in monkeys. In F. Boller and J. Grafman, clear and vivid impression of the environment. To evaluate Eds., Handbook of Neuropsychology, vol. 2. Amsterdam: something that seems as fundamentally introspective as Elsevier, pp. 267–299. attention, cognitive science research usually uses a measure Desimone, R., and J. Duncan. (1995). Neural mechanisms of selec- of behavioral performance that is correlated with attention. tive visual attention. Annual Review of Neuroscience 18: 193– To examine brain mechanisms of attention, the correlations 222. are extended another level by measuring the activity of neu- Duncan, J. (1996). Cooperating brain systems in selective percep- rons during different ‘attentive’ behaviors. Although this tion and action. In T. Inui and J. L. McClelland, Eds., Attention article focuses exclusively on attentive processes in the and Performance XVI. Cambridge, MA: MIT Press, pp. 549– visual system, attentive processing occurs within each sen- 578. Heinze, H. J., G. R. Mangun, W. Burchert, H. Hinrichs, M. Scholz, sory system (see AUDITORY ATTENTION) and more generally T. F. Munte, A. Gos, M. Scherg, S. Johannes, H. Hundeshagen, in most aspects of cognitive brain function. Our understand- M. S. Gazzaniga, and S. A. Hillyard. (1994). Combined spatial ing of the neuronal correlates of attention comes principally and temporal imaging of brain activity during visual selective from the study of the influence of attentive acts on visual attention in humans. Nature 372: 543–546. processing as observed in animals. The selective aspects of James, W. (1890). 
The Principles of Psychology. New York: Holt. attention are apparent in both of vision’s principal functions, Jonides, J., and S. Yantis. (1988). Uniqueness of abrupt visual identifying objects and navigating with respect to objects onset in capturing attention. Perception and Psychophysics 43: and surfaces. 346–354. Attention is a dynamic process added on top of the pas- Luria, A. R. (1966). Higher Cortical Functions in Man. London: sive elements of selection provided by the architecture of Tavistock. 42 Attention in the Animal Brain attended object and suppresses the neural response to other the visual system. For foveate animals, looking or navigat- objects in the neuron’s receptive field (Moran and Desimone ing encompasses the set of actions necessary to find a 1985; Treue and Maunsell 1996). desired goal and place it in foveal view. The selective Within the ventral stream and at progressively higher lev- aspects of attention within this context deal primarily with els of object analysis, the competitive convergence of the sys- decisions about information in the peripheral field of view. tem forces a narrowing of processing by selecting what Seeing or identifying objects encompasses a more detailed information or which objects gain control of the neural activ- analysis of centrally available information. In this context ity. The connectivity of the visual system is not simply a the selective aspects of attention deal primarily with the feedforward system but a highly interconnected concurrent delineation of objects and the integration of their parts that network. Whatever information wins the contention at one lead to their recognition (see OBJECT RECOGNITION, ANIMAL level is passed forward and backward and often laterally (Van STUDIES). Essen and DeYoe 1995). 
These factors heavily constrain the Although the retinae encode a wide expanse of the visual neural activity, limiting the activation of neurons at each sub- environment, object analysis is not uniform across the visual sequent level to a progressively restricted set of stimulus con- field but instead is concentrated in a small zone called the figurations. As increasingly higher levels of analytic field of focal attention (Neisser 1967). Under most circum- abstraction are attained in the ventral stream, the receptive stances this restricted zone has little to due with acuity lim- field convergence narrows the independence of parallel rep- its set by the receptor density gradient in the retina but is resentations until in the anterior temporal lobe the neuronal due to an interference between objects generated by the den- receptive fields encompass essentially all of the central visual sity of the visual information. Focal attention encompasses field. Because directing attention emphasizes the processing the dynamic phenomena that enable us to isolate and exam- of the object(s) at the attended location, the act of attending ine objects under the conditions of interference. The selec- both engages the ventral stream on the object(s) at that loca- tive aspect of attention raises an important question, namely, tion and dampens out information from the remaining visual how many things can be attended to at one time? Interest- field (Chelazzi et al. 1993). These circumstances parallel the ingly, the answer varies, and depends at least in part on the capacity limitation found in many forms in vision and may level of analysis that is necessary to distinguish between the constitute the performance limiting factor. Capacity limits things that are present. 
What is clear is that the moment-to- may vary depending upon the level of convergence in the cor- moment analytic capacity of the visual system is surpris- tical stream that must be reached before a discriminative ingly limited. decision about the set of observed objects can be achieved An examination of the physiology and anatomy of visual (Merigan, Nealey, and Maunsell 1993). processes in animals, especially primates, provides us with When attention is to be shifted to a new object, as we read key pieces of the attention puzzle. Visual information is dis- the next word or reach for a cup or walk along a path, infor- persed from primary visual cortex through extrastriate cor- mation must be obtained from the periphery as to the spatial tex along two main routes. One leads ventrally toward layout of objects. The dorsal stream appears to provide this anterior temporal cortex, the other dorsally into parietal information. The convergence of information within the neu- association cortex. The ventral stream progression portrays ronal receptive fields generates sensitivity to large surfaces, a system devoted to object analysis, and in anterior temporal their motion and boundaries and the positions of objects areas, represents a stage where sensory processes merge within them, without particular sensitivity to the nature of with systems associated with object recognition and mem- the objects themselves. The parietal visual system is espe- ory (Gross 1992). The dorsal stream emphasizes the posi- cially sensitive to the relative motion of surfaces such as that tions of surfaces and objects and in parietal areas represents generated during movement through the environment. An a stage where the sensory and motor processes involved in object to which attention is shifted is usually peripherally the exploration of the surrounding space become inter- located with respect to the object currently undergoing per- twined (Mountcastle 1995). 
The different emphasis in infor- ceptual analysis. For parietal cortical neurons, maintained mation processing within parietal and temporal areas is also attention on a particular object results in a heightened visual apparent with respect to the influences of attentional states. sensitivity across the visual field and, in contrast to the tem- Both the sensitivity to visual stimuli and the effective recep- poral stream, a suppressed sensitivity to objects currently at tive field size are more than doubled for parietal visual neu- the locus of directed attention (Motter 1991). rons during an attentive fixation task, whereas under similar The transition between sensory information and MOTOR conditions the receptive fields of inferior temporal cortical neurons are observed to collapse around a fixation target. CONTROL is subject to a clear capacity limitation—competi- As visual processing progresses in both the dorsal and tion between potential target goals must be resolved to pro- ventral streams, there is an accompanying expansion in the duce a coherent motor plan. When target goals are in receptive field size of individual neurons and a correspond- different locations, spatially selective processing can be ing convergence of information as an increasing number of used to identify locations or objects that have been selected objects fit within each receptive field. Despite the conver- as target goals. In the period before a movement is made, gence, an inseparable mixture of information from different the neural activity associated with the visual presence of the objects does not occur in part because of the competitive, object at the target site evolves to differentiate the target winner-take-all nature of the convergence and in part object from other objects. This spatially selective change, because of attentive selection. 
Directing attention to a par- consistent with an attentive selection process, has been ticular object alters the convergent balance in favor of the observed in parietal cortex as well as the motor eye fields of Attention in the Human Brain 43 frontal cortex and subcortical visuomotor areas such as the Schall, J. D. (1995). Neural basis of saccade target selection. Reviews in the Neurosciences 6: 63–85. superior colliculus (Schall 1995). Treue, S., and J. H. R. Maunsell. (1996). Attentional modulation of How early in the visual system are attentive influences visual motion processing in cortical areas MT and MST. Nature active? If attention effectively manipulates processing in the 382: 539–541. earliest stages of vision, then the visual experiences we have Van Essen, D. C., and E. A. DeYoe. (1995). Concurrent processing are in part built up from internal hypotheses about what we in the primate visual cortex. In M.S. Gazzaniga, Ed., The Cogni- are seeing or what we want to see. Two sets of physiological tive Neurosciences. Cambridge, MA: MIT Press, pp. 383–400. observations suggest these important consequences of selec- tive attention do occur. First, directed attention studies and Further Readings studies requiring attentive selection of stimulus features Connor C. E., J. L. Gallant, D. C. Preddie, and D. C. Van Essen. have shown that the neural coding of objects can be com- (1996). Responses in area V4 depend on the spatial relationship pletely dominated by top-down attentive demands as early between stimulus and attention. Journal of Neurophysiology as extrastriate cortex and can bias neuronal processing even 75: 1306–1308. in primary visual cortex. Second, after arriving in primary Corbetta, M., F. M. Miesin, S. Dobmeyer, G. L. Shulman, and S. E. visual cortex, visual information spreads through the corti- Petersen (1991). Selective and divided attention during visual cal systems within 60–80 msec. 
The effects of selective discriminations of shape, color and speed: functional anatomy attention develop in parallel during the next 100 msec in by positron emission tomography. Journal of Neuroscience 11: extrastriate occipital, temporal, and parietal cortex and the 2382–2402. Desimone, R., and J. Duncan. (1995). Neural mechanisms of selective frontal eye fields, making it not only difficult to pinpoint a visual attention. Annual Review of Neuroscience 18: 193–222. single decision stage but also making it likely that a coher- Friedman-Hill, S. R., L. Robertson, and A. Treisman. (1995). Pari- ent solution across areas is reached by a settling of the net- etal contributions to visual feature binding: evidence from a work (Motter 1997). patient with bilateral lesions. Science 269: 853– 855. The detailed physiological insights gained from animal Haenny, P. E., J. H. R. Maunsell, and P. H. Schiller. (1988). State studies complement the imaging studies of ATTENTION IN dependent activity in monkey visual cortex. II. Retinal and ex- THE HUMAN BRAIN that have probed higher-order cognitive traretinal factors in V4. Experimental Brain Research 69: 245– functions and attempted to identify the neural substrates of 259. volitional aspects of attention. Together these sets of studies Haxby, J., B. Horwitz, L. G. Ungerleider, J. Maisog, P. Pietrini, have provided new views of several classic phenomena in and C. Grady. (1994). The functional organization of human extra-striate cortex: a PET-rCBF study of selective attention to attention including capacity limitations and the temporal faces and locations. Journal of Neuroscience 14: 6336–6353. progression of selection. Koch, C., and S. Ullman. (1985). Shifts in selective visual atten- See also ATTENTION IN THE HUMAN BRAIN; EYE MOVE- tion: towards the underlying neural circuitry. Human Neurobi- MENTS AND VISUAL ATTENTION; VISUAL ANATOMY AND ology 4: 219–227. PHYSIOLOGY; VISUAL PROCESSING STREAMS Motter, B. C. (1994). 
Neural correlates of color and luminance fea- ture selection in extrastriate area V4. Journal of Neuroscience —Brad Motter 14: 2178–2189. Olshausen B., C. Andersen, and D. C. Van Essen. (1993). A neural References model of visual attention and invariant pattern recognition. Journal of Neuroscience 13: 4700–4719. Chelazzi, L., E. K. Miller, J. Duncan, and R. Desimone. (1993). A Petersen, S. E., P. T. Fox, M. I. Posner, M. Mintun, and M. E. Raichle. neural basis for visual search in inferior temporal cortex. (1988). Positron emission tomographic studies of the cortical Nature 363: 345–347. anatomy of single-word processing. Nature 331: 585–589. Gross, C. G. (1992). Representation of visual stimuli in inferior Richmond, B. J., R. H. Wurtz, and T. Sato. (1983). Visual temporal cortex. Philosophical Transactions of the Royal Soci- responses of inferior temporal neurons in the awake rhesus ety, London, Series B 335: 3–10. monkey. Journal of Neurophysiology 50: 1415–1432. Merigan, W. H., and J. H. R. Maunsell. (1993). How parallel are Robinson, D. L. (1993). Functional contributions of the primate the primate visual pathways? Annual Review of Neuroscience pulvinar. Progress in Brain Research 95: 371–380. 16: 369–402. Schiller, P. H., and K. Lee. (1991). The role of the primate extra- Merigan, W. H., T. A. Nealey, and J. H. R. Maunsell. (1993). striate area V4 in vision. Science 251: 1251–1253. Visual effects of lesions of cortical area V2 in macaques. Jour- Tsotsos, J. K., S. M. Culhane, W. Y. K. Wai, Y. Lai, N. Davis, and nal of Neuroscience 13: 3180–3191. F. Nuflo. (1995). Modeling visual attention via selective tuning. Moran, J., and R. Desimone. (1985). Selective attention gate visual Artificial Intelligence 78: 507–545. processing in the extrastriate cortex. Science 229: 782–784. Zipser, K., V. A. F. Lamme, and P. H. Schiller. (1996). Contextual Motter, B. C. (1991). Beyond extrastriate cortex: the parietal visual modulation in primary visual cortex. 
Journal of Neuroscience 16: 7376–7389.
system. In A. G. Leventhal, Ed., Vision and Visual Dysfunction, vol. 4, The Neural Basis of Visual Function. London: Macmillan, pp. 371–387.
Motter, B. C. (1998). Neurophysiology of visual attention. In R. Parasuraman, Ed., The Attentive Brain. Cambridge, MA: MIT Press.
Mountcastle, V. B. (1995). The parietal system and some higher brain functions. Cerebral Cortex 5: 377–390.
Neisser, U. (1967). Cognitive Psychology. New York: Appleton-Century-Crofts.

Attention in the Human Brain

To illustrate what is meant by attention, consider the display in figure 1. Your ATTENTION may be drawn to the tilted T because it differs in such a striking way from the background. When one figure differs from the background by a single feature, it pops out and your attention is drawn to it. This is an example of attention driven by input. However, if you know that the target is an L you can guide your search among the stimuli with horizontal and vertical strokes. This is an example of the form of higher level voluntary control that is the subject of this section. Voluntary control is accompanied by the subjective feeling of selection between potential actions and is one of the most distinctive features of human experience. The interaction of top-down voluntary actions with bottom-up automatic processes, which is illustrated by figure 1, has interested researchers since the beginnings of psychology (James 1890; see WILLIAM JAMES).

Figure 1.

One approach to the question of voluntary control is to argue that it is an illusion that arises out of the competitive activation of a large number of brain systems. What appears to be volition is the result of a complex network relaxing to a particular state. Without denying the top-down component, this view stresses the bottom-up processes. A different view elaborated in this section is that there is a high-level executive attention network with its own anatomy that works to resolve competition and in the process gives rise to subjective feelings of cognitive control (Norman and Shallice 1986). This view emphasizes the top-down control.

The executive system participates in tasks that involve conflict between systems. This is a property one expects to find in a system that has as a major function the inhibition of reflexive or bottom-up responses to external stimuli in order to allow autonomous action. A classical paradigm to study the inhibition of habitual responses is the Stroop task (Stroop 1935). In this task, subjects name the color of the ink of a word. Sometimes, the word is a color name (e.g., red) in a different ink color (e.g., blue). In those incongruent trials subjects automatically read “red” and have to inhibit this answer to respond “blue.” Inhibition produces interference, revealed by slower reaction times in the incongruent condition.

Is it possible to uncover the neural substrates of cognitive control? Imaging techniques developed in the last several years have yielded promising results (Toga and Mazziotta 1996). An area in the medial surface of the frontal lobe, named the anterior cingulate gyrus, appears to be important for the inhibition of automatic responses that is central to voluntary action. Five studies involving measurement of blood flow by POSITRON EMISSION TOMOGRAPHY (PET) in a Stroop task have shown activation of the anterior cingulate in the incongruent condition when compared with the congruent condition (e.g., the noun blue displayed in blue color) or the neutral condition (a noncolor word; see Posner and DiGirolamo 1998 for a review). Other tasks requiring inhibition of habitual responses also activate the anterior cingulate. For example, responding to a noun by generating an associated use produces more activation of the anterior cingulate than simply repeating the noun (Petersen et al. 1989). In the generate condition, the most familiar response (i.e., repeating the noun) needs to be repressed, to allow the expression of the verb. Classifying a noun into a category also produces cingulate activation related to the number of targets. This finding suggests that the anterior cingulate activation is due to special processing of the target rather than being necessary to make the classification, a result consistent with the idea of cognitive control.

The cingulate has close connections to underlying subcortical areas in the BASAL GANGLIA (Houk 1995). These areas have also shown activity in some of the same tasks described above and play a role in the inhibition of reflexive motor responses. It seems likely they form part of the network subserving this form of voluntary control.

Goldberg and Bloom (1990) proposed a “dual premotor system hypothesis” of volitional movement. This theory, which attributes an executive function to the anterior cingulate and the supplementary motor area, was developed to explain the alien hand sign. The alien hand sign is the performance of apparently purposive movements that the patient fails to recognize as self-generated. The theory posits a lateral premotor system (LPS; Area 6), which organizes motor behavior in reaction to external stimuli, and a medial premotor system (MPS; anterior cingulate, supplementary motor area, and basal ganglia loops), which underlies intentional behavior. MPS underlies volitional movement by inhibiting the LPS. If a lesion occurs in MPS, LPS is released and obligatory dependence on external information emerges. The patient develops compulsive automatisms, which are not perceived as self-generated. The inhibitory effect of MPS over LPS during volitional movement resembles the inhibitory effect of MPS (i.e., anterior cingulate) over semantic networks during the Stroop task. The idea of alien rather than self control is also found in some forms of schizophrenia, a disorder that has also been shown to involve abnormalities in the anterior cingulate and basal ganglia (Benes 1993; Early 1994).

Cognitive studies have shown several forms of short-term or WORKING MEMORY and considerable independence between them (Baddeley 1986). Recent imaging data show that verbal, spatial, and object memories involve separate anatomical areas (Smith and Jonides 1995). There is evidence that all forms of memory are interfaced to a common executive system that involves the same midline frontal anatomy described previously (Baddeley 1986; Posner and Raichle 1994).

PET studies have also shown that executive attention plays an important role in high-level skills (Kosslyn 1994; Posner and Raichle 1994). Studies involving recording from scalp electrodes have provided some information on the time course of the activations found in PET studies during reading. Skills such as READING have a very strong dependence on rapid processing. A skilled reader fixates on a given word for only about 275 msec (Rayner and Sereno 1994). In generating the use of visual words, activation of the cingulate begins as early as 150 msec after input when blocks of trials in which subjects derive a word meaning alternate with blocks in which they read the word aloud (Snyder et al. 1995). The cingulate activation occurs whenever higher level supervisory control is needed to organize the mental response to the input. In the case of generating the use of a word, attention leads and probably is required for the activation of a network of areas that lead eventually to articulation of novel ideas associated with the input string. We see an early semantic analysis of the input word after 200 msec and development of associations to the input in frontal and parietal sites over the next second. Although it is possible to lay out a sequence of processing steps, such a sequence can be misleading. Because attention may occur rather early, it is possible for subjects to reprogram the organization of these steps and thus to carry out a number of different instructions with the same brain network. Studies of the role of attention suggest that reorganization involves amplification of the operations that are attended in comparison to unattended operations. Increases in overall neuronal activity appear to produce faster speed and higher priority for the attended computations. As attention is released from high order activity during practice in the skill, it becomes possible to improve the speed of performance by amplification of early processing steps. Studies of mental arithmetic, visual IMAGERY, and other forms of skilled performance using neuroimaging methods seem to support many of the same principles that have been outlined above for word reading.

See also ATTENTION IN THE ANIMAL BRAIN; AUDITORY ATTENTION; ELECTROPHYSIOLOGY, ELECTRIC AND MAGNETIC FIELDS; EYE MOVEMENTS AND VISUAL ATTENTION; ILLUSIONS; MAGNETIC RESONANCE IMAGING; TOP-DOWN PROCESSING IN VISION; VISUAL WORD RECOGNITION

Michael I. Posner and Diego Fernandez-Duque

References

Baddeley, A. (1986). Working Memory. Oxford: Oxford University Press.
Benes, F. M. (1993). Relationship of cingulate cortex to schizophrenia and other psychiatric disorders. In B. A. Vogt and M. Gabriel, Eds., Neurobiology of Cingulate Cortex and Limbic Thalamus. Boston: Birkhauser.
Early, T. S. (1994). Left globus pallidus hyperactivity and right-sided neglect in schizophrenia. In R. L. Cromwell and C. R. Snyder, Eds., Schizophrenia: Origins, Processes, Treatment and Outcome. New York: Oxford University Press, pp. 17–30.
Goldberg, G., and K. K. Bloom. (1990). The alien hand sign. American Journal of Physical Medicine and Rehabilitation 69(5): 228–238.
Houk, J. C. (1995). Information processing in modular circuits linking basal ganglia and cerebral cortex. In J. C. Houk, J. L. Davis, and D. G. Beiser, Eds., Models of Information Processing in the Basal Ganglia. Cambridge, MA: Bradford, pp. 3–10.
Kosslyn, S. M. (1994). Image and Brain. Cambridge, MA: MIT Press.
Norman, D. A., and T. Shallice. (1986). Attention to action: willed and automatic control of behavior. In R. J. Davidson, G. E. Schwartz, and D. Shapiro, Eds., Consciousness and Self Regulation. New York: Plenum, pp. 1–17.
Petersen, S. E., P. T. Fox, M. I. Posner, M. Mintun, and M. E. Raichle. (1989). Positron emission tomographic studies of the processing of single words. Journal of Cognitive Neuroscience 1: 153–170.
Posner, M. I., and G. J. DiGirolamo. (1998). Conflict, target detection and cognitive control. In R. Parasuraman, Ed., The Attentive Brain. Cambridge, MA: MIT Press.
Posner, M. I., and M. E. Raichle. (1994). Images of Mind. New York: Scientific American Library.
Rayner, K., and S. C. Sereno. (1994). Eye movements in reading: psycholinguistic studies. In M. A. Gernsbacher, Ed., Handbook of Psycholinguistics. New York: Academic Press, pp. 57–81.
Smith, E. E., and J. Jonides. (1995). Working memory in humans: neuropsychological evidence. In M. S. Gazzaniga, Ed., The Cognitive Neurosciences. Cambridge, MA: MIT Press, pp. 1009–1020.
Snyder, A. Z., Y. Abdullaev, M. I. Posner, and M. E. Raichle. (1995). Scalp electrical potentials reflect regional cerebral blood flow responses during processing of written words. Proceedings of the National Academy of Sciences 92: 1689–1693.
Stroop, J. R. (1935). Studies of interference in serial verbal reactions. Journal of Experimental Psychology 18: 643–662.
Toga, A. W., and J. C. Mazziotta, Eds. (1996). Brain Mapping: The Methods. New York: Academic Press.

Further Readings

Bisiach, E. (1992). Understanding consciousness: clues from unilateral neglect and related disorders. In A. D. Milner and M. D. Rugg, Eds., The Neuropsychology of Consciousness. London: Academic Press, pp. 113–139.
Burgess, P. W., and T. Shallice. (1996). Response suppression, initiation and strategy use following frontal lobe lesions. Neuropsychologia 34: 263–273.
Chelazzi, L., E. K. Miller, J. Duncan, and R. Desimone. (1993). A neural basis for visual search in inferior temporal cortex. Nature 363: 345–347.
D'Esposito, M., J. A. Detre, D. C. Alsop, R. K. Shin, S. Atlas, and M. Grossman. (1995). The neural basis of the central executive system of working memory. Nature 378: 279–281.
Démonet, J. F., R. Wise, and R. S. J. Frackowiak. (1993). Language functions explored in normal subjects by positron emission tomography: a critical review. Human Brain Mapping 1: 39–47.
Graves, R. E., and B. S. Jones. (1992). Conscious visual perceptual awareness vs. non-conscious visual spatial localisation examined with normal subjects using possible analogues of blindsight and neglect. Cognitive Neuropsychology 9(6): 487–508.
Jonides, J. P. (1981). Voluntary versus automatic control over the mind's eye. In J. Long and A. Baddeley, Eds., Attention and Performance IX. Hillsdale, NJ: Erlbaum, pp. 187–204.
LaBerge, D. (1995). Attentional Processing: The Brain's Art of Mindfulness. Cambridge, MA: Harvard University Press.
Pardo, J. V., P. T. Fox, and M. E. Raichle. (1991). Localization of a human system for sustained attention by positron emission tomography. Nature 349(6304): 61–64.
Pardo, J. V., P. J. Pardo, K. W. Janer, and M. E. Raichle. (1990). The anterior cingulate cortex mediates processing selection in the Stroop attentional conflict paradigm. Proceedings of the National Academy of Sciences 87: 256–259.
Posner, M. I., G. J. DiGirolamo, and D. Fernandez-Duque. (1997). Brain mechanisms of cognitive skills. Consciousness and Cognition 6: 267–290.
Rafal, R. D. (1994). Neglect. Current Opinion in Neurobiology 4: 231–236.
Stuss, D. T., T. Shallice, M. P. Alexander, and T. W. Picton. (1995). A multidisciplinary approach to anterior attention functions. In J. Grafman, K. J. Holyoak, and F. Boller, Eds., Structure and Functions of the Human Prefrontal Cortex. New York: New York Academy of Sciences.
Umiltà, C. (1988). Orienting of attention. In F. Boller and J. Grafman, Eds., Handbook of Neuropsychology. Amsterdam: Elsevier, pp. 175–192.

Attribution Theory

Because humans are social animals, an individual’s prospects for survival and success depend on the ability to understand, predict, and influence the behavior of other persons. Hence “people watching” is an essential human impulse. Yet it does not suffice to merely watch other people’s overt actions; we strive to infer why people behave as they do. The psychological processes underlying these interpretations of the causes of behavior are studied in a subfield of social psychology known as attribution theory.

Although all proposed models of attribution assume that accurate understanding of the actual causes of behavior is a primary goal, models differ in assumptions about processing limitations that impede accuracy and about other human goals that interfere with accuracy. As in many areas of cognitive science, such as perception and DECISION MAKING, researchers often attempt to learn how the system works from where it fails, testing predictions about the errors that would arise from a process. One pattern of error is the tendency of observers to overestimate how much another’s behavior is determined by the person’s stable traits, a tendency first described by Ichheiser (1949) and formalized by Ross (1977) as “the fundamental attribution error” (FAE). Errors in attributing to stable dispositions (e.g., aptitudes, attitudes, and traits) have important consequences for the observer’s subsequent expectancies, evaluations, and behavioral responses. For example, when a teacher attributes a student’s failure to lack of intelligence (as opposed to a situational factor), this leads to reduced expectations of the student’s future success, reduced liking of the student, and reduced teaching investment in the student. We will review theory and research on the FAE to illustrate how attribution theory has progressed.

The blueprint for attribution theory was Heider’s (1958) GESTALT PERCEPTION analysis of interpersonal interaction. He argued that a person’s response to a social situation is largely a function of how the person subjectively organizes the stimulus of a social situation, such as through attributions. Perhaps the most influential idea is that attributions are guided by lay theories such as the schema that achievement reflects both situational forces (environmental factors that facilitate or constrain the actor) and internal forces (the combination of effort and aptitude). Heider contended that the attributor, like a scientist, uses such theories in combination with his or her observations. However, the attributor errs because his or her observations are distorted by perceptual processes and, sometimes, emotional processes. For Heider, the FAE results from Gestalt perceptual processes that draw an observer’s attention to the other person rather than to the situation surrounding the other person; that is, the person is “figural” against the “ground” of the situation.

A key development in research on the perceptual interpretation of the FAE was Jones and Nisbett’s (1972) argument that although to an observer of action the person is figural, to the actor the situation is figural and hence self-attributions for behavior are less dispositional. The perceptual account of this actor-observer difference was supported by experiments that presented actors with the visual perspective on their behavior of an observer (by use of videotape) and found that their self-attributions became more dispositional (Storms 1973). Nevertheless, evidence also emerged for nonperceptual interpretations of why first-person explanations differ. An actor draws on different information: information about one’s mental state while acting and about one’s behavior in the past (see Eisen 1979).

A shift toward emphasis on cognitive mechanisms came in the research programs constructed on the foundation of Kelley’s (1967 and 1972) models. Kelley’s (1967) covariation model focused on cases where an uncertain or curious observer generates an attribution “bottom up” from the data provided by multiple instances of a behavior. Attributors induce the general locus of causation for a behavior by assessing how the behavior covaries with the actor, the situation, and the temporal occasion. For example, to interpret why Sue’s date John is dancing by himself, one would consider whether a consensus of people at the party are dancing alone, whether dancing alone is something that Sue’s dates have often done, and whether it is something that John has often done. In tests of the model, participants generally respond to summaries of covariation data roughly as predicted, with the exception that consensus information is under-weighted (McArthur 1972). Biases in INDUCTION have been interpreted in terms of “extra” information implicitly communicated to participants (see GRICE; Hilton and Slugoski 1985; McGill 1989) or in terms of “missing” information that participants lack (Cheng and Novick 1992). Of late, research on causal induction has merged with the field of CAUSAL REASONING.

To model more typical cases of attribution where people lack time and energy to work “bottom up,” Kelley (1972) proposed that people interpret a single instance of behavior “top down” from a theory. For example, an attributor who applies the Multiple Sufficient Causes (MSC) schema follows the discounting principle that if one of two alternative causes is present then the other is less likely. Tests of this model, however, found that a dispositional attribution is not fully discounted by information about the presence of a sufficient situational cause for the behavior (Snyder and Jones 1974). This manifestation of the FAE was interpreted primarily in terms of human cognitive limitations that require the use of JUDGMENT HEURISTICS, such as anchoring, when making discounting inferences about a dispositional cause (Jones 1979).

To integrate insights about different mechanisms contributing to the FAE, researchers have proposed sequential stage models: An initial perception-like process traces an actor’s behavior to a corresponding disposition, then a second inference-like process adjusts the initial attribution to account for any situational factors. Whereas the first stage is posited to be automatic, much like a perceptual module, the second stage requires effort and attention and hence takes place only if these are available, that is, if the attributor is not “cognitively busy.” In support of a dual process model, experiments find that increasing participants’ “busyness” results in more dispositional attributions (Gilbert, Pelham, and Krull 1988). However, it remains unclear whether the initial dispositional process is a perceptual module or merely a well-learned schematic inference.

Reacting against analyses of attribution as a decontextualized cognitive task, another recent theme is that attributions are markedly influenced by goals related to particular social contexts. The goal of assigning blame seems to accentuate the FAE (Shaver 1985). The goal of maintaining self-esteem leads people to dispositional attributions for successes but not failures (Snyder, Stephan, and Rosenfield 1976). The goal of making a good impression on an audience mitigates the FAE (Tetlock 1985). Studies of attributions in context have brought renewed attention to the important consequences of attribution, for example, the relation of dispositional attributions to sanctioning decisions (Carroll and Payne 1977). Applied research has found that self-serving styles of attribution not only protect an individual against clinical depression (Abramson, Seligman, and Teasdale 1978) but also contribute to achievement motivation and performance (Weiner 1985).

A current direction of attribution research involves closer attention to the knowledge structures that shape causal explanations. Explanations are constructed in order to cohere with the content knowledge triggered by observation of behavior, such as stereotypes and scripts (Read and Miller 1993). They are also constrained by frames for what constitutes an EXPLANATION in a given setting (Pennington and Hastie 1991). The guiding role of knowledge structures elucidates why the pattern of attribution errors differs across individuals, institutions, and cultures. Recent evidence points to individual differences (Dweck, Hong, and Chiu 1993) and cultural differences (Morris and Peng 1994) in the underlying causal schemas or lay theories that guide attribution. For example, findings that the FAE is stronger in Western, individualistic societies than in collectivist societies such as China seem to reflect different lay theories about the autonomy of individuals relative to social groups. Closer measurement of causal schemas and theories reveals that attribution errors previously interpreted in terms of processing limitations, such as the incomplete discounting of dispositions, reflect participants’ knowledge structures rather than their inferential processes (Morris and Larrick 1995).

In sum, attribution theory has moved beyond the identification of errors like the FAE to an understanding of how they are produced by perceptual and cognitive limitations, by contextual goals, and by knowledge structures. As the story of research on the FAE phenomenon illustrates, attribution theory uncovers reciprocal relations between individual cognition, on one hand, and social and cultural contexts, on the other hand, and hence bridges the cognitive and social sciences.

See also ECONOMICS AND COGNITIVE SCIENCE; SCHEMATA; SOCIAL COGNITION

Michael W. Morris, Daniel Ames, and Eric Knowles

References

Abramson, L. Y., M. E. P. Seligman, and J. Teasdale. (1978). Learned helplessness in humans: critique and reformulation. Journal of Abnormal Psychology 87: 49–74.
Carroll, J. S., and J. W. Payne. (1977). Crime seriousness, recidivism risk and causal attribution in judgments of prison terms by students and experts. Journal of Applied Psychology 62: 595–602.
Cheng, P. W., and L. R. Novick. (1992). Covariation in natural causal induction. Psychological Review 99: 365–382.
Dweck, C. S., Y. Hong, and C. Chiu. (1993). Implicit theories: individual differences in the likelihood and meaning of dispositional inference. Personality and Social Psychology Bulletin 19: 644–656.
Eisen, S. V. (1979). Actor-observer differences in information inference and causal attribution. Journal of Personality and Social Psychology 37: 261–272.
Gilbert, D. T., B. W. Pelham, and D. S. Krull. (1988). On cognitive busyness: when person perceivers meet persons perceived. Journal of Personality and Social Psychology 54: 733–740.
Heider, F. (1958). The Psychology of Interpersonal Relations. New York: Wiley.
Hilton, D. J., and B. R. Slugoski. (1985). Knowledge-based causal attribution: the abnormal conditions focus model. Psychological Review 93: 75–88.
Ichheiser, G. (1949). Misunderstandings in human relations. American Journal of Sociology 55: 150–170.
Jones, E. E. (1979). The rocky road from acts to dispositions. American Psychologist 34: 107–117.
Jones, E. E., and R. E. Nisbett. (1972). The actor and the observer: divergent perceptions of the causes of behavior. In E. E. Jones et al., Eds., Attribution: Perceiving the Causes of Behavior. Morristown, NJ: General Learning Press.
Kelley, H. H. (1967). Attribution theory in social psychology. In D. Levine, Ed., Nebraska Symposium on Motivation. Lincoln: University of Nebraska Press.
Kelley, H. H. (1972). Causal schemata and the attribution process. In E. E. Jones et al., Eds., Attribution: Perceiving the Causes of Behavior. Morristown, NJ: General Learning Press.
McArthur, L. Z. (1972). The how and what of why: some determinants and consequences of causal attributions. Journal of Personality and Social Psychology 22: 171–193.
Morris, M. W., and R. Larrick. (1995). When one cause casts doubt on another: a normative model of discounting in causal attribution. Psychological Review 102(2): 331–335.
Morris, M. W., and K. Peng. (1994). Culture and cause: American and Chinese attributions for social and physical events. Journal of Personality and Social Psychology 67: 949–971.
Pennington, N., and R. Hastie. (1991). A cognitive theory of juror decision making: the story model. Cardozo Law Review 13: 519–557.
Read, S. J., and L. C. Miller. (1993). Rapist or “regular guy”: explanatory coherence in the construction of mental models of others. Personality and Social Psychology Bulletin 19: 526–540.
Ross, L. (1977). The intuitive psychologist and his shortcomings: distortions in the attribution process. In L. Berkowitz, Ed., Advances in Experimental Social Psychology, vol. 10. New York: Academic Press, pp. 174–221.
Shaver, K. G. (1985). The Attribution of Blame. New York: Springer.
Snyder, M. L., and E. E. Jones. (1974). Attitude attribution when behavior is constrained. Journal of Experimental Social Psychology 10: 585–600.
Snyder, M. L., W. G. Stephan, and D. Rosenfield. (1976). Egotism and attribution. Journal of Personality and Social Psychology 33: 435–441.
Storms, M. D. (1973). Videotape and the attribution process: reversing actors' and observers' points of view. Journal of Personality and Social Psychology 27: 165–175.
Tetlock, P. E. (1985). Accountability: A social check on the fundamental attribution error. Social Psychology Quarterly 48: 227–236.
Weiner, B. (1985). An attributional theory of achievement motivation and emotion. Psychological Review 92: 548–573.

Further Readings

Gilbert, D. T., and P. S. Malone. (1995). The correspondence bias. Psychological Bulletin 117: 21–38.
Jones, E. E., D. E. Kanouse, H. H. Kelley, R. E. Nisbett, S. Valins, and B. Weiner, Eds. (1972). Attribution: Perceiving the Causes of Behavior. Morristown, NJ: General Learning Press.
Kunda, Z. (1987). Motivated inference: self-serving generation and evaluation of causal theories. Journal of Personality and Social Psychology 53: 636–647.
McArthur, L. Z., and R. M. Baron. (1983). Toward an ecological theory of social perception. Psychological Review 90: 215–238.
McGill, A. L. (1989). Context effects in judgments of causation. Journal of Personality and Social Psychology 57: 189–200.
Michotte, A. E. (1946). La Perception de la Causalité. Paris: J. Vrin. Published in English (1963) as The Perception of Causality. New York: Basic Books.
Nisbett, R. E., and L. Ross. (1980). Human Inference: Strategies and Shortcomings of Social Judgment. Englewood Cliffs, NJ: Prentice-Hall.
Read, S. J. (1987). Constructing causal scenarios: a knowledge structure approach to causal reasoning. Journal of Personality and Social Psychology 52: 288–302.
Regan, D. T., and J. Totten. (1975). Empathy and attribution: turning observers into actors. Journal of Personality and Social Psychology 32: 850–856.
Schank, R. C., and R. P. Abelson. (1977). Scripts, Plans, Goals and Understanding. Hillsdale, NJ: Erlbaum.
Winter, L., and J. S. Uleman. (1984). When are social judgments made? Evidence for the spontaneousness of trait inferences. Journal of Personality and Social Psychology 47: 237–252.

Audition

Audition refers to the perceptual experience associated with stimulation of the sense of hearing. For humans, the sense of hearing is stimulated by acoustical energy (sound waves) that enters the outer ear (pinna and external auditory meatus) and sets into vibration the eardrum and the attached bones (ossicles) of the middle ear, which transfer the mechanical energy to the inner ear, the cochlea. The auditory system can also be stimulated by bone conduction (Tonndorf 1972) when the sound source causes the bones of the skull to vibrate (e.g., one’s own voice may be heard by bone conduction). Mechanical energy is transduced into neural impulses within the cochlea through the stimulation of the sensory hair cells, which synapse on the eighth cranial, or auditory, nerve. In addition to the ascending, or afferent, auditory pathway from the cochlea to the cortex, there is a descending, efferent, pathway from the brain to the cochlea, although the functional significance of the efferent pathway is not well understood at present (Brugge 1992). Immediately following stimulation, the auditory system may become less sensitive due to adaptation or fatigue (Ward 1973), and prolonged high-intensity stimulation can damage the sensory process (noise-induced hearing loss; for a series of review articles, see J. Acoust. Soc. Am. 1991, vol. 90: 124–227).

The auditory system is organized tonotopically such that the frequency of a stimulating sound is mapped onto a location along the basilar membrane within the cochlea, providing a place code (cf. AUDITORY PHYSIOLOGY). For example, low-frequency tones lead to maximal displacement of the apical portion of the basilar membrane and high-frequency tones lead to maximal displacement of the basal portion of the basilar membrane. In addition, cells exhibit frequency selectivity throughout the auditory pathway (e.g., Pickles 1988). This tonotopic organization provides a basis for spectral analysis of sounds. Temporal aspects of the stimulus (waveform fine structure or envelope) are preserved in the pattern of activity of auditory nerve fibers (Kiang et al. 1965), providing a basis for the coding of synchronized activity both across frequency and across the two ears. The dual presence of place and timing cues is pervasive in models of auditory perception.

The percept associated with a particular sound might be described in a variety of ways, but descriptions in terms of pitch, loudness, timbre, and perceived spatial location are probably the most common (Blauert 1983; Yost 1994; Moore 1997). Pitch is most closely associated with sound frequency, or the fundamental frequency for complex periodic sounds; loudness is most closely associated with sound intensity; and timbre is most closely associated with the distribution of acoustic energy across frequency (i.e., the shape of the power spectrum). The perceived location of a sound in space (direction and distance) is based primarily on the comparison of the sound arriving at the two ears (binaural hearing) and the acoustical filtering associated with the presence of the head and pinnae. Each of these perceptual classifications also depends on other factors, particularly when complex, time-varying sounds are being considered.

The frequency range of human hearing extends from a few cycles per second (Hertz, abbreviated Hz) to about 20,000 Hz, although the upper limit of hearing decreases markedly with age (e.g., Weiss 1963; Stelmachowicz et al. 1989). The intensity range of human hearing extends over many orders of magnitude depending on frequency; at 2–4 kHz, the range may be greater than twelve orders of magnitude (120 decibels, abbreviated dB).

Despite the wide dynamic range of human hearing, the auditory system is remarkably acute: the just-discriminable difference (JND) in frequency is as small as 0.2 percent (e.g., Wier, Jesteadt, and Green 1977) and in intensity is approximately one dB (e.g., Jesteadt, Wier, and Green 1977). Sensitivity to differences in sounds arriving at the two ears is perhaps even more remarkable: time delays as small as a few microseconds may be discerned (Klumpp and Eady 1956). Although behavioral estimates of the JND for intensity, frequency, etc., provide invaluable information regarding the basic properties of the human auditory system, it is important to keep in mind that estimates of JNDs depend on both sensory and nonsensory factors such as memory and attention (e.g., Harris 1952; Durlach and Braida 1969; Berliner and Durlach 1972; Howard et al. 1984).

The interference one sound causes in the reception of another sound is called masking. Masking has a peripheral component resulting from interfering/overlapping patterns of excitation in the auditory nerve (e.g., Greenwood 1961), and a central component due to uncertainty, sometimes called “informational masking” (Watson 1987; see also AUDITORY ATTENTION). In a classic experiment, Fletcher (1940) studied the masking of a tone by noise in order to evaluate the frequency selectivity of the human auditory system. To account for the obtained data, Fletcher proposed a “critical band” that likened the ear to a bandpass filter (or, to encompass the entire frequency range, a set of contiguous, overlapping bandpass filters). This proposed “auditory filter” is a theoretical construct that reflects frequency selectivity present in the auditory system, and, in one form or another, auditory filters comprise a first stage in models of the spectrotemporal (across frequency and time) analysis performed by the auditory system (e.g., Patterson and Moore 1986).

The separation of sound into multiple frequency channels is not sufficient to provide a solution to the problem of sound segregation. Sound waves from different sources simply add, meaning that the frequencies shared by two or more sounds are processed en masse at the periphery. In order to form distinct images, the energy at a single frequency must be appropriately parsed. The computations used to achieve sound segregation depend on the coherence/incoherence of sound onsets, the shared/unshared spatial location of the sound sources, differences in the harmonic structure of the sounds, and other cues in the physical stimulus. Yost has proposed that the spectrotemporal and spatial-location analysis performed by the auditory system serves the purpose of sound source determination (Yost 1991) and allows the subsequent organization of sound images into an internal map of the acoustic environment (Bregman 1990).

Approximately 28 million people in the United States suffer from hearing loss, and a recent census indicated that deafness and other hearing impairments ranked sixth among chronic conditions reported (National Center for Health Statistics 1993). Among those aged sixty-five and older, deafness and other hearing impairments ranked third among chronic conditions. The assessment of function and nonmedical remediation of hearing loss is typically performed by an audiologist, whereas the diagnosis and treatment of ear disease is performed by an otologist.

See also AUDITORY PLASTICITY; PHONOLOGY, ACQUISITION OF; PSYCHOPHYSICS; SIGN LANGUAGE AND THE BRAIN; SPEECH PERCEPTION

Virginia M. Richards and Gerald D. Kidd, Jr.

References

Berliner, J. E., and N. I. Durlach. (1972). Intensity perception IV. Resolution in roving-level discrimination. J. Acoust. Soc. Am. 53: 1270–1287.
Blauert, J. (1983). Spatial Hearing. Cambridge, MA: MIT Press.
Bregman, A. S. (1990). Auditory Scene Analysis. Cambridge, MA: MIT Press.
Brugge, J. F. (1992). An overview of central auditory processing. In A. N. Popper and R. R. Fay, Eds., The Mammalian Auditory Pathway: Neurophysiology. New York: Springer.
Durlach, N. I., and L. D. Braida. (1969). Intensity perception I. Preliminary theory of intensity resolution. J. Acoust. Soc. Am. 46: 372–383.
Fletcher, H. (1940). Auditory patterns. Rev. Mod. Phys. 12: 47–65.
Greenwood, D. D. (1961). Auditory masking and the critical band. J. Acoust. Soc. Am. 33: 484–502.
Harris, J. D. (1952). The decline of pitch discrimination with time. J. Exp. Psych. 43: 96–99.
Howard, J. H., A. J. O’Toole, R. Parasuraman, and K. B. Bennett. (1984). Pattern-directed attention in uncertain-frequency detection. Percept. Psychophys. 35: 256–264.
Jesteadt, W., C. C. Wier, and D. M. Green. (1977). Intensity discrimination as a function of frequency and sensation level. J. Acoust. Soc. Am. 61: 169–177.
Kiang, N. Y-S., T. Watanabe, E. C. Thomas, and L. F. Clark. (1965). Discharge Patterns of Single Fibers in the Cat's Auditory Nerve. Cambridge, MA: MIT Press.
Klumpp, R., and H. Eady. (1956). Some measurements of interaural time difference thresholds. J. Acoust. Soc. Am. 28: 859–864.
Moore, B. C. J. (1997). An Introduction to the Psychology of Hearing. Fourth edition. London: Academic Press.
National Center for Health Statistics. (1993). Vital statistics: prevalence of selected chronic conditions: United States 1986–1988, Series 10. Data from National Health survey #182, USHHS, PHS.
Patterson, R. A., and B. C. J. Moore. (1986). Auditory filters and excitation patterns as representations of frequency resolution. In B. C. J. Moore, Ed., Frequency Selectivity in Hearing. New York: Academic Press.
Pickles, J. O. (1988). An Introduction to the Physiology of Hearing. Second edition. London: Academic Press.
Stelmachowicz, P. G., K. A. Beauchaine, A. Kalberer, and W. Jesteadt. (1989). Normative thresholds in the 8- to 20-kHz range as a function of age. J. Acoust. Soc. Am. 86: 1384–1391.
Tonndorf, J. (1972). Bone conduction. In J. V. Tobias, Ed., Foundations of Modern Auditory Theory II. New York: Academic Press.
Ward, W. D. (1973). Adaptation and Fatigue. In J. Jerger, Ed., Modern Developments in Audiology. New York: Academic Press.
Watson, C. S. (1987). Uncertainty, informational masking, and the capacity of immediate auditory memory. In W. A. Yost and C. S. Watson, Eds., Auditory Processing of Complex Sounds. Hillsdale, NJ: Erlbaum.
Weiss, A. D. (1963). Auditory perception in relation to age. In J. E. Birren, R. N. Butler, S. W. Greenhouse, L. Sokoloff, and M. Tarrow, Eds., Human Aging: a Biological and Behavioral Study. Bethesda: NIMH.
Wier, C. C., W. Jesteadt, and D. M. Green. (1977). Frequency discrimination as a function of frequency and sensation level. J. Acoust. Soc. Am. 61: 178–184.
Yost, W. A. (1991). Auditory image perception and analysis. Hear. Res. 56: 8–18.
Yost, W. A. (1994). Fundamentals of Hearing: An Introduction. San Diego: Academic Press.

Further Readings

Gilkey, R. A., and T. R. Anderson. (1997). Binaural and Spatial Hearing in Real and Virtual Environments. Hillsdale, NJ: Erlbaum.
Green, D. M. (1988). Profile Analysis: Auditory Intensity Discrimination. Oxford: Oxford Science Publications.
Hamernik, R. P., D. Henderson, and R. Salvi. (1982). New Perspectives on Noise-Induced Hearing Loss. New York: Raven.
Hartmann, W. M. (1997). Signals, Sound and Sensation. Woodbury, NY: AIP Press.
Anderson 1977). In addition, Benson and Heinz (1978), studying single cells in monkey primary auditory cortex
NIH (1995).
NIH Consensus Development Conferences on during a selective attention task (dichotic listening), Cochlear Implants in Adults and Children. Bethesda: NIH. reported relative enhancement of the responses to attended stimuli. Attending to sounds to perform sound localization vs. simple detection also has been shown to result in Auditory Attention enhanced firing of units in auditory cortex (Benson, Heinz, and Goldstein 1981). Selective ATTENTION may be defined as a process by which Auditory attention has been investigated extensively in the perception of certain stimuli in the environment is humans using event-related potentials (ERPs) and event- enhanced relative to other concurrent stimuli of lesser related magnetic fields (ERFs). These recordings can nonin- immediate priority. A classic auditory example of this phe- vasively track with high temporal resolution the brain activ- nomenon is the so-called cocktail party effect, wherein a ity associated with different types of stimulus events. By person can selectively listen to one particular speaker while analyzing changes in the ERPs or ERFs as a function of the tuning out several other simultaneous conversations. direction of attention, one can make inferences about the For many years, psychological theories of selective atten- timing, level of processing, and anatomical location of stim- tion were traditionally divided between those advocating ulus selection processes in the brain. early levels of stimulus selection and those advocating late In an early seminal ERP study, Hillyard et al. (1973) selection. 
Early selection theories held that there was an implemented an experimental analog of the cocktail party early filtering mechanism by which “channels” of irrelevant effect and demonstrated differential processing of attended input could be attenuated or even rejected from further pro- and unattended auditory stimuli at the level of the “N1” cessing based on some simple physical attribute (BROAD- wave at ~100 msec poststimulus. More recent ERP studies BENT 1970; Treisman 1969). In contrast, late selection furthering this approach have reported that focused auditory theories held that all stimuli are processed to the same con- selective attention can affect stimulus processing as early as siderable detail, which generally meant through completion 20 msec poststimulus (the “P20-50” effect; Woldorff et al. of perceptual analysis, before any selection due to attention 1987). Additional studies using ERPs (Woldorff and Hill- took place (Deutsch and Deutsch 1963). yard 1991) and using ERFs and source-analysis modeling Various neurophysiological studies have attempted to (Woldorff et al. 1993) indicated these electrophysiological shed light on both the validity of these theories and the neu- attentional effects occurred in and around primary auditory ral mechanisms that underlie auditory attention. One possi- cortex, had waveshapes that precisely took the form of an ble neural mechanism for early stimulus selection would be amplitude modulation of the early sensory-evoked compo- the attenuation or gating of irrelevant input at the early lev- nents, and were colocalized with the sources of these sen- els of the sensory pathways by means of descending modu- sory-evoked components. These results were interpreted as latory pathways (Hernandez-Peón, Scherrer, and Jouvet, providing strong evidence for the existence of an attention- 1956). 
For example, there is a descending pathway in the ally modulated, sensory gain control of the auditory input auditory system that parallels the ascending one all the way channels at or before the initial stages of cortical processing, out to the cochlea (Brodal 1981), and direct electrical stimu- thereby providing strong support for early selection atten- lation of this descending pathway at various levels, includ- tional theories that posit that stimulus input can be selected ing auditory cortex, can inhibit the responses of the afferent at levels considerably prior to the completion of perceptual auditory nerves to acoustic input. Other animal studies have analysis. Moreover, the very early onset latency of these indicated that stimulation of pathways from the frontal cor- attentional effects (20 ms) strongly suggests that this selec- tex and the mesencephalic reticular formation can modulate tion is probably accomplished by means of a top-down, pre- sensory transmission through the THALAMUS, thus provid- set biasing of the stimulus input channels. ing another mechanism by which higher brain centers might On the other hand, reliable effects of attention on the ear- modulate lower level processing during selective attention liest portion of the human auditory ERP reflecting auditory (Skinner and Yingling 1977). In addition, sensory process- nerve and brainstem-level processing have generally not ing activity in primary auditory CEREBRAL CORTEX or early been found (Picton and Hillyard 1974), thus providing no auditory association cortices could conceivably be directly evidence for peripheral filtering via the descending auditory modulated by “descending” pathways from still higher cor- pathway that terminates at the cochlea. Nevertheless, recent tical levels. 
research measuring a different type of physiological It has proven difficult, however, to demonstrate that any response—otoacoustic cochlear emissions—has provided of these possible mechanisms for sensory modulation are some evidence for such early filtering (Giard et al. 1991). actually used during auditory attention. Early animal studies Additional evidence that attention can affect early audi- purporting to show attenuation of irrelevant auditory input tory processing derives from studies of another ERP/ERF at the sensory periphery (Hernandez-Peón, Scherrer, and wave known as the mismatch negativity/mismatch field Jouvet 1956) were roundly criticized on methodological (MMN/MMF), which is elicited by deviant auditory stimuli grounds (Worden 1966). Nevertheless, there have been ani- in a series of identical stimuli. Because the MMN/MMF can mal studies providing evidence of some very early (i.e., be elicited in the absence of attention and by deviations in brainstem-level) modulation of auditory processing as a any of a number of auditory features, this wave was pro- function of attentional state or arousal (e.g., Oatman and posed to reflect a strong automaticity of the processing of Auditory Attention 51 auditory stimulus features (reviewed in Naatanen 1990 and ous hemodynamic imaging studies, the anterior cingulate is 1992). Both the MMN (Woldorff et al. 1991) and the MMF likely to be involved, as it is activated during a number of (Woldorff et al. 1998), however, can also be modulated by cognitive and/or executive functions (Posner et al. 1988). In attention, being greatly reduced when attention is strongly addition, human lesion studies suggest the prefrontal cortex focused elsewhere, thus providing converging evidence that is important for modulating the activity in the ipsilateral attention can influence early auditory sensory analysis. On auditory cortex during auditory attention (Knight et al. the other hand, the elicitation of at least some MMN/MMF 1981). 
It may be that some of the slower-frequency, endoge- for many different feature deviations in a strongly ignored nous ERP auditory attention effects reflect the activation of auditory channel has been interpreted as evidence that con- these areas as they serve to modulate or otherwise control siderable feature analysis is still performed even for unat- auditory processing. Whether these mechanisms actually tended auditory stimuli (Alho 1992). An intermediate view employ thalamic gating, some other modulatory mecha- that may accommodate these findings is that various aspects nism, or a combination, is not yet known. of early auditory sensory processing and feature analysis See also ATTENTION IN THE ANIMAL BRAIN; ATTEN- may be “partially” or “weakly” automatic, occurring even in TION IN THE HUMAN BRAIN; AUDITORY PHYSIOLOGY; the absence of attention but still subject to top-down atten- AUDITORY PLASTICITY; ELECTROPHYSIOLOGY, ELECTRIC tional modulation (Woldorff et al. 1991; Hackley 1993). AND MAGNETIC EVOKED FIELDS; TOP-DOWN PROCESSING Under this view, the very earliest stimulus processing (i.e., IN VISION peripheral and brainstem levels) tends to be strongly auto- matic, but at the initial cortical levels there is a transition —Marty G. Woldorff from strong to weak automaticity, wherein some amount of analysis is generally obligatory but is nevertheless modifi- References able by attention (reviewed in Hackley 1993). Alho, K. (1992). Selective attention in auditory processing as There are also various slower-frequency, longer-latency reflected in event-related brain potentials. Psychophysiology ERP auditory attention effects that are not modulations of 29: 247–263. early sensory activity, but rather appear to reflect “endoge- Benson, D. A., and R. D. Heinz. (1978). Single-unit activity in the nous,” additional activations from both auditory and nonau- auditory cortex of monkeys selectively attending left vs. 
right ditory association cortex (e.g., “processing negativity,” ear stimuli. Brain Research 159: 307–320. target-related “N2b,” “P300”). This type of activity occurs Benson, D. A., R. D. Heinz, and M. H. Goldstein, Jr. (1981). only or mainly for attended-channel stimuli or only for tar- Single-unit activity in the auditory cortex actively localizing get stimuli within an attended channel and might reflect sound sources: spatial tuning and behavioral dependency. Brain Research 219: 249–267. later selection, classification, or decision processes that Broadbent, D. E. (1970). Stimulus set and response set: Two kinds also occur during auditory attention (reviewed in Alho of selective attention. In D. I. Mostofsky, Ed., Attention: Con- 1992; Näätänen 1992). Attention to less discriminable fea- temporary Theory and Analysis. New York: Appleton-Century- tures of auditory stimuli (Hansen and Hillyard 1983) or to a Crofts, pp. 51–60. conjunction of auditory features (Woods et al. 1991) also Brodal, A. (1981). Neurological Anatomy. New York: Oxford Uni- produces longer-latency differential activation that may versity Press. reflect later selection processes. In addition, there is a Deutsch, J. A., and D. Deutsch. (1963). Attention: some theoretical build-up of endogenous brain electrical activity (a “DC considerations. Psychological Review 70: 80–90. shift”) as subjects begin to attend to a short stream of audi- Giard, M. H., L. Collet, P. Bouchet, and J. Pernier. (1994). Audi- tory stimuli (Hansen and Hillyard 1988), which could tory selective attention in the human cochlea. Brain Research 633: 353–356. reflect some sort of initiation of the controlling executive Hackley, S. A. (1993). An evaluation of the automaticity of sen- function. sory processing using event-related potentials and brain-stem In contrast to electrophysiological studies, relatively few reflexes. Psychophysiology 30: 415–428. hemodynamically-based functional neuroimaging studies Hansen, J. C., and S. A. 
Hillyard. (1983). Selective attention to have been directed at studying auditory attention in humans. multidimensional auditory stimuli. J. of Exp. Psychology: In a recent study using POSITRON EMISSION TOMOGRAPHY Human Perc. and Perf 9: 1–18. (PET), O’Leary et al. (1996) reported enhanced activity in Hansen, J. C., and S. A. Hillyard. (1988). Temporal dynamics of the auditory cortex contralateral to the direction of attention human auditory selective attention. Psychophysiology 25: 316– during a dichotic listening task. PET studies have also 329. shown that attention to different aspects of speech sounds Hernandez-Peón, R., H. Scherrer, and M. Jouvet. (1956). Modifi- cation of electrical activity in the cochlear nucleus during atten- (e.g., phonetics vs. pitch) can affect the relative activation of tion in unanesthetized cats. Science 123: 331–332. the two hemispheres (Zatorre, Evans, and Meyer 1992). In Hillyard, S. A., R. F. Hink, V. L. Schwent, and T. W. Picton. addition, functional MAGNETIC RESONANCE IMAGING has (1973). Electrical signs of selective attention in the human indicated that intermodal attention can modulate auditory brain. Science 182: 177–179. cortical processing (Woodruff et al. 1996). Knight, R. T., S. A. Hillyard, D. L. Woods, and H. J. Neville. Most neurophysiological studies of auditory attention in (1981). The effects of frontal cortex lesions on event-related humans have focused on the effects of attention on the pro- potentials during auditory selective attention. Electroenceph. cessing of sounds in auditory cortical areas. Less work has Clin. Neurophysiol. 52: 571–582. been directed toward elucidating the neural structures and Naatanen, R. (1990). The role of attention in auditory information mechanisms that control auditory attention. Based on vari- processing as revealed by event-related potentials and other 52 Auditory Physiology brain measures of cognitive function. Behavior and Brain Sci- Alho, K., K. Tottola, K. Reinikainen, M. 
Sams, and R. Naatanen. ence 13: 201–288. (1987). Brain mechanisms of selective listening reflected by Naatanen, R. (1992). Attention and Brain Function. Hillsdale, NJ: event-related potentials. Electroenceph. Clin. Neurophysiol. 49: Erlbaum. 458–470. Oatman, L. C., and B. W. Anderson. (1977). Effects of visual Arthur, D. L., P. S. Lewis, P. A. Medvick, and A. Flynn. (1991). A attention on tone-burst evoked auditory potentials. Experimen- neuromagnetic study of selective auditory attention. Electroen- tal Neurology 57: 200–211. ceph. Clin. Neurophysiol. 78: 348–360. O’Leary, D. S., N. C. Andreasen, R. R. Hurtig, R. D. Hichwa, G. L. Bregman, A. S. (1990). Auditory Scene Analysis: The Perceptual Watkins, L. L. B. Ponto, M. Rogers, and P. T. Kirchner. (1996). Organization of Sound. Cambridge, MA: MIT Press. A positron emission tomography study of binaurally- and Hackley, S. A., M. Woldorff, and S. A. Hillyard. (1987). Combined dichotically-presented stimuli: Effects of level of language and use of microreflexes and event-related brain potentials as mea- directed attention. Brain and Language 53: 20–39. sures of auditory selective attention. Psychophysiology 24: Picton, T. W., and S. A. Hillyard. (1974). Human auditory evoked 632–647. potentials: II. Effects of attention. Electroenceph. Clin. Neuro- Hackley, S. A., M. Woldorff, and S. A. Hillyard. (1990). Cross- physiol 36: 191–199. modal selective attention effects on retinal, myogenic, brain- Posner, M. I., S. E. Petersen, P. T. Fox, and M. E. Raichle. (1988). stem and cerebral evoked potentials. Psychophysiology 27: Localization of cognitive operations in the human brain. Sci- 195–208. ence 240: 1627–1631. Hansen, J. C., and S. A. Hillyard. (1980). Endogenous brain poten- Skinner, J. E., and C. D. Yingling. (1977). Central gating mecha- tials associated with selective auditory attention. Electroen- nisms that regulate event-related potentials and behavior. In J. ceph. Clin. Neurophysiol. 49: 277–290. E. 
Desmedt, Ed., Attention, Voluntary Contraction and Event- Johnston, W. A., and V. J. Dark. (1986). Selective attention. Related Cerebral Potentials. Progress in Clinical Neurophysiol- Annual Rev. of Psychol. 37: 43–75. ogy, vol. 1. New York: S. Karger, pp. 30–69. Okita, T. (1979). Event-related potentials and selective attention to Treisman, A. (1969). Stategies and models of selective attention. auditory stimuli varying in pitch and localization. Biological Psych. Review 76: 282–299. Psychology 9: 271–284. Woldorff, M. G., C. C. Gallen, S. A. Hampson, S. A. Hillyard, C. Rif, J., R. Hari, M. S. Hamalainen, and M. Sams. (1991). Auditory Pantev, D. Sobel, and F. E. Bloom. (1993). Modulation of early attention affects two different areas in the human supratemporal sensory processing in human auditory cortex during auditory cortex. Electroenceph. Clin. Neurophysiol 79: 464–472. selective attention. Proc. Natl. Acad. Sci. 90: 8722–8726. Roland, P. E. (1982). Cortical regulation of selective attention in Woldorff, M., S. A. Hackley, and S. A. Hillyard. (1991). The man: A regional blood flow study. Journal of Neurophysiology effects of channel-selective attention on the mismatch negativ- 48: 1059–1078. ity wave elicited by deviant tones. Psychophysiology 28: 30–42. Trejo, L. J., D. L. Ryan-Jones, and A. F. Kramer. (1995). Atten- Woldorff M., J. C. Hansen, and S. A. Hillyard. (1987). Evidence tional modulation of the mismatch negativity elicited by fre- for effects of selective attention in the mid-latency range of the quency differences between binaurally presented tone bursts. human auditory event-related potential. In R. Johnson, Jr., R. Psychophysiology 32: 319–328. Parasuraman, and J. W. Rohrbaugh, Eds., Current Trends in Woods, D. L., K. Alho, and A. Algazi. (1994). Stages of auditory Event-Related Potential Research (EEGJ Suppl. 40). Amster- feature conjunction: an event-related brain potential study. J. of dam: Elsevier, pp. 146–54. Exper. Psychology: Human Perc. 
and Perf 20: 81–94. Woldorff, M. G., and S. A. Hillyard. (1991). Modulation of early auditory processing during selective listening to rapidly pre- Auditory Physiology sented tones. Electroenceph. and Clin. Neurophysiology 79: 170–191. Woldorff, M. G., S. A. Hillyard, C. C. Gallen, S. A. Hampson, and The two main functions of hearing lie in auditory communi- F. E. Bloom. (1998). Magnetoencephalographic recordings cation and in the localization of sounds. Auditory physiol- demonstrate attentional modulation of mismatch-related neural activity in human auditory cortex. Psychophysiology 35: 283– ogy tries to understand the perception, storage, and 292. recognition of various types of sounds for both purposes in Woodruff, P. W., R. R. Benson, P. A. Bandetinni, K. K. Kwong, R. terms of neural activity patterns in the auditory pathways. J. Howard, T. Talavage, J. Belliveua, and B. R. Rosen. (1996). The following article will try to analyze what auditory rep- Modulation of auditory and visual cortex by selective attention resentations may have in common with other sensory sys- is modality-dependent. Neuroreport 7: 1909–1913. tems, such as the visual system (see VISUAL ANATOMY AND Woods, D. L., K. Alho, and A. Algazi. (1991). Brain potential PHYSIOLOGY), and what may be special about them. signs of feature processing during auditory selective attention. Since the days of HELMHOLTZ (1885) the auditory system Neuroreport 2: 189–192. has been considered to function primarily as a frequency Worden, F. G. (1966). Attention and auditory electrophysiology. In analyzer. According to von Békésy’s work (1960), which F. Stellar and J. M. Sprague, Eds., Progress in Physiological Psychology. New York: Academic Press, pp. 45–116. was awarded the Nobel Prize in 1961, sound reaching the Zatorre, R. J., A. C. Evans, and E. Meyer. (1992). Lateralization of tympanic membrane generates a traveling wave along the phonetic and pitch discrimination in speech processing. 
Science basilar membrane in the cochlea of the inner ear. Depending 256: 846–849. on the frequency of the sound, the traveling wave achieves maximum amplitude in different locations. Thus frequency Further Readings gets translated into a place code, with high frequencies rep- resented near the base and low frequencies near the apex of Alain, C., and D. L. Woods. (1994). Signal clustering modulates the cochlea. Although the traveling wave has a rather broad auditory cortical activity in humans. Perception and Psycho- peak, various synergistic resonance mechanisms assure physics 56: 501–516. Auditory Physiology 53 effective stimulation of the cochlear hair cells at very pre- Psychophysical studies have indeed provided evidence cise locations. for the existence of neural mechanisms tuned to the rate and Electrophysiological studies using tones of a single direction of FM glides (Liberman et al. 1967, Kay 1982) as frequency (pure tones) led to a multitude of valuable data well as to specific bands of noise (Zwicker 1970). Neuro- on the responses of neurons to such stimuli and to the rec- physiologically, neurons selective to the same parameters ognition of tonotopic organization in the auditory path- have been identified in the auditory cortex of various spe- ways. Tonotopy, the neural representation of tones of best cies. Most notably, a large proportion of FM selective neu- frequency in a topographic map, is analogous to retino- rons as well as neurons tuned to certain bandwidths have topy in the visual and somatotopy in the somatosensory recently been found in the lateral belt areas of the superior system. The map is preserved by maintaining neighbor- temporal gyrus (STG) in rhesus monkeys (Rauschecker, hood relationships between best frequencies from the Tian, and Hauser 1995). 
The posterior STG region has also cochlea and auditory nerve through the initial stages of been found to contain mechanisms selective for phoneme the central auditory system, such as cochlear nuclei, infe- identification in humans, using functional neuroimaging rior colliculus and medial geniculate nucleus, to primary techniques (see PHONOLOGY, NEURAL BASIS OF). auditory cortex (A1; fig. 1). The standard assumption in Many neurons in the lateral belt or STG region of rhesus pure-tone studies is that in order to understand stimulus monkeys (fig. 1B) also respond well and quite selectively to coding at each subsequent level, one has to completely the monkey calls themselves. The question arises by what analyze the lower levels and then establish the transfor- neural mechanisms such selectivity is generated. Studies in mations taking place from one level to the next (Kiang which monkey calls are dissected into their constituent ele- 1965). While this approach sounds logical, it assumes that ments (both in the spectral and temporal domains), and the the system is linear, which cannot always be taken for elements are played to the neurons separately or in combi- granted. Another problem that this theory has not solved nation can provide an answer to this question (Rauschecker is how information from different frequency channels gets 1998). A sizable proportion of neurons in the STG (but not integrated, that is, how complex sounds are analyzed by in A1) responds much better to the whole call than to any of the auditory system. the elements. These results are indicative of nonlinear sum- The use of complex sound stimuli, therefore, is of the mation in the frequency and time domain playing a crucial essence in the analysis of higher auditory pathways. This has role in the generation of selectivity for specific types of been done successfully in a number of specialized systems, calls. 
Coincidence detection in the time domain is perhaps such as frogs, songbirds, owls, and bats. For all these species, the most important mechanism in shaping this selectivity. a neuroethological approach has been adopted based on func- Temporal integration acts over several tens (or hundreds) of tional-behavioral data (Capranica 1972; see also ANIMAL milliseconds, as most “syllables” in monkey calls (as well COMMUNICATION, ECHOLOCATION, and ETHOLOGY). The as in human speech) are of that duration. same approach has been used only sparingly in higher mam- There is some limited evidence for a columnar or patchy mals, including primates (Winter and Funkenstein 1973). representation of specific types of monkey calls in the lat- The neurophysiological basis in humans for processing eral belt areas. Rhesus calls can be subdivided into three complex sounds, such as speech (see SPEECH PERCEPTION), coarse classes: tonal, harmonic, and noisy calls. Neurons cannot be studied directly with invasive methods. Therefore, responsive to one or another category are often found animal models (e.g., nonhuman primates) have to be used. grouped together. It would be interesting to look for an The question then arises to what extent human speech sounds orderly “phonetic map” of the constituent elements them- can be applied validly as stimuli for the study of neurons in a selves, whereby interactions in two-dimensional arrays of different species. From a biological-evolutionary vantage time and frequency might be expected. point, it is more meaningful to employ the types of complex It is very likely that the lateral belt areas are not yet the sounds that are used for communication in those same species ultimate stage in the processing of communication sounds. (see ANIMAL COMMUNICATION). 
In using conspecific vocal- They may just present an intermediate stage, similar to V4 izations we can be confident that the central auditory system in the visual system, which also contains neurons selective of the studied species must be capable of processing these for the size of visual stimuli. Such size selectivity is obvi- calls. By contrast, human speech sounds may not be pro- ously of great importance for the encoding of visual patterns cessed in the same way by that species. or objects, but the differentiation into neurons selective for When comparing human speech sounds with communi- even more specific patterns, such as faces, is not accom- cation sound systems in other species it is plain to see that plished until an even higher processing stage, namely, the most systems have certain components in common, which inferotemporal cortex (Desimone 1991; see also FACE REC- are used as carriers of (semantic) information. Among these OGNITION). In the auditory cortex, areas in the anterior or DISTINCTIVE FEATURES are segments of frequency changing lateral parts of the STG or in the dorsal STS may be target over time (FM sweeps or “glides”) and bandpass noise areas for the exploration of call-specific neurons. bursts with specific center frequencies and bandwidths (fig. The second main task of hearing is to localize sound 2). Such universal elements of auditory communication sig- sources in space. Because the auditory periphery does not a nals can be used as stimuli with a degree of complexity that priori possess a two-dimensional quality, as do the visual and is intermediate between the pure tones used in traditional somatosensory peripheries, auditory space has to be com- auditory physiology and the whole signal whose representa- puted from attributes of sound that vary systematically with tion one really wants to understand. spatial location and are thus processed differentially by the 54 Auditory Physiology Figure 1. 
Schematic illustration of the major structures and pathways in the auditory system of higher mammals. (A) Pathways up to the level of primary auditory cortex (from Journal of NIH Research 9 [October 1997], with permission). (B) Cortical processing pathways in audition (from Rauschecker 1998b, Current Opinion in Neurobiology 8: 516–521, with permission). Auditory Physiology 55 Figure 2. Sound spectrograms human speech samples (A) and monkey calls (B) illustrating the common occurrence of FM glides and band- pass noise bursts in vocalizations from both species. central auditory system. This problem is logistically similar the brainstem, such as the superior olivary complex (Irvine to the computation of 3-D information from two-dimensional 1992). In addition, the spectral composition of sound arriving sensory information in the visual system. Sound attributes at the two ears varies with position due to the spectral filter most commonly assigned to spatial quality are differences characteristics of the external ears (pinnae) and the head. between sound arriving at the two ears. Both the intensity and Even monaurally, specific spectral “fingerprints” can be the time of arrival of sound originating from the same source assigned to spatial location, with attenuation of particular fre- differ when the sound source is located outside the median quency bands (“spectral notches”) varying systematically plane. Interaural time and intensity differences (ITD and IID, with azimuth or elevation (Blauert 1996). Neurons in the dor- respectively) are registered and mapped already in areas of sal cochlear nuclei are tuned to such spectral notches and may 56 Auditory Plasticity thus be involved in extracting spatial information from com- Liberman, A. M., F. S. Cooper, D. P. Shankweiler, and M. Studdert- Kennedy. (1967). Perception of the speech code. Psychological plex sounds (Young et al. 1992). Review 74: 431–461. The information computed by these lower brainstem Middlebrooks, J. 
C., A. E. Clock, L. Xu, and D. M. Green. (1994). structures is used by higher centers of the midbrain, such as A panoramic code for sound location by cortical neurons. Sci- the inferior and superior colliculi, to guide orienting move- ence 264: 842–844. ments toward sounds. For more “conscious” spatial percep- Mishkin, M., L. G. Ungerleider, and K. A. Macko. (1983). Object tion in higher mammals, including humans, auditory cortex vision and spatial vision: two cortical pathways. Trends in Neu- seems to be indispensable, as cortical lesions almost com- rosciences 6: 414–417. pletely abolish the ability to judge the direction of sound in Rauschecker, J. P. (1998a). Parallel processing in the auditory cor- space. Neurons in the primary auditory cortex of cats show tex of primates. Audiology and Neurootology 3: 86–103. tuning to the spatial location of a sound presented in free Rauschecker, J. P. (1998b). Cortical processing of complex sounds. Current Opinion in Neurobiology 8: 516–521. field (Imig, Irons, and Samson 1990). Most recently, an Rauschecker, J. P., B. Tian, and M. Hauser. (1995). Processing of area in the anterior ectosylvian sulcus (AES), which is part complex sounds in the macaque nonprimary auditory cortex. of the cat’s parietal cortex, has been postulated to be cru- Science 268: 111–114. cially involved in sound localization (Korte and Rausch- Winter, P., and H. H. Funkenstein. (1973). The effects of species- ecker 1993, Middlebrooks et al. 1994). Functional specific vocalization on the discharge of auditory cortical cells neuroimaging studies in humans also demonstrate specific in the awake squirrel monkey (Saimiri sciureus). Experimental activation in the posterior parietal cortex of the right hemi- Brain Research 18: 489–504. sphere by virtual auditory space stimuli (Rauschecker Young, E. D., G. A. Spirou, J. J. Rice, and H. F. Voigt. (1992). 1998a,b). 
Neural organization and responses to complex stimuli in the Both animal and human studies suggest, therefore, that dorsal cochlear nucleus. Philosophical Transactions of the Royal Society Lond B 336(1278): 407–413. information about auditory patterns or objects gets pro- Zwicker, E. (1970). Masking and psychological excitation as con- cessed, among others, in the superior temporal gyrus (STG). sequences of the ear’s frequency analysis. In R. Plomp and G. By contrast, auditory spatial information seems to get pro- F. Smoorenburg, Eds., Frequency Analysis and Periodicity cessed in parietal regions of cortex (fig. 1B). This dual pro- Detection in Hearing. Leiden: Sijthoff, pp. 376–394. cessing scheme is reminiscent of the visual pathways, where a ventral stream has been postulated for the processing of Further Readings visual object information and a dorsal stream for the process- ing of visual space and motion (Mishkin, Ungerleider, and Hauser, M. D. (1996). The Evolution of Communication. Cam- Macko 1983; see also VISUAL PROCESSING STREAMS). bridge, MA: MIT Press. Kaas, J. H., and T. A. Hackett. (1998). Subdivisions of auditory See also AUDITION; AUDITORY PLASTICITY; AUDITORY cortex and levels of processing in primates. Audiology Neuro- ATTENTION; PHONOLOGY, NEURAL BASIS; SPEECH PERCEP- otology 3: 73–85 TION; SINGLE-NEURON RECORDING Konishi, M., et al. (1998). Neurophysiological and anatomical sub- strates of sound localization in the owl. In Edelman, G. M., W. —Josef P. Rauschecker E. Gall, and W. M. Cowan, Eds., Auditory Function: Neurobio- logical Bases of Hearing. New York: Wiley, pp. 721–745. References Merzenich, M. M., and J. F. Brugge. (1973). Representation of the cochlear partition on the superior temporal plane of the Békésy, G. von. (1960). Experiments in Hearing. New York: macaque monkey. Brain Research 50: 275–296. McGraw-Hill. Morel, A., P. E. Garraghty, and J. H. Kaas. (1993). Tonotopic orga- Blauert, J. (1996). Spatial Hearing. 2d ed. 
Cambridge, MA: MIT nization, architectonic fields, and connections of auditory cor- Press. tex in macaque monkeys. Journal of Comparative Neurology Capranica, R. R. (1972). Why auditory neurophysiologists should 335: 437–459. be more interested in animal sound communication. Physiolo- Pandya, D. N., and F. Sanides. (1972). Architectonic parcellation gist 15: 55–60. of the temporal operculum in rhesus monkey and its projection Desimone, R. (1991). Face-selective cells in the temporal cortex of pattern. Zeitschrift für Anatomie und Entwiklungs-Geschichte monkeys. Journal of Cognitive Neuroscience 3: 1–8. 139: 127–161. Helmholtz, H. von. (1885). On the Sensation of Tones. Reprinted Peters, A., and E. G. Jones (1985). Cerebral Cortex vol. 4, Associ- 1954. New York: Dover Publications. ation and Auditory Cortices. New York: Plenum. Imig, T. J., W. A. Irons, and F. R. Samson. (1990). Single-unit Rauschecker, J. P., and P. Marler. (1987). Imprinting and Cortical selectivity to azimuthal direction and sound pressure level of Plasticity: Comparative Aspects of Sensitive Periods. New noise bursts in cat high-frequency primary auditory cortex. York: Wiley. Journal of Neurophysiology 63: 1448–1466. Suga, N. (1992). Philosophy and stimulus design for neuroethology of Irvine, D. (1992). Auditory brainstem processing. In A. N. Popper complex-sound processing. Phil Trans R Soc Lond 336: 423–428. and R. R. Fay, Eds., The Mammalian Auditory Pathway: Neuro- Woolsey, C. M. (1982). Cortical Sensory Organization. vol. 3, physiology. New York: Springer, pp. 153–231. Multiple Auditory Areas. New Jersey: Humana Press. Kay, R. H. (1982). Hearing of modulation in sounds. Physiological Reviews 62: 894–975. Auditory Plasticity Kiang, N. Y-S. (1965). Stimulus coding in the auditory nerve and cochlear nucleus. Acta Otolaryngologica 59: 186–200. Korte, M., and J. P. Rauschecker. (1993). 
Auditory spatial tuning The perception of acoustic stimuli can be altered as a conse- of cortical neurons is sharpened in cats with early blindness. quence of age, experience, and injury. A lasting change in Journal of Neurophysiology 70: 1717–1721. Auditory Plasticity 57 either the perception of acoustic stimuli or in the responses A different experimental strategy is to observe the effects of neurons to acoustic stimuli is known as auditory plastic- of the representation of frequencies across a wide region of ity, a form of NEURAL PLASTICITY. This plasticity can be the cerebral cortex. For example, if a limited region of the demonstrated at both the perceptual and neuronal levels hair cells in the cochlea are destroyed, there is an expansion through behavioral methods such as operant and classical of the representation of the neighboring spared frequencies CONDITIONING or by lesions of the auditory periphery both in the auditory cortex (Robertson and Irvine 1989; Rajan et in the adult and during development. The plasticity of neu- al. 1993). These results indicate that the cerebral cortex is ronal responses in the auditory system presumably reflects able to adjust the representation of different frequencies the ability of humans and animals to adjust their auditory depending on the nature of the input. This same type of cor- perceptions to match the perceptual world around them as tical reorganization probably occurs in normal humans dur- defined by the other sensory modalities, and to perceive the ing the progressive loss of high frequency hair cells in the different acoustic-phonetic patterns of the native lan- cochlea with aging. guages(s) learned during development (see Kuhl 1993). 
Cortical reorganization can also occur following a period There are several examples within the psychophysical lit- of operant conditioning, where the perception of acoustic erature where human subjects can improve their perfor- stimuli is improved over time, similar to the studies on human mance at making specific judgments of acoustic stimulus subjects described above. Monkeys trained at a frequency dis- features over the course of several days to weeks, presum- crimination task show a continual improvement in perfor- ably due to changes in the cortical representation of the rele- mance during several weeks of daily training. After training, vant stimulus parameters. For example, it has recently been the area within the primary auditory cortex that is most sensi- shown that normal human subjects will improve their per- tive to the trained frequencies is determined. This area of rep- formance at a temporal processing task over the course of resentation is the greatest in the monkeys trained to several days of training (Wright et al. 1997). discriminate that frequency when compared to the representa- Training-induced changes in perceptual ability have tion of untrained monkeys. This occurs regardless of what recently been tested as a treatment strategy for children with particular frequency the monkey was trained to discriminate, language-based learning disabilities (cf. LANGUAGE IMPAIR- but it occurs only at those trained frequencies, and no others. MENT, DEVELOPMENTAL). It had been suggested that children The cortical area of the representation of these trained and with this type of learning disability are unable to determine untrained frequencies is correlated with the behavioral ability the order of two rapidly presented sounds (Tallal and Piercy (Recanzone, Schreiner, and Merzenich 1993). A further find- 1973). 
Recent studies have demonstrated that some children ing was that only animals that attended to and discriminated with this type of learning disability can make such discrimi- the acoustic stimuli showed any change in the cortical repre- nations following several weeks of practice (Merzenich et al. sentations; monkeys stimulated in the same manner while 1996), and these children also showed significant improve- engaged at an unrelated task had normal representations of ments in language comprehension (Tallal et al. 1996). the presented frequencies. Thus, stimulus relevance is impor- Several approaches in experimental animals have been tant in both operant and classical conditioning paradigms. employed to address the neuronal mechanisms that underlie Auditory plasticity also occurs during development, auditory plasticity. Single auditory neurons have been shown which has been investigated by taking advantage of the nat- to change their response properties following classical con- ural orienting behavior of barn owls. These birds can locate ditioning paradigms. As an example, SINGLE-NEURON the source of a sound extremely accurately. Rearing young owls with optically displacing prisms results in a shift in the RECORDING techniques define the response of a single neu- owl’s perception of the source of an acoustic stimulus ron to a range of frequencies, and then a tone that was not (Knudsen and Knudsen 1989). This shift can also be dem- optimal in exciting the neuron (the conditioned stimulus, onstrated electrophysiologically as an alignment of the CS) is paired with an unconditioned stimulus (US, e.g., a visual and displaced auditory receptive fields of neurons in mild electrical shock). When the frequency response profile the optic tectum (Knudsen and Brainard 1991). 
of the neuron is defined after conditioning, the response of The converging neuronal data from experimental ani- the neuron to the conditioned tone can be much larger, in mals suggest that similar changes in response properties of some cases to the point that the paired tone is now the best cortical and subcortical neurons also occur in humans. The stimulus at exciting the neuron. These changes require a improvements in performance of human subjects in audi- pairing of the CS and the US (Bakin and Weinberger 1990). tory discrimination tasks, the normal high frequency hear- This response plasticity has been demonstrated in both the ing loss during aging, the change from “language-general” auditory THALAMUS and auditory divisions of the CEREBRAL to “language-specific” processing of phonetic information CORTEX (Ryugo and Weinberger 1978; Diamond and Wein- during language acquisition, and injuries to the cochlea or berger 1986; Bakin and Weinberger 1990). central auditory structures, are presumably resulting in It has also been demonstrated that the modulatory neu- changes in single neuron responses and in cortical and sub- rotransmitter acetylcholine is an important contributor to cortical representations. It is quite likely that neuronal plas- this effect. If the acetylcholine receptor is blocked, there is ticity across other sensory modalities and other cognitive no change in the response properties of the neurons (Mc- functions, particularly in the cerebral cortex, underlies the Kenna et al. 1989). Similarly, activation of the acetylcholine ability of humans and other mammals to adapt to a chang- receptor produces a similar enhancement of the neuronal ing environment and acquire new skills and behaviors response to the conditioned stimulus (Metherate and Wein- throughout life. berger 1990). 58 Autism See also Eds., Dynamic Aspects of Neocortical Function. New York: AUDITION; AUDITORY ATTENTION; CONDITION- Wiley, pp. 375–396. 
ING AND THE BRAIN; PHONOLOGY, NEURAL BASIS OF; Knudsen, E. I., and M. S. Brainard. (1995). Creating a unified rep- SPEECH PERCEPTION resentation of visual and auditory space in the brain. Ann. Rev. —Gregg Recanzone Neurosci. 18: 19–43. Merzenich, M. M., C. Schreiner, W. Jenkins, and X. Wang. (1993). Neural mechanisms underlying temporal integration, segmenta- References tion, and input sequence representation: some implications for Bakin, J. S., and N. M. Weinberger. (1990). Classical conditioning the origin of learning disabilities. Annal New York Acad. Sci. induces CS-specific receptive field plasticity in the auditory 682: 1–22. cortex of the guinea pig. Brain Res. 536: 271–286. Neville, H. J., S. A. Coffey, D. S. Lawson, A. Fischer, K. Emmo- Diamond, D. M., and N. M. Weinberger. (1986). Classical condi- rey, and U. Bellugi. (1997). Neural systems mediating Ameri- tioning rapidly induces specific changes in frequency receptive can sign language: Effects of sensory experience and age of fields of single neurons in secondary and ventral ectosylvian acquisition. Brain and Language 57: 285–308. auditory cortical fields. Brain Res. 372: 357–360. Recanzone, G. H. (1993). Dynamic changes in the functional orga- Knudsen, E. I., and M. S. Brainard. (1991). Visual instruction of nization of the cerebral cortex are correlated with changes in the neural map of auditory space in the developing optic tec- psychophysically measured perceptual acuity. Biomed. Res. 14, tum. Science 253: 85–87. Suppl. 4: 61–69. Knudsen, E. I., and P. F. Knudsen. (1989). Vision calibrates sound Weinberger, N. M. (1995). Dynamic regulation of receptive fields localization in developing barn owls. J. Neurosci. 9: 3306–3313. and maps in the adult sensory cortex. Ann. Rev. Neurosci. 18: Kuhl, P. K. (1993). Developmental speech perception: implications 129–158. for models of language impairment. Annal New York Acad. Sci. Weinberger, N. M., J. H. Ashe, R. Metherate, T. M. McKenna, D. 682: 248–263. M. 
Diamond, and J. S. Bakin. (1990). Retuning auditory cortex McKenna, T. M., J. H. Ashe, and N. M. Weinberger. (1989). Cho- by learning: A preliminary model of receptive field plasticity. linergic modulation of frequency receptive fields in auditory Concepts Neurosci. 1: 91–131. cortex: 1. Frequency-specific effects of muscarinic agonists. Synapse 4: 30–43. Autism Merzenich, M. M., W. M. Jenkins, P. Johnston, C. Schreiner, S. L. Miller, and P. Tallal. (1996). Temporal processing deficits of language-learning impaired children ameliorated by training. A developmental disorder of the brain, autism exists from Science 271: 77–81. birth and persists throughout life. The etiology of the disor- Metherate, R., and N. M. Weinberger. (1990). Cholinergic modula- der is still unknown, but is believed to be largely genetic, tion of responses to single tones produces tone-specific recep- tive field alterations in cat auditory cortex. Synapse 6: 133–145. while different organic factors have been implicated in a Rajan, R., D. R. Irvine, L. Z. Wise, and P. Heil. (1993). Effect of substantial proportion of cases (for reviews see Ciaranello unilateral partial cochlear lesions in adult cats on the represen- and Ciaranello 1995; Bailey, Phillips, and Rutter 1996). tation of lesioned and unlesioned cochleas in primary auditory Autism was identified and labeled by Kanner (1943) and cortex. J. Comp. Neurol. 338: 17–49. Asperger (1944). Recanzone, G. H., C. E. Schreiner, and M. M. Merzenich. (1993). The diagnosis of autism is based on behavioral criteria. Plasticity in the frequency representation of primary auditory The chief criteria as set out in ICD-10 (WHO 1992) and in cortex following discrimination training in adult owl monkeys. DSM-IV (APA 1994) include: abnormalities of social inter- Journal of Neuroscience 13: 87–103. action, abnormalities of verbal and nonverbal communica- Robertson, D., and D. R. Irvine. (1989). 
Plasticity of frequency tion, and a restricted repertoire of interests and activities. organization in auditory cortex of guinea pigs with partial uni- lateral deafness. J. Comp. Neurol. 282: 456–471. Behavior suggestive of these impairments can already be Ryugo, D. K., and N. M. Weinberger. (1978). Differential plastic- discerned in infancy. A recent screening instrument, based ity of morphologically distinct neuron populations in the medi- on a cognitive account of autism, appears to be remarkably cal geniculate body of the cat during classical conditioning. successful at eighteen months, involving failure of gaze Behav. Biol. 22: 275–301. monitoring, protodeclarative pointing, and pretend play Tallal, P., S. L. Miller, G. Bedi, G. Byma, X. Wang, S. S. Nagara- (Baron-Cohen et al. 1996). These appear to be the first clear jan, C. Schreiner, W. M. Jenkins, and M. M. Merzenich. (1996). behavioral manifestations of the disorder. Contrary to popu- Language comprehension in language-learning impaired chil- lar belief, failure of bonding or attachment is not a distin- dren improved with acoustically modified speech. Science 271: guishing characteristic of autism. 81–84. The autistic spectrum refers to the wide individual varia- Tallal, P., and M. Piercy. (1973). Defects of non-verbal auditory perception in children with developmental aphasia. Nature 241: tion of symptoms from mild to severe. Behavior not only 468–469. varies with age and ability, but is also modified by a multi- Wright, B. A., D. V. Buonomano, H. W. Mahncke, and M. M. tude of environmental factors. For this reason, one of the Merzenich. (1997). Learning and generalization of auditory major problems with behaviorally defined developmental temporal-interval discrimination in humans. J. Neurosci. 17: disorders is how to identify primary, associated, and second- 3956–3963. ary features. 
Three highly correlated features, namely char- acteristic impairments in socialization, communication, and Further Readings imagination, were identified in a geographically defined population study (Wing and Gould 1979). These impair- Knudsen, E. I. (1984). Synthesis of a neural map of auditory space ments appear to persist in development even though their in the owl. In G. M. Edelman, W. M. Cowan, and W. E. Gall, Autism 59 outward manifestation is subject to change. For example, a confirmed in a number of studies (see chapters in Baron- socially aloof child may at a later age become socially inter- Cohen, Tager-Flusberg, and Cohen 1993) and has become ested and show “pestering” behavior; a child with initially known as the THEORY OF MIND deficit. Most individuals little speech may become verbose with stilted, pedantic lan- with autism fail to appreciate the role of mental states in the guage. The triad of impairments appears to be a common explanation and prediction of everyday behavior, including denominator throughout a spectrum of autistic disorders deception, joint attention, and those emotional states which (Wing 1996). depend on monitoring other people’s attitudes, for example The prevalence of autistic disorder has been studied in a pride (Kasari et al. 1993). The brain basis for the critical number of different countries, and is between 0.16 and 0.22 cognitive ability that enables a theory of mind to develop percent, taking into account the most recent estimates. has begun to be investigated by means of functional brain Males predominate at approximately 3 to 1, and this ratio imaging (Fletcher et al. 1995; Happé et al. 1996). Other becomes more extreme with higher levels of ability. The explanations of social communication impairments in prevalence of a milder variant of autism, Asperger syn- autism have emphasized a primary emotional deficit in drome, is estimated as between 0.3 and 0.7 percent of the INTERSUBJECTIVITY (Hobson 1993). 
general population on the basis of preliminary findings. The nonsocial features of autism, in particular those These individuals are sometimes thought to be merely encompassed by the diagnostic sign restricted repertoire of eccentric and may not be diagnosed until late childhood or interests, are currently tackled by two cognitive theories. even adulthood. Because they have fluent language and nor- The first proposes a deficit in executive functions. These mal, if not superior verbal IQ, they can compensate to some include planning and initiation of action and impulse con- extent for their problems in social communication. trol, and are thought to depend on intact prefrontal cortex. MENTAL RETARDATION, a sign of congenital brain abnor- Evidence for poor performance on many “frontal” tasks in mality, is one of the most strongly associated features of autism is robust (Ozonoff, Pennington, and Rogers 1991; autism; IQ is below 70 in about half the cases, and below 80 Pennington and Ozonoff 1996). For instance, individuals in three quarters. Epilepsy is present in about a third of indi- with autism often fail to inhibit prepotent responses and to viduals, while other neurological and neuropsychological shift response categories (Hughes, Russell, and Robbins signs are almost always detectable (for reviews see Gillberg 1994). Poor performance on these tasks appears to be and Coleman 1992). Postmortem brain studies have shown a related to stereotyped and perseverative behavior in every- number of abnormalities in cell structure in different parts day life. The site of brain abnormality need not necessarily of the brain, including temporal and parietal lobes, and in be in prefrontal cortex, but could be at different points in a particular, limbic structures, as well as the CEREBELLUM. distributed system underlying executive functions, for Findings indicate a curtailment of neuronal development at example the dopamine system (Damasio and Maurer 1978). 
or before thirty weeks of gestation (Bauman and Kemper A second cognitive theory that attempts to address islets 1994). No consistent and specific structural or metabolic of ability and special talents that are present in a significant abnormalities have as yet been revealed, but overall brain proportion of autistic individuals is the theory of weak cen- volume and weight tend to be increased. tral coherence (Frith and Happé 1994). This theory proposes A genetic basis for autism is strongly indicated from twin that the observed performance peaks in tests such as block and family studies favoring a multiplicative multilocus design and embedded figures, and the savant syndrome, model of inheritance, perhaps involving only a small num- shown for instance in outstanding feats of memory or ber of genes (reviewed by Bailey et al. 1995). There is evi- exceptional drawing, are due to a cognitive processing style dence for a broader cognitive phenotype with normal that favors segmental over holistic processing. Some evi- intelligence and varying degrees of social and communica- dence exists that people with autism process information in tion impairments which may be shared by family members. an unusually piecemeal fashion (e.g., start a drawing from Other disorders of known biological origin, such as fragile an unusual detail). Likewise, they fail to integrate informa- X-syndrome, phenylketonuria, tuberous sclerosis, can lead tion so as to derive contextually relevant meaning. For to the clinical picture of autism in conjunction with severe instance, when reading aloud “the dog was on a long lead,” mental retardation (Smalley, Asarnow, and Spence 1988). they may pronounce the word lead as led. There is no known medical treatment. 
However, special Clearly, the explanation of autism will only be complete education and treatment based on behavior management and when the necessary causal links have been traced between modification often have beneficial effects (see chapters in gene, brain, mind and behavior. This is as yet a task for the Schopler and Mesibov 1995). Whatever the treatment, the future. developmental progress of children with autism is quite See also COGNITIVE DEVELOPMENT; FOLK PSYCHOLOGY; variable. MODULARITY OF MIND; NEUROTRANSMITTERS; PROPOSI- Cognitive explanations of the core features of autism TIONAL ATTITUDES; SOCIAL COGNITION provide a vital interface between brain and behavior. The —Uta Frith proposal of a specific neurologically based problem in understanding minds was a significant step in this endeavor. References The hypothesis that autistic children lack the intuitive understanding that people have mental states was originally American Psychiatric Association. (1994). Diagnostic and Statisti- tested with the Sally-Ann false belief paradigm (Baron- cal Manual of Mental Disorders (DSM-IV). Fourth edition. Cohen, Leslie, and Frith 1985). This impairment has been Washington, DC: American Psychiatric Association. 60 Autocatalysis Asperger, H. (1944). “Autistic psychopathy” in childhood. In U. and classification. Journal of Autism and Developmental Disor- Frith, Ed., Autism and Asperger Syndrome. Translated and ders 9: 11–29. annotated by U. Frith. Cambridge: Cambridge University Press. World Health Organization. (1992). The ICD-10 Classification of Bailey, A., A. Le Couteur, I. Gottesman, P. Bolton, E. Simonoff, E. Mental and Behavioral Disorders. Geneva: World Health Orga- Yuzda, and M. Rutter. (1995). Autism as a strongly genetic dis- nization. order: evidence from a British twin study. Psychological Medi- cine 25: 63–78. Further Readings Bailey, A. J., W. Phillips, and M. Rutter. (1996). Autism: integrat- Frith, U. (1989). Autism: Explaining the Enigma. 
Oxford: Black- ing clinical, genetic, neuropsychological, and neurobiological well. perspectives. Journal of Child Psychology and Psychiatry 37: Happé, F. (1994). Autism: An Introduction to Psychological The- 89–126. ory. London: UCL Press. Baron-Cohen, S. (1995). Mindblindness: An Essay on Autism and Happé, F., and U. Frith. (1996). The neuropsychology of autism. Theory of Mind. Cambridge, MA: MIT Press. Brain 119: 1377–1400. Baron-Cohen, S., A. Cox, G. Baird, and J. Swettenham. (1996). Russell, J., Ed. (1998). Autism as an Executive Disorder. Oxford: Psychological markers in the detection of autism in infancy in a Oxford University Press. large population. British Journal of Psychiatry 168: 158–163. Sigman, M., and L. Capps. (1997). Children with Autism. A Devel- Baron-Cohen, S., H. Tager-Flusberg, and D. J. Cohen, Eds. (1993). opmental Perspective. Cambridge, MA: Harvard University Understanding Other Minds: Perspectives from Autism. Press. Oxford: Oxford University Press. Baron-Cohen, S., A. Leslie, and U. Frith. (1985). Does the autistic child have a “theory of mind”? Cognition 21: 37–46. Autocatalysis Bauman, M., and T. Kemper, Eds. (1994). The Neurobiology of Autism. Baltimore: Johns Hopkins University Press. Ciaranello, A. L., and R. D. Ciaranello. (1995). The neurobiology of See SELF-ORGANIZING SYSTEMS infantile autism. Annual Review of Neuroscience 18: 101–128. Damasio, A. R., and R. G. Maurer. (1978). A neurological model Automata for childhood autism. Archives of Neurology 35: 777–786. Fletcher, P. C., F. Happé, U. Frith, S. C. Baker, R. J. Dolan, R. S. J. Frackowiak, and C. D. Frith. (1995). Other minds in the brain: a An automaton (pl. automata) was originally anything with functional imaging study of “theory of mind” in story compre- the power of self-movement, then, more specifically, a hension. Cognition 57: 109–128. machine with the power of self-movement, especially a fig- Frith, U., and F. Happé. (1994). 
Autism: beyond “theory of mind.” Cognition 50: 115–132.
Frith, U., J. Morton, and A. Leslie. (1991). The cognitive basis of a biological disorder. Trends in Neuroscience 14: 433–438.
Gillberg, C., and M. Coleman. (1992). The Neurobiology of the Autistic Syndromes. 2nd ed. London: Mac Keith Press.
Happé, F., S. Ehlers, P. Fletcher, U. Frith, M. Johansson, C. Gillberg, R. Dolan, R. Frackowiak, and C. Frith. (1996). “Theory of mind” in the brain. Evidence from a PET scan study of Asperger syndrome. NeuroReport 8: 197–201.
Hobson, R. P. J. (1993). Autism and the Development of Mind. Hove, Sussex: Erlbaum.
Hughes, C., J. Russell, and T. W. Robbins. (1994). Evidence for executive dysfunction in autism. Neuropsychologia 32: 477–492.
Kanner, L. (1943). Autistic disturbances of affective contact. Nervous Child 2: 217–250.
Kasari, C., M. D. Sigman, P. Baumgartner, and D. J. Stipek. (1993). Pride and mastery in children with autism. Journal of Child Psychology and Psychiatry 34: 353–362.
Ozonoff, S., B. F. Pennington, and S. J. Rogers. (1991). Executive function deficits in high-functioning autistic children: relationship to theory of mind. Journal of Child Psychology and Psychiatry 32: 1081–1106.
Pennington, B. F., and S. Ozonoff. (1996). Executive functions and developmental psychopathology. Journal of Child Psychology and Psychiatry 37: 51–87.
Schopler, E., and G. Mesibov, Eds. (1995). Learning and Cognition in Autism. New York: Plenum.
Smalley, S., R. Asarnow, and M. Spence. (1988). Autism and genetics: a decade of research. Archives of General Psychiatry 45: 953–961.
Wing, L. (1996). The Autistic Spectrum. A Guide for Parents and Professionals. London: Constable.
Wing, L., and J. Gould. (1979). Severe impairments of social interaction and associated abnormalities in children: epidemiology

ure that simulated the motion of living beings. Perhaps the most impressive such automata were those of Jacques de Vaucanson (1709–1782), including a duck that ate and drank with realistic motions of head and throat, produced the sound of quacking, and could pick up cornmeal and swallow, digest, and excrete it.

People acting in a mechanical, nonspontaneous way came to be called automata, but this begs the very question for which cognitive science seeks a positive answer: “Is the working of the human mind reducible to information processing embodied in the workings of the human brain?” that is, is human spontaneity and intelligence a purely material phenomenon? René DESCARTES (1596–1650) saw the functioning of nonhuman animals, and much of human function, as being explainable in terms of the automata of his day but drew the line at cognitive function. However, whereas Descartes’s view was based on clockwork and hydraulic automata, most cognitive science is based on a view of automata as “information processing machines” (though there is now a welcome increase of interest in embodied automata).

The present article describes key concepts of information processing automata from 1936 through 1956 (the year of publication of Automata Studies by Shannon and McCarthy), including Turing machines, finite automata, automata for formal languages, McCulloch-Pitts neural networks, and self-reproducing automata.

TURING (1936) and Post (1936) introduced what is now called a Turing machine (TM), consisting of a control box containing a finite program; an indefinitely extendable tape divided lengthwise into squares; and a device for scanning and then printing on one square of the tape at a time and subsequently moving the tape one square left or right or not at all. We start the machine with a finite sequence of symbols on the tape, and a program in the control box. The symbol scanned, and the instruction now being executed from the program, determine what new symbol is printed on the square, how the tape is moved, and what instruction is executed next.

We associate with a TM Z a numerical function f_Z by placing a number n encoded as a string ⟨n⟩ on the tape and starting Z scanning the leftmost square of ⟨n⟩. If and when Z stops, we decode the result to obtain the number f_Z(n). If Z never stops, we leave f_Z(n) undefined. Just consider the program Z, which always moves right on its tape (new squares are added whenever needed), as an example of a machine that never stops computing, no matter what its input. More subtle machines might, for example, test to see whether n is prime and stop computing only if that is the case. (We can only associate f_Z with Z after we have chosen our encoding. Nonnumerical functions can be associated with a TM once we have a “straightforward” way of unambiguously coding the necessarily finite or countable input and output structures on the tape.)

Turing’s Hypothesis (also called Church’s thesis; see CHURCH-TURING THESIS) is that a function is effectively computable if and only if it is computable by some TM. This statement is informal, but each attempt to formalize the notion of effectiveness using finite “programs” has yielded procedures equivalent to those implementable by TMs (or a subclass thereof).

Because each TM is described by a finite list of instructions we may effectively enumerate the TMs as Z_1, Z_2, Z_3, . . . ; given n we may effectively find Z_n, and given the list of instructions for Z, we may effectively find the n for which Z = Z_n. For example, we might list all the one-instruction programs first, then all those with two instructions, and so on, listing all programs of a given length in some suitable generalization of alphabetical order.

Turing showed that there is a universal Turing Machine that, given a coded description of Z_n on its tape as well as x, will proceed to compute f_{Z_n}(x) = f_n(x), if it is defined. This is obvious if we accept Turing’s hypothesis, for given n and x we find Z_n effectively, and then use it to compute f_n(x), and so there should exist a TM to implement the effective procedure of going from the pair (n, x) to the value f_n(x). More directly, we can program a Turing machine U that divides the data on its tape into two parts, that on the left providing the instructions for Z_n, and that on the right providing the string x(t) on which Z_n would now be computing. U is programmed to place markers against the current instruction from Z_n and the currently scanned symbol of x(t), and to move back and forth between instructions and data to simulate the effect of Z_n on x. For further details see Minsky (1967) and Arbib (1969), the second of which generalizes Turing machines to those that can work in parallel on multiple, possibly multidimensional tapes.

Is every function mapping natural numbers (N = {0, 1, 2, 3, . . .}) to natural numbers computable? Obviously not, for we have seen that the number of computable functions is countable, whereas the number of all functions is uncountable. To understand the latter claim, note that any real number between zero and one can be represented by an infinite decimal expansion 0.d_0 d_1 d_2 d_3 . . . d_n . . . and thus as a function f: N → N with f(n) = d_n, so that the uncountable set of these real numbers can be viewed as a subset of the set of all functions from N to N. However, the interest of computability theory is that there are “computationally interesting” functions that are not computable.

The following provides a simple example of an explicit proof of noncomputability. We define a total function h (i.e., h(x) is defined for every x in N) as follows. Let h(n) = n if f_n is itself total, while h(n) = n_0 if f_n is not total, where n_0 is a fixed choice of an integer for which f_{n_0} is a total computable function; h thus has the interesting property that a computable function f_n is total if and only if n = h(m) for some m. h is certainly a well-defined total function, but is h a computable function? The answer is no. For if h were computable, then so too would be the function f defined by f(n) = f_{h(n)}(n) + 1, and f would also be total. Then f is total and computable, and so f = f_{h(m)} for some m so that f(m) = f_{h(m)}(m). But, by definition, f(m) = f_{h(m)}(m) + 1, a contradiction! This is one example of the many things undecidable by any effective procedure.

The most famous example (proved by an extension of the above proof) is that we cannot tell effectively for arbitrary (n, x) whether Z_n will stop computing for input x. Thus the halting problem for TMs is unsolvable.

Finite automata provide, essentially, the input-output behavior of the TM control box without regard for its interaction with the tape. A finite automaton is abstractly described as a quintuple M = (X, Y, Q, δ, β) where X, Y, and Q are finite sets of inputs, outputs, and states, and if at time t the automaton is in state q and receives input x, it will emit output β(q) at time t, and then make the transition to state δ(q, x) at time t + 1.

Given this, we can view the control box of a Turing Machine as a finite automaton: let X be the set of tape symbols, Q the set of control-box instructions, and Y the set X × M of pairs (x, m) where x is the symbol to be printed on the tape and m is a symbol for one of the three possible tape moves (left, right, or not at all).

Another special case of the finite automaton definition takes Y to be the set {0, 1} (this is equivalent to dividing the set Q into the set F of designated final states for which β(q) = 1, and its complement) and then designates a specific state q_0 of Q to be the “initial state.” In this case, interest focuses on the language accepted by M, namely the set of strings w consisting of a sequence of 0 or more elements of X (we use X* to denote the set of such strings) with the property that if we start M in state q_0 and apply input sequence w, then M will end up in a state, denoted by δ*(q_0, w), that belongs to F.

The above examples define two sets of “languages” (in the sense of subsets of X*, i.e., strings on some specified alphabet): finite-state languages defined as those languages accepted by some finite automaton, and recursively enumerable sets that have many equivalent definitions, including “a set R is recursively enumerable if and only if there is a Turing machine Z such that Z halts computation on initial tape x if and only if x belongs to R.” Without going into details and definitions here, it is interesting to note that two intermediate classes of formal languages, the context-free languages and the context-sensitive languages, have also been associated with classes of automata. The former are associated with push-down automata and the latter with linear-bounded automata. Discussion of the relation of these language classes to human language was a staple topic for discussion in the 1960s (Chomsky and Miller 1963; Chomsky 1963; Miller and Chomsky 1963), and Chomsky has subsequently defined a number of other systems of formal grammar to account for human language competence. However, much work in PSYCHOLINGUISTICS now focuses on the claim that adaptive networks provide a better model of language performance than does any system based on using some fixed automaton structure which embodies such a formal grammar (e.g., Seidenberg 1995).

McCulloch and Pitts (1943) modeled the neuron as a logic element, dividing time into units so small that in each time period at most one spike can be initiated in the axon of a given neuron, with output 1 or 0 coding whether or not the neuron “fires” (i.e., has a spike on its axon). Each connection from neuron i to neuron j has an attached synaptic weight. They also associate a threshold with each neuron, and assume exactly one unit of delay in the effect of all presynaptic inputs on the cell’s output. A McCulloch-Pitts neuron (MP-neuron) fires just in case the weighted value of its inputs at that time reaches threshold.

Clearly, a network of MP-neurons functions like a finite automaton, as each neuron changes state synchronously on each tick of the time scale. Conversely, it was shown, though inscrutably, by McCulloch and Pitts that any finite automaton can be simulated by a suitable network of MP-neurons, providing formal “brains” for the TMs, which can carry out any effective procedure. Knowledge of these results inspired VON NEUMANN’s logical design for digital computers with stored programs (von Neumann, Burks, and Goldstine 1947–48).

Intuitively, one has the idea that a construction machine must build a simpler machine. But in biological systems, the complexity of an offspring matches that of the parent; and evolution may yield increasing biological complexity. Von Neumann (1951) outlined the construction of an automaton A that, when furnished with the description of any other automaton M (composed from some suitable collection of elementary parts) would construct a copy of M. However, A is not self-reproducing: A, supplied with a copy of its own description, will build a copy of A without its own description. The passage from a universal constructor to a self-reproducing automaton was spelled out in von Neumann (1966) in which the “organism” is a pattern of activity in an unbounded array of identical finite automata, each with only twenty-nine states. This was one root of the study of cellular automata.

As embodied humans we have far richer interactions with the world than does a TM; our “computations” involve all of our bodies, not just our brains; and biological neurons are amazingly more subtle than MP-neurons (Arbib 1995, pp. 4–11). A TM might in principle be able to solve a given problem but take far too long to solve it. Thus it is not enough that something be computable; it must be tractable, that is, computable with available resources (Garey and Johnson 1979). Distributed computation may render tractable a problem that a serial TM would solve too slowly. When some people say “the brain is a computer,” they talk as if the notion of “computer” were already settled. However, a human brain is built of hundreds of regions, each with millions of cells, each cell with tens of thousands of connections, each connection involving subtle neurochemical processes. To understand this is to push the theory of automata (and cognitive science) far beyond anything available today. Our concepts of automata will grow immensely over the coming decades as we better understand how the brain functions.

Arbib (1987) provides much further information on neural nets, finite automata, TMs, and automata that construct as well as compute, as well as a refutation of the claim that Gödel’s Incompleteness theorem (see GÖDEL’S THEOREMS) sets limits to the intelligence of automata. Many articles in Arbib (1995) and in the present volume explore the theme of connectionist approaches to cognitive modeling (see COGNITIVE MODELING, CONNECTIONIST).

See also COMPUTATION; COMPUTATION AND THE BRAIN; COMPUTATIONAL COMPLEXITY; COMPUTATIONAL THEORY OF MIND; FORMAL GRAMMARS; VISUAL WORD RECOGNITION

– Michael Arbib

References

Arbib, M. A. (1969). Theories of Abstract Automata. Englewood Cliffs, NJ: Prentice-Hall.
Arbib, M. A. (1987). Brains, Machines and Mathematics. Second edition. New York: Springer.
Arbib, M. A., Ed. (1995). The Handbook of Brain Theory and Neural Networks. Cambridge, MA: MIT Press. (See pp. 4–11.)
Chomsky, N. (1963). Formal properties of grammars. In R. D. Luce, R. R. Bush, and E. Galanter, Eds., Handbook of Mathematical Psychology, vol. 2. New York: Wiley, pp. 323–418.
Chomsky, N., and G. A. Miller. (1963). Introduction to the formal analysis of natural languages. In R. D. Luce, R. R. Bush, and E. Galanter, Eds., Handbook of Mathematical Psychology, vol. 2. New York: Wiley, pp. 269–321.
Garey, M. R., and D. S. Johnson. (1979). Computers and Intractability: A Guide to the Theory of NP-Completeness. New York: W. H. Freeman.
McCulloch, W. S., and W. H. Pitts. (1943). A logical calculus of the ideas immanent in nervous activity. Bull. Math. Biophys. 5: 115–133.
Miller, G. A., and N. Chomsky. (1963). Finitary models of language users. In R. D. Luce, R. R. Bush, and E. Galanter, Eds., Handbook of Mathematical Psychology, vol. 2. New York: Wiley, pp. 419–491.
Minsky, M. L. (1967). Computation: Finite and Infinite Machines. Englewood Cliffs, NJ: Prentice-Hall.
Post, E. L. (1936). Finite combinatory processes – formulation I. Journal of Symbolic Logic 1: 103–105.
Seidenberg, M. (1995). Linguistic morphology. In M. A. Arbib, Ed., The Handbook of Brain Theory and Neural Networks. Cambridge, MA: MIT Press.
Shannon, C. E., and J. McCarthy, Eds. (1956). Automata Studies. Princeton: Princeton University Press.
Turing, A. M. (1936). On computable numbers. Proc. London Math. Soc. Ser. 2, 42: 230–265.
von Neumann, J. (1951). The general and logical theory of automata. In L. A. Jeffress, Ed., Cerebral Mechanisms in Behaviour. New York: Wiley.
von Neumann, J. (1966). The Theory of Self-Reproducing Automata. Edited and completed by Arthur W. Burks. Urbana: University of Illinois Press.
von Neumann, J., A. Burks, and H. H. Goldstine. (1947–48). Planning and Coding of Problems for an Electronic Computing Instrument. Institute for Advanced Study, Princeton. (Reprinted in von Neumann’s Collected Works 5: 80–235.)

Automatic Processing

See AUTOMATICITY

Automaticity

Automaticity is a characteristic of cognitive processing in which practiced consistent component behaviors are performed rapidly, with minimal effort or with automatic allocation of attention to the processing of the stimulus. Most skilled behavior requires the development of automatic processes (e.g., walking, READING, driving, programming). Automatic processes generally develop slowly, with practice over hundreds of trials. An example of an automatic process for the skilled reader is encoding letter strings into their semantic meaning. As your eyes fixate on the word “red,” a semantic code representing a color and an acoustic image of the phonemes /r/ /e/ /d/ are activated. Automatic processes may occur unintentionally, such as the refocusing of your ATTENTION when you hear your name used in a nearby conversation at a party. Automatic processing can release unintentional behaviors, such as automatic capture errors (e.g., walking out of an elevator when the doors open on an unintended floor).

Automaticity develops when there is a consistent mapping (CM) between the stimuli and responses at some stage of processing. For example, in a letter search task, a subject responds to a set of letters called the “target set” and ignores the “distracter set.” If certain letter stimuli are consistently the target set, they will be attended and responded to whenever they occur. Automatic processing will develop with practice and the consistent target letters will attract attention and activate response processes. Automatic targets can be found rapidly in cluttered displays with little effort. Automaticity does not develop when stimuli have a varied mapping (VM) (e.g., when a letter that is a target on one trial is a distracter on the next).

Automatic processing (AP) is often contrasted with controlled or attentive processing. Controlled processing (CP) occurs early in practice, is maintained when there is a varied mapping, and is relatively slow and effortful.

Automatic processing shows seven qualitatively and quantitatively different processing characteristics relative to controlled processing. Automatic processing can be much faster than controlled processing (e.g., 2 ms per category for AP versus 200 ms for CP). Automatic processing is parallel across perceptual channels, memory comparisons, and across levels of processing, whereas controlled processing is serial. Automatic processing requires minimal effort, which enables multitask processing. Automatic processing is robust and highly reliable relative to controlled processing despite fatigue, exhaustion, and the effects of alcohol. On the other hand, automatic processing requires substantial consistent practice, typically hundreds of trials for a single task before accuracy is attained, whereas controlled processing often attains accuracy for a single task in a few trials. Subjects have reduced control of automatic processing, which attracts attention or elicits responses if task demands change relative to the subject’s previous consistent training. Automatic processing produces less memory modification than controlled processing, which causes a stimulus to be processed without MEMORY of the processing (e.g., Did you lock the door when leaving the car?).

Models of automaticity seek to account for the characteristics noted above and, in particular, for the contrasts between automatic and controlled processing. They divide into two kinds: incremental learning and instance-based. In the incremental learning models (e.g., James 1890/1950; LaBerge 1975; Schneider, Dumais, and Shiffrin 1984), the strength of association between the stimulus and a priority of the signal increases each time a positive stimulus-response sequence occurs. After a sufficient number of such events occur, the priority of the response is sufficient to result in an output of that stage of processing with the minimal need for attention. Stimuli not consistently attended to do not obtain a high priority, hence do not produce an automatic response. In contrast, the instance-based model of Logan (1992), for example, assumes that all instances are stored and the response time is determined by a parallel memory access in which the first retrieved instance determines the reaction time. In this model, the importance of consistency is due to response conflict between the instances slowing the response.

The concept of automaticity has been widely applied to many areas of psychology to interpret processing differences. In the area of attentional processing, it has been applied to interpret effects of processing speed, effort, visual search, and interference effects. In skill acquisition, it has been applied to interpret changes in performance with practice and the development of procedural knowledge. In the understanding of human error, it has been applied to understand unintended automatic behaviors such as capture errors and workload-related errors for controlled processing. In clinical disorders such as schizophrenia, difficulties in maintaining attention can result from too frequent or too few automatic attention shifts, and perseverative behavior can result from automatic execution of component skills or lack of memory modification for automatic behaviors. In addictions such as smoking, a major obstacle in breaking a habit is the difficulty of inhibiting automatic behaviors linked to social contexts. In the aging literature, there is evidence that automatic and controlled behaviors may develop and decline differentially with age and that the aged may have more difficulty learning and altering automatic behaviors.

The concept of automatic processing has had a long history in cognitive psychology. The topic of automaticity was a major focus in WILLIAM JAMES’s Principles of Psychology (1890/1950). In modern times, automatic processing has been an important issue in the attention literature (Posner and Snyder 1975; Schneider and Shiffrin 1977; Shiffrin 1988), the skill acquisition literature (LaBerge 1975), and the skill acquisition and memory literature (Anderson 1992; Schneider and Detweiler 1987; Logan 1992).

See also AGING AND COGNITION; ATTENTION IN THE HUMAN BRAIN; AUDITORY ATTENTION; EXPERTISE; EYE MOVEMENTS AND VISUAL ATTENTION; MOTOR CONTROL

– Walter Schneider

References

Anderson, J. R. (1992). Automaticity and the ACT theory. American Journal of Psychology 105: 165–180.
James, W. (1890/1950). The Principles of Psychology, vol. 1. Authorized edition. New York: Dover.
LaBerge, D. (1975). Acquisition of automatic processing in perceptual and associative learning. In P. M. A. Rabbit and S. Dornic, Eds., Attention and Performance V. New York: Academic Press.
Logan, G. D. (1992). Attention and preattention in theories of automaticity. American Journal of Psychology 105: 317–339.
Posner, M. I., and C. R. R. Snyder. (1975). Attention and cognitive control. In R. L. Solso, Ed., Information Processing and Cognition: The Loyola Symposium. Hillsdale, NJ: Erlbaum, pp. 55–85.
Schneider, W., and M. Detweiler. (1987). A connectionist/control architecture for working memory. In G. H. Bower, Ed., The Psychology of Learning and Motivation, vol. 21. New York: Academic Press, pp. 54–119.
Schneider, W., S. T. Dumais, and R. M. Shiffrin. (1984). Automatic and control processing and attention. In R. Parasuraman, R. Davies, and R. J. Beatty, Eds., Varieties of Attention. New York: Academic Press, pp. 1–27.
Schneider, W., and R. M. Shiffrin. (1977). Automatic and controlled information processing in vision. In D. LaBerge and S. J. Samuels, Eds., Basic Processes in Reading: Perception and Comprehension. Hillsdale, NJ: Erlbaum, pp. 127–154.
Shiffrin, R. M. (1988). Attention. In R. C. Atkinson, R. J. Herrnstein, G. Lindzey, and R. D. Luce, Eds., Steven’s Handbook of Human Experimental Psychology, vol. 2, Learning and Cognition. New York: Wiley, pp. 739–811.

Further Readings

Bargh, J. A. (1992). The ecology of automaticity: Toward establishing the conditions needed to produce automatic processing effects. American Journal of Psychology 105: 181–199.
Healy, A. F., D. W. Fendrich, R. J. Crutcher, W. T. Wittman, A. T. Gest, K. R. Ericcson, and L. E. Bourne, Jr. (1992). The long-term retention of skills. In A. F. Healy, S. M. Kosslyn, and R. M. Shiffrin, Eds., From Learning Processes to Cognitive Processes: Essays in Honor of William K. Estes, vol. 2. Hillsdale, NJ: Erlbaum, pp. 87–118.
Naatanen, R. (1992). Attention and Brain Function. Hillsdale, NJ: Erlbaum.
Neumann, O. (1984). Automatic processing: A review of recent findings and a plea for an old theory. In W. Prinz and A. F. Sanders, Eds., Cognition and Motor Processes. Berlin and Heidelberg: Springer-Verlag.
Norman, D. A., and D. G. Bobrow. (1975). On data-limited and resource-limited processes. Cognitive Psychology 7: 44–64.
Schneider, W., M. Pimm-Smith, and M. Worden. (1994). The neurobiology of attention and automaticity. Current Opinion in Neurobiology 4: 177–182.

Autonomy of Psychology

Psychology has been considered an autonomous science in at least two respects: its subject matter and its methods. To say that its subject matter is autonomous is to say that psychology deals with entities (properties, relations, states) that are not dealt with or not wholly explicable in terms of physical (or any other) science. Contrasted with this is the idea that psychology employs a characteristic method of explanation, which is not shared by the other sciences. I shall label the two senses of autonomy “metaphysical autonomy” and “explanatory autonomy.”

Whether psychology as a science is autonomous in either sense is one of the philosophical questions surrounding the (somewhat vague) doctrine of “naturalism,” which concerns itself with the extent to which the human mind can be brought under the aegis of natural science. In their contemporary form, these questions had their origin in the “new science” of the seventeenth century. Early materialists like Hobbes (1651) and La Mettrie (1748) rejected both explanatory and metaphysical autonomy: Mind is matter in motion, and the mind can be studied by the mathematical methods of the new science just as any matter can. But while materialism (and therefore the denial of metaphysical autonomy) had to wait until the nineteenth century before becoming widely accepted, the denial of explanatory autonomy remained a strong force in empiricist philosophy. HUME described his Treatise of Human Nature (1739–1740) as an “attempt to introduce the experimental method of reasoning into moral subjects,” where “moral” signifies “human.” And subsequent criticism of Hume’s views, notably by KANT and Reid, ensured that the question of naturalism (whether there can be a “science of man”) was one of the central questions of nineteenth-century philosophy, and a question that hovered over the emergence of psychology as an independent discipline (see Reed 1994).

In the twentieth century, much of the philosophical debate over the autonomy of psychology has been inspired by the logical positivists’ discussions of the UNITY OF SCIENCE (see Carnap 1932–1933; Feigl 1981; Oppenheim and Putnam 1958). For the positivists, physical science had a special epistemological and ontological authority: The other sciences (including psychology) must have their claims about the world vindicated by being translated into the language of physics. This extreme REDUCTIONISM did not survive long after the decline of the positivist doctrines which generated it, and it cannot have helped prevent this decline that no positivist actually succeeded in translating any psychological claims into the language of physics. Thus even though positivism was a major influence on the rise of postwar PHYSICALISM, later physicalists tended to distinguish their metaphysical doctrines from the more extreme positivist claims. J. J. C. Smart (1959), for example, asserted that mental and physical properties are identical, but denied that the psychological language we use to describe these properties can be translated into physical language. This is not yet to concede psychology’s explanatory autonomy. That psychology employs a different language does not mean it must employ a different explanatory method. And Smart’s identity claim obviously implies the denial of psychology’s metaphysical autonomy.

On the other hand, many philosophers think that the possibility of multiple realization forces us to accept the metaphysical autonomy of psychology. A property is multiply realized by underlying physical properties when not all of the instances of that property are instances of the same physical property. This is contrasted with property identity, where a brain property being identical with a mental property, for example, entails that all and only instances of the one property are instances of the other. Hilary Putnam (1975) argued influentially that there are good reasons for thinking that psychological properties are multiply realized by physical properties, on the grounds that psychological properties are functional properties of organisms: properties identified by the causal role they play in the organism’s psychological organization (see FUNCTIONALISM).

This kind of functionalist approach implies a certain degree of metaphysical autonomy: Because psychological properties are multiply realized, it seems that they cannot be identical with physical properties of the brain (but contrast Lewis 1994). It does not, however, imply a Cartesian dualist account of the mind, because all these properties are properties of physical objects, and the physical still has a certain ontological priority, sometimes expressed by saying that everything supervenes on the physical (see SUPERVENIENCE and MIND-BODY PROBLEM). The picture that emerges is a “layered world”: The properties of macroscopic objects are multiply realized by more microscopic properties, eventually arriving at the properties which are the subject matter of fundamental physics (see Fodor 1974; Owens 1989).

With the exception of some who hold to ELIMINATIVE MATERIALISM, who see the metaphysical autonomy of commonsense (or “folk”) psychological categories as a reason for rejecting the entities psychology talks about, the “layered world” picture is a popular account of the relationship between the subject matters of the various sciences. But what impact does this picture have on the question of the explanatory autonomy of psychology? Here matters become a little complex. The “layered world” picture does suggest that the theories of the different levels of nature can be relatively independent. There is also room for different styles of explanation. Robert Cummins (1983) argues that psychological explanation does not conform to the “covering law” pattern of explanation employed in the physical sciences (where to explain a phenomenon is to show it to be an instance of a law of nature). And some influential views of the nature of computational psychology treat it as involving three different levels of EXPLANATION (see Marr 1982). But in general, nothing in the “layered world” picture prevents psychology from having a properly scientific status; it is still the subject matter (psychological properties and relations) of psychology that ultimately sets it apart from physics and the other sciences. In short, the “layered world” conception holds that psychological explanation has its autonomy in the sense that it does not need to be reduced to physical explanation, but nonetheless it is properly scientific.

This view can be contrasted with Davidson’s (1970) view that there are features of our everyday psychological explanations that prevent these explanations from ever becoming scientific. Davidson argues that psychological explanations attributing PROPOSITIONAL ATTITUDES are governed by normative principles: In ascribing a propositional attitude to a person, we aim to make that person’s thought and action as reasonable as possible (for related views, see McDowell 1985; Child 1994). In natural science, no comparable normative principles are employed. It is this dependence on the “constitutive ideal of rationality” that prevents a psychology purporting to deal with the propositional attitudes from ever becoming scientific, in the sense that physics is scientific. According to Davidson, decision theory is an attempt to systematize ordinary explanations of actions in terms of belief and desire, by employing quantitative measures of degrees of belief and desire. But because of the irreducibly normative element involved in propositional attitude explanation, decision theory can never be a natural science (for more on this subject, see Davidson 1995). Where the “layered world” picture typically combines a defense of metaphysical autonomy with an acceptance of the properly scientific (or potentially scientific) nature of all psychological explanation, Davidson’s ANOMALOUS MONISM combines strong explanatory autonomy with an identity theory of mental and physical events.

See also BRENTANO; INTENTIONALITY; PSYCHOLOGICAL LAWS; RATIONALISM VS. EMPIRICISM

– Tim Crane

References

Carnap, R. (1932–1933). Psychology in physical language. Erkenntnis 3.
Child, W. (1994). Causality, Interpretation and the Mind. Oxford: Clarendon Press.
Cummins, R. (1983). The Nature of Psychological Explanation. Cambridge, MA: MIT Press.
Davidson, D. (1970). Mental events. In L. Foster and J. Swanson, Eds., Experience and Theory. London: Duckworth, pp. 79–101.
Davidson, D. (1995). Can there be a science of rationality? International Journal of Philosophical Studies 3.
Feigl, H. (1981). Physicalism, unity of science and the foundations of psychology. In Feigl, Inquiries and Provocations. Dordrecht: Reidel.
Fodor, J. (1974). Special sciences: The disunity of science as a working hypothesis. Synthèse 28.
Hobbes, T. (1651). Leviathan. Harmondsworth, England: Penguin Books, 1968.
Hume, D. (1739–1740). A Treatise of Human Nature. 2nd ed. Oxford: Clarendon Press, 1978.
La Mettrie, J. (1748). Man the Machine. Cambridge: Cambridge University Press, 1996.
Lewis, D. (1994). Reduction of mind. In S. Guttenplan, Ed., A Companion to the Philosophy of Mind. Oxford: Blackwell, pp. 412–431.
Marr, D. (1982). Vision. San Francisco: Freeman.
McDowell, J. (1985). Functionalism and anomalous monism. In E. LePore and B. Mclaughlin, Eds., Actions and Events: Perspectives on the Philosophy of Donald Davidson. Oxford: Blackwell.
Oppenheim, P., and H. Putnam. (1958). The unity of science as a working hypothesis. In H. Feigl and G. Maxwell, Eds., Minnesota Studies in the Philosophy of Science. Minneapolis: University of Minnesota Press.
Owens, D. (1989). Levels of explanation. Mind 98.
Putnam, H. (1975). The nature of mental states. In Putnam, Philosophical Papers, vol. 2. Cambridge: Cambridge University Press.
Reed, E. (1994). The separation of psychology from philosophy: Studies in the sciences of mind, 1815–1879. In S. Shanker, Ed., Routledge History of Philosophy: The 19th Century. London: Routledge.
Smart, J. J. C. (1959). Sensations and brain processes. Philosophical Review 68.

Autopoiesis

See SELF-ORGANIZING SYSTEMS

Backpropagation

See NEURAL NETWORKS; RECURRENT NETWORKS

Bartlett, Frederic Charles

Frederic C. Bartlett (1886–1969) was Britain’s most outstanding psychologist between the World Wars. He was a cognitive psychologist long before the cognitive revolution of the 1960s. His three major contributions to current cognitive science are a methodological argument for the study of “ecologically valid” experimental tasks, a reconstructive approach to human memory, and the theoretical construct of the “schema” to represent generic knowledge.

Receiving a bachelor’s degree in philosophy from the University of London (1909), Bartlett carried out additional undergraduate work in the moral sciences at Cambridge University (1914), where he later became director of the Cambridge Psychological Laboratory (1922) and was eventually appointed the first Professor of Experimental Psychology at Cambridge (1931). He was made Fellow of the Royal Society in 1932 and knighted in 1948.

Bartlett’s unique position in the development of psychology derived in part from his multidisciplinary background. At London and also later at Cambridge, Bartlett was influenced by the philosophy of James Ward and George Stout (Bartlett 1936), who developed systems that were antiatomist and antiassociationist, as opposed to the traditional British empiricist view.

lett’s early writings on these topics, the field would now be much more of a social-based discipline than it is.

As a young faculty member, Bartlett had extensive interactions with the neurologist Henry Head. While Bartlett never directly concerned himself with neurophysiological research (Broadbent 1970; Zangwill 1972), he provided intellectual support for a number of students who went on to become the first generation of British neuropsychologists (e.g., Oliver Zangwill, Brenda Milner). Bartlett’s discussions with Henry Head about physiological “schemata” (used to account for aspects of human posture) were another important source of Bartlett’s thinking on the psychological construct of “schema” (cf. Bartlett 1932).

Through a complex series of events that occurred late in his career, Bartlett had a direct hand in initiating the information-processing framework that is a major component of current cognitive science. During World War II, a brilliant young student named Kenneth Craik came to Cambridge to work with Bartlett. Craik carried out early work on control engineering and cybernetics, only to be killed in a bicycle accident the day before World War II ended. Bartlett was able to see the importance of Craik’s approach and took over its development. Donald BROADBENT, in his autobiography (1980), notes that when he arrived at Cambridge after the war, he was exposed to a completely original point of view about how to analyze human behavior. Broadbent went on to develop the first information-processing box models of human behavior (cf. Weiskrantz 1994).

Bartlett worked on applied problems throughout his career, believing that in an effort to isolate and gain control of psychological processes, much laboratory research in psychology missed important phenomena that occurred in more natural settings. This argument for ECOLOGICAL VALIDITY (e.g., Neisser 1978) makes up an important subtheme in current cognitive science (see also ECOLOGICAL PSYCHOLOGY).

Bartlett’s methodological preference for ecologically valid tasks led him to reject the traditional approach to the study of human memory that involved learning lists of nonsense syllables. In his book Remembering (1932), Bartlett reported a series of memory studies that used a broader range of material, including texts of folktales from Native American cultures. Bartlett focused not on the number of correct words recalled, but on the nature of the changes made in the recalls. He found that individuals recalling this
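The Turing machine of the Automata entry (a finite program, an extendable tape, and a scan-print-move cycle) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and does not come from the entry: the function `run_tm`, the blank symbol `_`, the dictionary encoding of the program, the unary number encoding, and the step budget. The sketch moves a head over the tape rather than moving the tape under the head, which is equivalent.

```python
from collections import defaultdict

BLANK = "_"  # assumed blank symbol for unwritten squares

def run_tm(program, tape_input, start="q0", max_steps=10_000):
    """Run a one-tape Turing machine.

    program maps (state, scanned_symbol) -> (symbol_to_print, move, next_state),
    with move in {"L", "R", "N"}. The machine halts when no instruction applies.
    Returns the final tape contents, or None if the step budget is exhausted
    (a real TM may simply never halt, so a budget is only a practical cutoff).
    """
    tape = defaultdict(lambda: BLANK, enumerate(tape_input))
    state, head = start, 0
    for _ in range(max_steps):
        key = (state, tape[head])
        if key not in program:  # no applicable instruction: the machine stops
            cells = [tape[i] for i in range(min(tape), max(tape) + 1)]
            return "".join(cells).strip(BLANK)
        symbol, move, state = program[key]
        tape[head] = symbol  # print on the scanned square
        head += {"L": -1, "R": 1, "N": 0}[move]
    return None

# Successor in unary: n is encoded as n copies of "1"; the machine appends one more.
SUCCESSOR = {
    ("q0", "1"): ("1", "R", "q0"),       # skip over the existing 1s
    ("q0", BLANK): ("1", "N", "halt"),   # print one more 1, then stop
}

# The entry's example of a machine that always moves right and never stops.
RIGHT_FOREVER = {
    ("q0", "1"): ("1", "R", "q0"),
    ("q0", BLANK): (BLANK, "R", "q0"),
}

print(run_tm(SUCCESSOR, "111"))                     # tape after computing 3 + 1
print(run_tm(RIGHT_FOREVER, "11", max_steps=100))   # None: budget exhausted
```

With the unary encoding assumed above, the function f_Z associated with `SUCCESSOR` is n ↦ n + 1, while f_Z for `RIGHT_FOREVER` is undefined everywhere; the step budget can only report "not halted yet", not decide halting.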
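The acceptor form of the finite automaton in the Automata entry (outputs in {0, 1}, an initial state q_0, and a set F of final states) can be sketched in a few lines of Python. The names `delta_star` and `accepts`, the dictionary encoding of δ, and the even-number-of-1s example are illustrative assumptions, not part of the entry.

```python
def delta_star(delta, q, w):
    """Extend the one-step transition function delta to whole strings:
    delta_star(q, w) is the state reached from q after reading w."""
    for x in w:
        q = delta[(q, x)]
    return q

def accepts(delta, q0, final_states, w):
    """M accepts w exactly when delta_star(q0, w) belongs to F."""
    return delta_star(delta, q0, w) in final_states

# Hypothetical example over X = {"0", "1"}: accept strings with an even number of 1s.
EVEN_ONES = {
    ("even", "0"): "even",
    ("even", "1"): "odd",
    ("odd", "0"): "odd",
    ("odd", "1"): "even",
}

print(accepts(EVEN_ONES, "even", {"even"}, "1001"))  # True: two 1s
print(accepts(EVEN_ONES, "even", {"even"}, "10"))    # False: one 1
```

The output function β of the quintuple is implicit here: β(q) = 1 when q is in `final_states` and 0 otherwise, which is the division of Q into F and its complement.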
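The McCulloch-Pitts neuron of the Automata entry (binary output, weighted inputs, a threshold, and one unit of delay) can be sketched as a synchronous update rule. The function name `mp_step`, the matrix layout `weights[i][j]` for the connection from neuron i to neuron j, and the small AND/OR network are assumptions made for illustration only.

```python
def mp_step(weights, thresholds, state):
    """One tick of a network of MP-neurons.

    state[i] is 1 if neuron i fired at time t, else 0. Neuron j fires at
    time t + 1 exactly when the weighted sum of the time-t outputs feeding
    it reaches its threshold. All neurons update synchronously.
    """
    n = len(state)
    return [
        1 if sum(weights[i][j] * state[i] for i in range(n)) >= thresholds[j] else 0
        for j in range(n)
    ]

# Hypothetical 4-neuron net: neurons 0 and 1 act as inputs,
# neuron 2 computes AND of them, neuron 3 computes OR of them.
WEIGHTS = [
    [0, 0, 1, 1],  # neuron 0 feeds neurons 2 and 3
    [0, 0, 1, 1],  # neuron 1 feeds neurons 2 and 3
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
THRESHOLDS = [1, 1, 2, 1]

print(mp_step(WEIGHTS, THRESHOLDS, [1, 1, 0, 0]))  # both inputs fired
print(mp_step(WEIGHTS, THRESHOLDS, [1, 0, 0, 0]))  # only one input fired
```

Because the whole network state is a finite binary vector and each tick is a fixed function of the previous state, the network behaves as a finite automaton whose states are these vectors, which is the first half of the correspondence the entry describes.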
At Cambridge, Bartlett’s major type of material made inferences and other changes that led intellectual influences were C. S. Myers and W. H. R. Rivers. to a more concise and coherent story (conventionalization). Although both had been trained as physicians, Myers was an Overall, Bartlett concluded that human memory is not a experimental psychologist and Rivers a cultural and physical reproductive but a reconstructive process. Although Bart- anthropologist when Bartlett studied with them there. It was lett’s approach made little impact on laboratory memory Myers who introduced Bartlett to German laboratory psy- research at the time, with the advent of the cognitive revolu- chology with a particular focus on PSYCHOPHYSICS. tion (e.g., Neisser 1967), his ideas became an integral part His work with Rivers had a strong impact on Bartlett’s of the study of human MEMORY, and by the early 1980s, his thinking. He published a number of books and papers book Remembering was the second most widely cited work devoted to social issues and the role of psychology in anthro- in the area of human memory (White 1983). pological research (Harris and Zangwill 1973). The anthro- To account for his memory data, Bartlett developed the pological study of the conventionalization of human cultural concept of the “schema” (see SCHEMATA). He proposed that artifacts over time served as a principal source of Bartlett’s much of human knowledge consists of unconscious mental ideas about schemata. Recently social constructivists (Cos- structures that capture the generic aspects of the world (cf. tall 1992) have argued that if psychology had followed Bart- Brewer and Nakamura 1984). He argued that the changes Basal Ganglia 67 he found in story recall could be accounted for by assuming Basal Ganglia that “schemata” operate on new incoming information to fill in gaps and rationalize the resulting memory representa- tion. 
Bartlett’s schema concept had little impact on memory The CEREBRAL CORTEX is massively interconnected with a research in his lifetime, and, in fact, at the time of his death, large group of subcortical structures known as the “basal his own students considered it to have been a failure ganglia.” In general, the basal ganglia can be described as a (Broadbent 1970; Zangwill 1972). However, the schema set of input structures that receive direct input from the cere- construct made an impressive comeback in the hands of bral cortex, and output structures that project back to the computer scientist Marvin Minsky. In the early stages of the cerebral cortex via the THALAMUS. Thus a major feature of development of the field of artificial intelligence (AI), Min- basal ganglia anatomy is their participation in multiple sky was concerned about the difficulty of designing com- loops with the cerebral cortex, termed cortico-basal gan- puter models to exhibit human intelligence. He read glia-thalamo-cortical circuits (see Alexander, DeLong, and Bartlett’s 1932 book and concluded that humans were using Strick 1986, figure 1). top-down schema-based information to carry out many psy- Although the term basal ganglia was first used to indi- chological tasks. In a famous paper, Minsky (1975) pro- cate the putamen and globus pallidus (Ringer 1879), it now posed the use of frames (i.e., schemata) to capture the refers to the striatum, globus pallidus, subthalamic nucleus needed top-down knowledge. In its new form, the schema (STN), and substantia nigra. The striatum has three subdivi- construct has widely influenced psychological research on sions, the caudate, putamen, and ventral striatum, that human memory (Brewer and Nakamura 1984) and the field together form the main input structures of the basal ganglia. of AI. The globus pallidus consists of an external segment (GPe) and an internal segment (GPi). 
The GPe and STN are See also FRAME-BASED SYSTEMS; INFORMATION THE- thought to represent “intermediate” basal ganglia structures, ORY; MENTAL MODELS; TOP-DOWN PROCESSING IN VISION although the STN also receives some direct cortical inputs. — William F. Brewer The substantia nigra comprises two major cell groups, the pars compacta (SNpc) and pars reticulata (SNpr). The SNpr and GPi are the major output structures of the basal ganglia. References There has been considerable progress in defining the intrinsic organization of basal ganglia circuits (see Parent Bartlett, F. C. (1932). Remembering. Cambridge: Cambridge Uni- and Hazrati 1995). Briefly, inputs from the cerebral cortex versity Press. to the striatum use glutamate (GLU) as an excitatory neu- Bartlett, F. C. (1936). Frederic Charles Bartlett. In C. Murchison, Ed., A History of Psychology in Autobiography, vol. 3. Worces- rotransmitter to synapse on medium-sized (12–20 µm) ter, MA: Clark University Press, pp. 39–52. spiny stellate neurons, which are also the projection or out- Brewer, W. F., and G. V. Nakamura. (1984). The nature and func- put neurons of the striatum. Most of the cortical input ter- tions of schemas. In R. S. Wyer, Jr., and T. K. Srull, Eds., Hand- minates in striatal regions known as the “matrix,” which book of Social Cognition, vol. 1. Hillsdale, NJ: Erlbaum, pp. contain high levels of acetylcholinesterase, the enzyme 119–160. responsible for breaking down the neurotransmitter acetyl- Broadbent, D. E. (1970). Frederic Charles Bartlett. In Biographical choline (Ragsdale and Graybiel 1981). Efferents from other Memoirs of Fellows of the Royal Society, vol. 16. London: cortical areas terminate in striatal regions termed “patches” Royal Society, pp. 1–13. or “striosomes,” which have low levels of acetylcholinest- Broadbent, D. E. (1980). Donald E. Broadbent. In G. Lindzey, Ed., erase. In addition to these differences in afferent input, the A History of Psychology in Autobiography, vol. 7. 
San Fran- cisco: Freeman, pp. 39–73. medium spiny stellate cells in the striosomes and matrix Costall, A. (1992). Why British psychology is not social: Frederic also have different efferent connections. Output cells in Bartlett's promotion of the new academic discipline. Canadian striosomes project to neurons in SNpc that produce the Psychology 33: 633–639. neurotransmitter dopamine. The axons of these SNpc cells Harris, A. D., and O. L. Zangwill. (1973). The writings of Sir Fre- project back to the striatum, where they release dopamine. deric Bartlett, C.B.E., F.R.S.: An annotated handlist. British The net effect of dopamine on striatal cells depends on the Journal of Psychology 64: 493–510. type of receptors present. Minsky, M. (1975). A framework for representing knowledge. In P. Output neurons in the matrix project to GPe, GPi, or H. Winston, Ed., The Psychology of Computer Vision. New SNpr. These striatal cells use the neurotransmitter gamma- York: McGraw-Hill, pp. 211–277. aminobutyric acid (GABA) to inhibit their targets. The Neisser, U. (1967). Cognitive Psychology. New York: Appleton- Century-Crofts. matrix cells projecting to GPi or SNpr express high levels of Neisser, U. (1978). Memory: What are the important questions? In a neuropeptide called “substance P (SP)” and are excited by M. M. Gruneberg, P. E. Morris, and R. N. Sykes, Eds., Practi- the action of dopamine on their D1 receptors. In contrast, cal Aspects of Memory. London: Academic Press, pp. 3–14. matrix cells projecting to GPe express high levels of Weiskrantz, L. (1994). Donald Eric Broadbent. In Biographical enkephalin (ENK) and are inhibited by the action of dopa- Memoirs of Fellows of the Royal Society, vol. 40. London: mine on their D2 receptors (figure 1). Royal Society, pp. 33–42. Efferents from GPe project largely to the STN, GPi and White, M. J. (1983). Prominent publications in cognitive psychol- SNpr and use GABA to inhibit their targets. Neurons in the ogy. 
Memory and Cognition 11: 423–427. STN project to GPi or SNpr where they use GLU to excite Zangwill, O. L. (1972). Remembering revisited. Quarterly Journal neurons in both structures (figure 1). There is also evidence of Experimental Psychology 24: 123–138. 68 Basal Ganglia Subsequent experiments have supported this proposal and also suggested that basal ganglia-thalamocortical pro- jections to the frontal lobe are topographically organized into discrete output channels (Hoover and Strick 1993; Mid- dleton and Strick 1994). Furthermore, it is now apparent that basal ganglia output is directed to cortical areas outside the frontal lobe, including a region of the temporal lobe involved in visual processing (Middleton and Strick 1996a). Thus the anatomical substrate exists for basal ganglia output to influence multiple motor and nonmotor areas of the cere- bral cortex. Consequently, current views of basal ganglia function emphasize the impact these subcortical nuclei may have on a broad spectrum of behavior. Several lines of evidence implicate the basal ganglia in forms of “habit learning” that involve the creation of novel associations between stimuli and responses. For example, individuals with Parkinson’s disease (PD) or Huntington's disease (HD) have been shown to be impaired in the perfor- mance of tasks that depend on habit learning (Knowlton, Mangels, and Squire 1996; Knowlton et al. 1996). Both PD and HD arise from the degeneration of specific cell groups in the basal ganglia (the SNpc and striatum, respectively). Interestingly, SINGLE-NEURON RECORDING studies in mon- keys have shown that “tonically active neurons” in the stria- for GPe projections to the reticular nucleus of the thalamus, tum change their firing properties as an association is built but the significance of this projection is unknown. between a specific sensory stimulus and an appropriate Neurons in the GPi and SNpr are the principal outputs of motor response (Aosaki et al. 1994). 
These neurons are the basal ganglia. These neurons innervate a specific set of thought to be large (50–60 µm) aspiny cholinergic interneu- thalamic nuclei and use GABA as an inhibitory transmitter. rons. Similarly, some neurons in the SNpc are preferentially Output neurons in the thalamus that receive basal ganglia activated by appetitive rewards or stimuli that predict the input use GLU as a neurotransmitter to excite their targets. occurrence of such rewards. Together, these striatal and Although some of these thalamic neurons project back to nigral neurons may form part of the neural substrate under- the striatum, and thus form a closed feedback loop with the lying behavioral reinforcement (Schultz, Dayan, and Mon- basal ganglia, the major output from the basal ganglia is to tague 1997). thalamic neurons that in turn project to the cerebral cortex. Other forms of learning also appear to be influenced by This pathway forms the efferent limb of the cortico-basal the basal ganglia. Physiological studies have shown that ganglia-thalamocortical circuit. Output neurons in SNpr portions of the striatum and pallidum are activated during and GPi also project to brain stem nuclei such as the supe- the performance of tasks that require learning a sequence of rior colliculus and pedunculopontine nucleus. The projec- movements (Jenkins et al. 1994; Mushiake and Strick 1995; tion to the colliculus appears to play a role in the generation Kermadi and Joseph 1995). Moreover, some patients with of eye and head movements. The function of the peduncu- PD and HD are selectively impaired on motor learning lopontine projection is more obscure. Pedunculopontine tasks, but not on other forms of learning (Heindel et al. neurons appear to largely project back upon SNpc, GPi, 1989). These observations suggest that the basal ganglia and STN. may play a critical role in what has been termed procedural Recently, there have been some dramatic changes in con- or motor-skill learning. 
cepts about the function of basal ganglia loops with the There is also evidence to support the involvement of the cerebral cortex. These loops were thought to collect inputs basal ganglia in non-motor cognitive processes. First, some from widespread cortical areas in the frontal, parietal, and neurons in the basal ganglia display activity related to sen- temporal lobes and to “funnel” this information back to the sory and cognitive functions but not motor responses (Hiko- primary motor cortex or other cortical motor areas for use in saka and Wurtz 1983; Mushiake and Strick 1995; Brown, MOTOR CONTROL (Kemp and Powell 1971). New observa- Desimone, and Mishkin 1996). Second, some individuals tions have led to the suggestion that basal ganglia loops are with PD and HD have striking cognitive and visual deficits, involved in a much more diverse range of behavior includ- such as impaired recognition of faces and facial expressions, ing MOTOR LEARNING and cognition. For example, Alex- that actually precede the development of prominent motor ander, DeLong and Strick (1986) have proposed that basal symptoms (Jacobs, Shuren, and Heilman 1995a,b). Third, ganglia output targeted at least five regions of the frontal other patients with basal ganglia lesions exhibit profound lobe: two cortical areas concerned with skeletomotor and cognitive, visual, and sensory disturbances. For example, OCULOMOTOR CONTROL, and three regions of the prefrontal lesions of the globus pallidus or SNpr have been reported to cortex involved in WORKING MEMORY, ATTENTION, and produce working memory deficits, obsessive-compulsive emotional behavior. behavior, apathy, and visual hallucinations (Laplane et al. Basal Ganglia 69 1989; McKee et al. 1990). There is also growing evidence positron-emission tomography. Journal of Neuroscience 14: 3775–3790. that alterations in the basal ganglia accompany disorders Kemp, J. M., and T. P. S. Powell. (1971). 
The connexions of the such as schizophrenia, depression, obsessive-compulsive striatum and globus pallidus: synthesis and speculation. Philo- disorder, Tourette's syndrome, AUTISM, and attention deficit sophical Transactions of the Royal Society of London, B262: disorder (for references, see Middleton and Strick 1996b; 441–457. Castellanos et al. 1996). Finally, the current animal model Kermadi, I., and J. P. Joseph. (1995). Activity in the caudate of PD uses high doses of a neurotoxin called MPTP (1- nucleus of monkey during spatial sequencing. Journal of Neu- methyl-4-phenyl-1,2,3,6-tetrahydropyridine) to reproduce rophysiology 74: 911–933. the neuropathology and motor symptoms of this disorder Knowlton, B. J., J. A. Mangels, and L. R. Squire. (1996). A neo- with remarkable fidelity. However, chronic low-dose treat- striatal habit learning system in humans. Science 273: 1399– ment of monkeys with this compound has been shown to 1402. Knowlton, B. J., L. R. Squire, J. S. Paulsen, N. R. Swerdlow, M. cause cognitive and visual deficits, without gross motor Swenson, and N. Butters. (1996). Dissociations within non- impairments (Schneider and Pope-Coleman 1995). declarative memory in Huntington’s disease. Neuropsychology Taken together, existing anatomical, physiological, and 10: 538–548. behavioral data suggest that the basal ganglia are not only Laplane, D., M. Levasseur, B. Pillon, B. Dubois, M. Baulac, B. involved in the control of movement, but also have the Mazoyer, S. Tran Dinh, G. Sette, F. Danze, and J. C. Baron. potential to influence diverse aspects of behavior. Future (1989). Obsessive-compulsive and other behavioural changes research will be needed to determine the full extent of the with bilateral basal ganglia lesions. Brain 112: 699–725. cerebral cortex influenced by basal ganglia output, the phys- McKee, A. C., D. N. Levine, N. W. Kowall, and E. P. Richardson. iological consequences of this influence, and the functional (1990). 
Peduncular hallucinosis associated with isolated infarc- operations performed by basal ganglia circuitry. tion of the substantia nigra pars reticulata. Annals of Neurology 27: 500–504. See also NEUROTRANSMITTERS Middleton, F. A., and P. L. Strick. (1994). Anatomical evidence for —Peter L. Strick and Frank Middleton cerebellar and basal ganglia involvement in higher cognitive function. Science 266: 458–461. References Middleton, F. A., and P. L. Strick. (1996a). The temporal lobe is a target of output from the basal ganglia. Proceedings of the Alexander, G. E., M. R. DeLong, and P. L. Strick. (1986). Parallel National Academy of Sciences, U.S.A. 93: 8683–8687. organization of functionally segregated circuits linking basal Middleton, F. A., and P.L. Strick. (1996b). Basal ganglia and cere- ganglia and cortex. Annual Review of Neuroscience 9: 357– bellar output influences non-motor function. Molecular Psychi- 381. atry 1: 429–433. Aosaki, T., H. Tsubokawa, A. Ishida, K. Watanabe, A. M. Gray- Mushiake, H., and P.L. Strick. (1995). Pallidal neuron activity dur- biel, and M. Kimura. (1994). Responses of tonically active neu- ing sequential arm movements. Journal of Neurophysiology 74: rons in the primate’s striatum undergo systematic changes 2754–2758. during behavioral sensorimotor conditioning. Journal of Neuro- Parent, A., and L.-N. Hazrati. (1995). Functional anatomy of the science 14: 3969-3984. basal ganglia: 1. The cortico-basal ganglia-thalamo-cortical Brown, V. J., R. Desimone, and M. Mishkin. (1996). Responses of loop. Brain Research Reviews 20: 91–127. cells in the tail of the caudate nucleus during visual discrimina- Ragsdale, C. W., and A. M. Graybiel. (1981). The fronto-striatal tion learning. Journal of Neurophysiology 74:1083–1094. projection in the cat and monkey and its relationship to inho- Castellanos, F. X., J. N. Giedd, W. L. Marsh, S. D. Hamburger, A. mogeneities established by acetylcholinesterase histochemistry. C. Vaituzis, D. P. Dickstein, S. 
E. Sarfatti, Y. C Vauss, J. W. Brain Research 208: 259–266. Snell, N. Lange, D. Kaysen, A. L. Krain, G. F. Ritchie, J. C. Ringer, S. (1879). Notes of a postmortem examination on a case of Rajapakse, and J. L. Rapoport. (1996). Quantitative brain mag- athetosis. Practitioner 23: 161. netic resonance imaging in attention deficit hyperactivity disor- Schneider, J. S., and A. Pope-Coleman. (1995). Cognitive deficits der. Archives of General Psychiatry 53: 607–616. precede motor deficits in a slowly progressing model of parkin- Heindel, W. C., D. P. Salmon, C. W. Shults, P. A. Walicke, and N. sonism in the monkey. Neurodegeneration 4: 245–255. Butters. (1989). Neuropsychological evidence for multiple Schultz, W., P. Dayan, and P. R. Montague. (1997). A neural sub- implicit memory systems: A comparison of Alzheimer’s, Hun- strate of prediction and reward. Science 275: 1593–1599. tington’s, and Parkinson’s disease patients. Journal of Neuro- science 9: 582–587. Further Readings Hikosaka, O., and R. H. Wurtz. (1983). Visual and oculomotor functions of monkey substantia nigra pars reticulata: 1. Rela- Albin, R. L., A. B. Young, and J. B. Penney. (1989). The functional tion of visual and auditory responses to saccades. Journal of anatomy of basal ganglia disorders. Trends in Neuroscience 12: Neurophysiology 49: 1230–1253. 355-375. Hoover, J. E., and P. L. Strick. (1993). Multiple output channels in Brown, L. L., J. S. Schneider, and T.I. Lidsky. (1997). Sensory and the basal ganglia. Science 259: 819–821. cognitive functions of the basal ganglia. Current Opinion in Jacobs, D. H., J. Shuren, and K. M. Heilman. (1995a). Impaired Neurobiology 7: 157–163. perception of facial identity and facial affect in Huntington’s Carpenter, M. B., K. Nakano, and R. Kim. (1976). Nigrothalamic disease. Neurology 45: 1217–1218. projections in the monkey demonstrated by autoradiographic Jacobs, D. H., J. Shuren, and K. M. Heilman. (1995b). Emotional technics. 
Journal of Comparative Neurology 165: 401–416. facial imagery, perception, and expression in Parkinson’s dis- Cummings, J. L. (1993). Frontal-subcortical circuits and human ease. Neurology 45: 1696–1702. behavior. Archives of Neurology 50: 873–880. Jenkins, I. H., D. J. Brooks, P. D. Nixon, R. S. Frackowiak, and R. DeLong, M. R. (1990). Primate models of movement disorders. E. Passingham. (1994). Motor sequence learning: A study with Trends in Neuroscience 13: 281–285. 70 Bayesian Learning Bayesian learning has two distinct advantages over clas- DeVito, J. L., and M. E. Anderson. (1982). An autoradiographic study of efferent connections of the globus pallidus in Macaca sical learning. First, it combines prior knowledge and data, mulatta. Experimental Brain Research 46: 107–117. as opposed to classical learning, which does not explicitly Divac, I., H. E. Rosvold, and M. K. Swarcbart. (1967). Behavioral incorporate user or prior knowledge, so that ad hoc methods effects of selective ablation of the caudate nucleus. Journal of are needed to combine knowledge and data (see also Comparative and Physiological Psychology 63: 184–190. MACHINE LEARNING). Second, Bayesian learning methods Dubois, B., and B. Pillon. (1997). Cognitive deficits in Parkinson’s have a built-in Occam’s razor—there is no need to introduce disease. Journal of Neurology 244: 2–8. external methods to avoid overfitting (see Heckerman Eblen, F., and A. M. Graybiel. (1995). Highly restricted origin of 1995). prefrontal cortical inputs to striosomes in the macaque monkey. To illustrate with an example taken from Howard 1970, Journal of Neuroscience 15: 5999–6013. consider a common thumbtack—one with a round, flat head Gerfen, C. R. (1984). The neostriatal mosaic: Compartmentaliza- that can be found in most supermarkets. If we toss the tion of corticostriatal input and striatonigral output systems. Nature 311: 461–464. thumbtack into the air, it will come to rest either on its point Goldman, P. S., and W. 
J. H. Nauta. (1977). An intricately pat- (heads) or on its head (tails). Suppose we flip the thumbtack terned prefronto-caudate projection in the rhesus monkey. Jour- N + 1 times, making sure that the physical properties of the nal of Comparative Neurology 171: 369–386. thumbtack and the conditions under which it is flipped Graybiel, A. M. (1995). Building action repertoires: Memory and remain stable over time. From the first N observations, we learning functions of the basal ganglia. Current Opinion in want to determine the probability of heads on the (N + 1)th Neurobiology 5: 733–741. toss. Ilinsky, I. A., M. Jouandet, and P. S. Goldman-Rakic. (1985). In a classical analysis of this problem, we assert that there Organization of the nigrothalamocortical system in the rhesus is some true probability of heads, which is unknown. We monkey. Journal of Comparative Neurology 236: 315–330. estimate this true probability from the N observations using Kim, R., K. Nakano, A. Jayarman, and M. B. Carpenter. (1976). Projections of the globus pallidus and adjacent structures: An criteria such as low bias and low variance. We then use this autoradiographic study in the monkey. Journal of Comparative estimate as our probability for heads on the (N + 1)th toss. In Neurology 169: 263–290. the Bayesian approach, we also assert that there is some true Lawrence, A. D., B. J. Sahakian, J. R. Hodges, A. E. Rosser, K. W. probability of heads, but we encode our uncertainty about Lange, and T. W. Robbins. (1996). Executive and mnemonic this true probability using the rules of probability to compute functions in early Huntington’s disease. Brain 119: 1633–1645. our probability for heads on the (N + 1)th toss. Nauta, W. J. H., and W. R. Mehler. (1966). Projections of the lenti- To undertake a Bayesian analysis of this problem, we form nucleus in the monkey. Brain Research 1: 3–42. need some notation. We denote a variable by an uppercase Percheron, G., C. Francois, B. Talbi, J. Yelnik, and G. 
Fenelon. letter (e.g., X, Xi, Θ), and the state or value of a correspond- (1996). The primate motor thalamus. Brain Research Reviews ing variable by that same letter in lowercase (e.g., x, xi, θ). 22: 93–181. Pillon, B., S. Ertle, B. Deweer, M. Sarazin, Y. Agid, and B. Dubois. We denote a set of variables by a boldface uppercase letter (e.g., X, Xi, Θ). We use a corresponding boldface lowercase (1996). Memory for spatial location is affected in Parkinson’s letter (e.g., x, xi, θ) to denote an assignment of state or value disease. Neuropsychologica 34: 77–85. Saint-Cyr, J. A., L. G. Ungerleider, and R. Desimone. (1990). to each variable in a given set. We use p(X = x |ξ), or p(x |ξ) Organization of visual cortical inputs to the striatum and subse- as a shorthand, to denote the probability that X = x of a per- quent outputs to the pallido-nigral complex in the monkey. son with state of information ξ. We also use p(x |ξ) to denote Journal of Comparative Neurology 298: 129–156. the probability distribution for X (both mass functions and Salmon, D. P., and N. Butters. (1995). Neurobiology of skill and density functions). Whether p(x |ξ) refers to a probability, a habit learning. Current Opinion in Neurobiology 5: 184–190. probability density, or a probability distribution will be clear Schultz, W. (1997). Dopamine neurons and their role in reward from context. mechanisms. Current Opinion in Neurobiology 7: 191–197. Returning to the thumbtack problem, we define Θ to be a Selemon, L. D., and P. S. Goldman-Rakic. (1985). Longitudinal variable whose values θ correspond to the possible true val- topography and interdigitation of corticostriatal projections in the rhesus monkey. Journal of Neuroscience 5: 776–794. ues of the probability of heads. We express the uncertainty Strub, R. L. (1989). Frontal lobe syndrome in a patient with bilateral about Θ using the probability density function p(θ | ξ). In globus pallidus lesions. Archives of Neurology 46: 1024–1027. 
Taylor, A. E., and J. A. Saint-Cyr. (1995). The neuropsychology of Parkinson's disease. Brain and Cognition 23: 281–296.
Wise, S. P., E. A. Murray, and C. R. Gerfen. (1996). The frontal cortex-basal ganglia system in primates. Critical Reviews in Neurobiology 10: 317–356.

Bayesian Learning

The Bayesian approach views all model learning, whether of parameters, structure, or both, as the reduction of a user's UNCERTAINTY about the model given data. Furthermore, it encodes all uncertainty about model parameters and structure as probabilities.

addition, we use $X_l$ to denote the variable representing the outcome of the $l$th flip, $l = 1, \ldots, N + 1$, and $D = \{X_1 = x_1, \ldots, X_N = x_N\}$ to denote the set of our observations. Thus, in Bayesian terms, the thumbtack problem reduces to computing $p(x_{N+1} \mid D, \xi)$ from $p(\theta \mid \xi)$.

To do so, we first use Bayes's rule to obtain the probability distribution for $\Theta$ given $D$ and background knowledge $\xi$:

$$p(\theta \mid D, \xi) = \frac{p(\theta \mid \xi)\, p(D \mid \theta, \xi)}{p(D \mid \xi)} \tag{1}$$

where

$$p(D \mid \xi) = \int p(D \mid \theta, \xi)\, p(\theta \mid \xi)\, d\theta \tag{2}$$

Next, we expand the likelihood $p(D \mid \theta, \xi)$. Both Bayesians and classical statisticians agree on this term. In particular, given the value of $\Theta$, the observations in $D$ are mutually independent, and the probability of heads (tails) on any one observation is $\theta$ ($1 - \theta$). Consequently, equation 1 becomes

$$p(\theta \mid D, \xi) = \frac{p(\theta \mid \xi)\, \theta^{h} (1 - \theta)^{t}}{p(D \mid \xi)} \tag{3}$$

where $h$ and $t$ are the number of heads and tails observed in $D$, respectively. The observations $D$ represent a random sample from the binomial distribution parameterized by $\theta$; the probability distributions $p(\theta \mid \xi)$ and $p(\theta \mid D, \xi)$ are commonly referred to as the "prior" and "posterior" for $\Theta$, respectively; and the quantities $h$ and $t$ are said to be "sufficient statistics" for binomial sampling because they summarize the data sufficiently to compute the posterior from the prior.

To determine the probability that the $(N+1)$th toss of the thumbtack will come up heads, we use the expansion rule of probability to average over the possible values of $\Theta$:

$$p(X_{N+1} = \text{heads} \mid D, \xi) = \int p(X_{N+1} = \text{heads} \mid \theta, \xi)\, p(\theta \mid D, \xi)\, d\theta = \int \theta\, p(\theta \mid D, \xi)\, d\theta \equiv E_{p(\theta \mid D, \xi)}(\theta) \tag{4}$$

where $E_{p(\theta \mid D, \xi)}(\theta)$ denotes the expectation of $\theta$ with respect to the distribution $p(\theta \mid D, \xi)$.

To complete the Bayesian analysis for this example, we need a method to assess the prior distribution for $\Theta$. A common approach, usually adopted for convenience, is to assume that this distribution is a beta distribution:

$$p(\theta \mid \xi) = \mathrm{Beta}(\theta \mid \alpha_h, \alpha_t) \equiv \frac{\Gamma(\alpha)}{\Gamma(\alpha_h)\, \Gamma(\alpha_t)}\, \theta^{\alpha_h - 1} (1 - \theta)^{\alpha_t - 1} \tag{5}$$

where $\alpha_h > 0$ and $\alpha_t > 0$ are the parameters of the beta distribution, $\alpha = \alpha_h + \alpha_t$, and $\Gamma(\cdot)$ is the gamma function that satisfies $\Gamma(x + 1) = x\Gamma(x)$ and $\Gamma(1) = 1$. The quantities $\alpha_h$ and $\alpha_t$ are often referred to as "hyperparameters" to distinguish them from the parameter $\theta$. The hyperparameters $\alpha_h$ and $\alpha_t$ must be greater than zero so that the distribution can be normalized. Examples of beta distributions are shown in figure 1.

The beta prior is convenient for several reasons. By equation 3, the posterior distribution will also be a beta distribution:

$$p(\theta \mid D, \xi) = \frac{\Gamma(\alpha + N)}{\Gamma(\alpha_h + h)\, \Gamma(\alpha_t + t)}\, \theta^{\alpha_h + h - 1} (1 - \theta)^{\alpha_t + t - 1} = \mathrm{Beta}(\theta \mid \alpha_h + h, \alpha_t + t) \tag{6}$$

We term the set of beta distributions a conjugate family of distributions for binomial sampling. Also, the expectation of $\theta$ with respect to this distribution has a simple form:

$$\int \theta\, \mathrm{Beta}(\theta \mid \alpha_h, \alpha_t)\, d\theta = \frac{\alpha_h}{\alpha} \tag{7}$$

Hence, given a beta prior, we have a simple expression for the probability of heads in the $(N + 1)$th toss:

$$p(X_{N+1} = \text{heads} \mid D, \xi) = \frac{\alpha_h + h}{\alpha + N} \tag{8}$$

Assuming $p(\theta \mid \xi)$ is a beta distribution, it can be assessed in a number of ways. For example, we can assess our probability for heads in the first toss of the thumbtack (e.g., using a probability wheel). Next, we can imagine having seen the outcomes of $k$ flips, and reassess our probability for heads in the next toss. From equation 8, we have, for $k = 1$,

$$p(X_1 = \text{heads} \mid \xi) = \frac{\alpha_h}{\alpha_h + \alpha_t}, \qquad p(X_2 = \text{heads} \mid X_1 = \text{heads}, \xi) = \frac{\alpha_h + 1}{\alpha_h + \alpha_t + 1}$$

Given these probabilities, we can solve for $\alpha_h$ and $\alpha_t$. This assessment technique is known as "the method of imagined future data." Other techniques for assessing beta distributions are discussed by Winkler (1967).

Although the beta prior is convenient, it is not accurate for some problems. For example, suppose we think that the thumbtack may have been purchased at a magic shop. In this case, a more appropriate prior may be a mixture of beta distributions, for example,

$$p(\theta \mid \xi) = 0.4\, \mathrm{Beta}(20, 1) + 0.4\, \mathrm{Beta}(1, 20) + 0.2\, \mathrm{Beta}(2, 2)$$

where 0.4 is our probability that the thumbtack is heavily weighted toward heads (tails). In effect, we have introduced an additional hidden or unobserved variable $H$, whose states correspond to the three possibilities: (1) thumbtack is biased toward heads, (2) thumbtack is biased toward tails, or (3) thumbtack is normal; and we have asserted that $\theta$ conditioned on each state of $H$ is a beta distribution. In general, there are simple methods (e.g., the method of imagined future data) for determining whether or not a beta prior is an accurate reflection of one's beliefs. In those cases where the beta prior is inaccurate, an accurate prior can often be assessed by introducing additional hidden variables, as in this example.

So far, we have only considered observations drawn from a binomial distribution. To be more general, suppose our problem domain consists of variables $X = (X_1, \ldots, X_n)$.
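As a brief aside on the binomial case treated so far, the conjugate update can be made concrete in code. The sketch below is illustrative only: the function names (`beta_posterior`, `prob_next_heads`, `solve_hyperparameters`) are hypothetical and do not come from the source, and the use of exact rational arithmetic through Python's `fractions` module is an assumed convenience. It follows equations 6 and 8, and inverts the two assessments of the method of imagined future data (for k = 1) to recover the hyperparameters.

```python
from fractions import Fraction


def beta_posterior(alpha_h, alpha_t, heads, tails):
    """Equation 6: Beta(alpha_h, alpha_t) prior -> Beta(alpha_h + h, alpha_t + t)."""
    if alpha_h <= 0 or alpha_t <= 0:
        raise ValueError("hyperparameters must be greater than zero")
    return alpha_h + heads, alpha_t + tails


def prob_next_heads(alpha_h, alpha_t, heads=0, tails=0):
    """Equation 8: p(X_{N+1} = heads | D) = (alpha_h + h) / (alpha + N)."""
    post_h, post_t = beta_posterior(alpha_h, alpha_t, heads, tails)
    return Fraction(post_h) / Fraction(post_h + post_t)


def solve_hyperparameters(p_first, p_second):
    """Method of imagined future data with k = 1 (hypothetical helper).

    p_first  = p(X_1 = heads)               = alpha_h / alpha
    p_second = p(X_2 = heads | X_1 = heads) = (alpha_h + 1) / (alpha + 1)
    Eliminating alpha_h gives alpha = (1 - p_second) / (p_second - p_first).
    """
    p_first, p_second = Fraction(p_first), Fraction(p_second)
    if not (0 < p_first < p_second < 1):
        raise ValueError("need 0 < p_first < p_second < 1 for a valid beta prior")
    alpha = (1 - p_second) / (p_second - p_first)
    return p_first * alpha, (1 - p_first) * alpha


if __name__ == "__main__":
    # Beta(2, 2) prior, then 7 heads and 3 tails observed.
    print(beta_posterior(2, 2, 7, 3))
    print(prob_next_heads(2, 2, 7, 3))
    # Assessed probabilities 1/2 (first toss) and 3/5 (second toss after a head).
    print(solve_hyperparameters(Fraction(1, 2), Fraction(3, 5)))
```

For instance, a Beta(2, 2) prior followed by 7 heads and 3 tails yields a Beta(9, 5) posterior and a probability of 9/14 for heads on the next toss; assessed probabilities of 1/2 and 3/5 recover the hyperparameters (2, 2).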
In addition, suppose that we have some data $D = (x_1, \ldots, x_N)$, which represent a random sample from some unknown (true) probability distribution for $X$. We assume that the unknown probability distribution can be encoded by some statistical model with structure $m$ and parameters $\theta_m$. Uncertain about the structure and parameters of the model, we take the Bayesian approach: we encode this uncertainty using probability. In particular, we define a discrete variable $M$, whose states $m$ correspond to the possible true models, and encode our uncertainty about $M$ with the probability distribution $p(m \mid \xi)$. In addition, for each model structure $m$, we define a continuous vector-valued variable $\Theta_m$, whose configurations $\theta_m$ correspond to the possible true parameters. We encode our uncertainty about $\Theta_m$ using the probability density function $p(\theta_m \mid m, \xi)$.

Figure 1. Several beta distributions.

Given random sample $D$, we compute the posterior distributions for each $m$ and $\theta_m$ using Bayes's rule:

$$p(m \mid D, \xi) = \frac{p(m \mid \xi)\, p(D \mid m, \xi)}{\sum_{m'} p(m' \mid \xi)\, p(D \mid m', \xi)} \tag{9}$$

$$p(\theta_m \mid D, m, \xi) = \frac{p(\theta_m \mid m, \xi)\, p(D \mid \theta_m, m, \xi)}{p(D \mid m, \xi)} \tag{10}$$

where

$$p(D \mid m, \xi) = \int p(D \mid \theta_m, m, \xi)\, p(\theta_m \mid m, \xi)\, d\theta_m \tag{11}$$

is the marginal likelihood. Given some hypothesis of interest, $h$, we determine the probability that $h$ is true given data $D$ by averaging over all possible models and their parameters according to the rules of probability:

$$p(h \mid D, \xi) = \sum_m p(m \mid D, \xi)\, p(h \mid D, m, \xi) \tag{12}$$

$$p(h \mid D, m, \xi) = \int p(h \mid \theta_m, m, \xi)\, p(\theta_m \mid D, m, \xi)\, d\theta_m \tag{13}$$

For example, $h$ may be the event that the next observation is $x_{N+1}$. In this situation, we obtain

$$p(x_{N+1} \mid D, \xi) = \sum_m p(m \mid D, \xi) \int p(x_{N+1} \mid \theta_m, m, \xi)\, p(\theta_m \mid D, m, \xi)\, d\theta_m \tag{14}$$

where $p(x_{N+1} \mid \theta_m, m, \xi)$ is the likelihood for the model. This approach is often referred to as "Bayesian model averaging." Note that no single model structure is learned. Instead, all possible models are weighted by their posterior probability.

Under certain conditions, the parameter posterior and marginal likelihood can be computed efficiently and in closed form. For example, such computation is possible when the likelihood is given by a BAYESIAN NETWORK (e.g., Heckerman 1998) and several other conditions are met. See Bernardo and Smith 1994, Lauritzen 1996, and Heckerman 1998 for a discussion.

When many model structures are possible, the sums in equations 9 and 12 can be intractable. In such situations, we can search for one or more model structures with large posterior probabilities, and use these models as if they were exhaustive, an approach known as "model selection." Examples of search methods applied to Bayesian networks are given by Heckerman, Geiger, and Chickering (1995) and Madigan et al. (1996).

See also HIDDEN MARKOV MODELS; PROBABILITY, FOUNDATIONS OF; PROBABILISTIC REASONING

– David Heckerman

References

Bernardo, J., and A. Smith. (1994). Bayesian Theory. New York: Wiley.
Heckerman, D. (1998). A tutorial on learning with Bayesian networks. In M. Jordan, Ed., Learning in Graphical Models. Kluwer, pp. 301–354.
Heckerman, D., D. Geiger, and D. Chickering. (1995). Learning Bayesian networks: The combination of knowledge and statistical data. Machine Learning 20: 197–243.
Howard, R. (1970). Decision analysis: Perspectives on inference, decision, and experimentation. Proceedings of the IEEE 58: 632–643.
Lauritzen, S. (1996). Graphical Models. Oxford: Clarendon Press.
Madigan, D., A. Raftery, C. Volinsky, and J. Hoeting. (1996). Bayesian model averaging. Proceedings of the AAAI Workshop on Integrating Multiple Learned Models. Portland, OR.
Winkler, R. (1967). The assessment of prior distributions in Bayesian analysis. American Statistical Association Journal 62: 776–800.

Bayesian Networks

Bayesian networks were conceptualized in the late 1970s to model distributed processing in READING comprehension, where both semantical expectations and perceptual evidence must be combined to form a coherent interpretation. The ability to coordinate bidirectional inferences filled a void in EXPERT SYSTEMS technology of the early 1980s, and Bayesian networks have emerged as a general representation scheme for uncertain knowledge (Pearl 1988; Shafer and Pearl 1990; Heckerman, Mamdani, and Wellman 1995; Jensen 1996; Castillo, Gutierrez, and Hadi 1997).

Bayesian networks are directed acyclic graphs (DAGs) in which the nodes represent variables of interest (e.g., the temperature of a device, the gender of a patient, a feature of an object, the occurrence of an event) and the links represent informational or causal dependencies among the variables. The strength of a dependency is represented by conditional probabilities that are attached to each cluster of parent-child nodes in the network.

Figure 1 describes a simple yet typical Bayesian network: the causal relationships between the season of the year ($X_1$), whether rain falls ($X_2$) during the season, whether the sprinkler is on ($X_3$) during that season, whether the pavement would get wet ($X_4$), and whether the pavement would be slippery ($X_5$). Here the absence of a direct link between $X_1$ and $X_5$, for example, captures our understanding that the influence of seasonal variations on the slipperiness of the pavement is mediated by other conditions (e.g., wetness).

Figure 1. A Bayesian network representing causal influences among five variables.

As this example illustrates, a Bayesian network constitutes a model of the environment rather than, as in many other KNOWLEDGE REPRESENTATION schemes (e.g., rule-based systems and NEURAL NETWORKS), a model of the reasoning process. It simulates, in fact, the mechanisms that operate in the environment, and thus facilitates diverse modes of reasoning, including prediction, abduction, and control.

Prediction and abduction require an economical representation of a joint distribution over the variables involved. Bayesian networks achieve such economy by specifying, for each variable $X_i$, the conditional probabilities $P(x_i \mid pa_i)$ where $pa_i$ is a set of predecessors (of $X_i$) that render $X_i$ independent of all its other predecessors. Variables judged to be the direct causes of $X_i$ satisfy this property, and these are depicted as the parents of $X_i$ in the graph. Given this specification, the joint distribution is given by the product

$$P(x_1, \ldots, x_n) = \prod_i P(x_i \mid pa_i) \tag{1}$$

from which all probabilistic queries (e.g., Find the most likely explanation for the evidence) can be answered coherently using probability calculus.

The first algorithms proposed for probabilistic calculations in Bayesian networks used message-passing architecture and were limited to trees (Pearl 1982; Kim and Pearl 1983). Each variable was assigned a simple processor and permitted to pass messages asynchronously with its neighbors until equilibrium was achieved. Techniques have since been developed to extend this tree propagation method to general networks. Among the most popular are Lauritzen and Spiegelhalter's (1988) method of join-tree propagation, and the method of cycle-cutset conditioning (see Pearl 1988, 204–210; Jensen 1996).

While inference in general networks is NP-hard, the COMPUTATIONAL COMPLEXITY for each of the methods cited above can be estimated prior to actual processing. When the estimates exceed reasonable bounds, an approximation method such as stochastic simulation (Pearl 1987; 1988, 210–223) can be used instead. Learning techniques have also been developed for systematically updating the conditional probabilities $P(x_i \mid pa_i)$ and the structure of the network, so as to match empirical data (see Spiegelhalter and Lauritzen 1990; Cooper and Herskovits 1990).

The most distinctive feature of Bayesian networks, stemming largely from their causal organization, is their ability to represent and respond to changing configurations. Any local reconfiguration of the mechanisms in the environment can be translated, with only minor modification, into an isomorphic reconfiguration of the network topology. For example, to represent a disabled sprinkler, we simply delete from the network all links incident to the node "Sprinkler." To represent the policy of turning the sprinkler off when it rains, we simply add a link between "Rain" and "Sprinkler" and revise $P(x_3 \mid x_1, x_2)$. This flexibility is often cited as the ingredient that marks the division between deliberative and reactive agents, and that enables deliberative agents to manage novel situations instantaneously, without requiring retraining or adaptation.

Organizing one's knowledge around stable causal mechanisms provides a basis for planning under UNCERTAINTY (Pearl 1996). Once we know the identity of the mechanism altered by the intervention and the nature of the alteration, the overall effect of an intervention can be predicted by modifying the corresponding factors in equation 1 and using the modified product to compute a new probability function. For example, to represent the action "turning the sprinkler ON" in the network of figure 1, we delete the link $X_1 \to X_3$ and fix the value of $X_3$ to ON. The resulting joint distribution on the remaining variables will be

$$P(x_1, x_2, x_4, x_5) = P(x_1)\, P(x_2 \mid x_1)\, P(x_4 \mid x_2, X_3 = \text{ON})\, P(x_5 \mid x_4) \tag{2}$$

Note the difference between the observation $X_3 = \text{ON}$, encoded by ordinary Bayesian conditioning, and the action $do(X_3 = \text{ON})$, encoded by conditioning a mutilated graph, with the link $X_1 \to X_3$ removed. This indeed mirrors the difference between seeing and doing: after observing that the sprinkler is ON, we wish to infer that the season is dry, that it probably did not rain, and so on; no such inferences should be drawn in evaluating the effects of the contemplated action "turning the sprinkler ON."

One of the most exciting prospects in recent years has been the possibility of using Bayesian networks to discover causal structures in raw statistical data (Pearl and Verma 1991; Spirtes, Glymour, and Scheines 1993). Although any inference from association to CAUSATION is bound to be less reliable than one based on controlled experiment, we can still guarantee an aspect of reliability called "stability": Any alternative structure compatible with the data must be less stable than the structure inferred, which is to say, slight fluctuations in parameters will render that structure incompatible with the data. With this form of guarantee, the theory provides criteria for identifying genuine and spurious causes, with or without temporal information, and yields algorithms for recovering causal structures with hidden variables from empirical data.

In mundane decision making, beliefs are revised not by adjusting numerical probabilities but by tentatively accepting some sentences as "true for all practical purposes." Such sentences, called "plain beliefs," exhibit both logical and probabilistic character. As in classical LOGIC, they are propositional and deductively closed; as in probability, they are subject to retraction and to varying degrees of entrenchment. Bayesian networks can be adopted to model the dynamics of plain beliefs by replacing ordinary probabilities with nonstandard probabilities, that is, probabilities that are infinitesimally close to either zero or one (Goldszmidt and Pearl 1996).

Although Bayesian networks can model a wide spectrum of cognitive activity, their greatest strength is in CAUSAL REASONING, which in turn facilitates reasoning about actions, explanations, counterfactuals, and preferences. Such capabilities are not easily implemented in neural networks, whose strengths lie in quick adaptation of simple motor-visual functions.

Some questions arise: Does an architecture resembling that of Bayesian networks exist anywhere in the human brain? If not, how does the brain perform those cognitive functions at which Bayesian networks excel? A plausible answer to the second question is that fragmented structures of causal organizations are constantly being assembled on the fly, as needed, from a stock of functional building blocks. For example, the network of figure 1 may be assembled from several neural networks, one specializing in the experience surrounding seasons and rains, another in the properties of wet pavements, and so forth. Such specialized networks are probably stored permanently in some mental library, from which they are drawn and assembled into the structure shown in figure 1 only when a specific problem presents itself, for example, to determine whether an operating sprinkler could explain why a certain person slipped and broke a leg in the middle of a dry season.

Thus Bayesian networks are particularly useful in studying higher cognitive functions, where the problem of organizing and supervising large assemblies of specialized neural networks becomes important.

See also BAYESIAN LEARNING; PROBABILISTIC REASONING; PROBABILITY, FOUNDATIONS OF

– Judea Pearl

References

Castillo, E., J. M. Gutierrez, and A. S. Hadi. (1997). Expert Systems and Probabilistic Network Models. New York: Springer.
Cooper, G. F., and E. Herskovits. (1990). A Bayesian method for constructing Bayesian belief networks from databases. Proceedings of the Conference on Uncertainty in AI, pp. 86–94.
Goldszmidt, M., and J. Pearl. (1996). Qualitative probabilities for default reasoning, belief revision, and causal modeling. Artificial Intelligence 84(1–2): 57–112.
Heckerman, D., A. Mamdani, and M. P. Wellman, Guest Eds. (1995). Real-world applications of Bayesian networks. Communications of the ACM 38(3): 24–68.
Jensen, F. V. (1996). An Introduction to Bayesian Networks. New York: Springer.
Kim, J. H., and J. Pearl. (1983). A computational model for combined causal and diagnostic reasoning in inference systems. Proceedings of IJCAI-83, pp. 190–193. Karlsruhe, Germany.
Lauritzen, S. L., and D. J. Spiegelhalter. (1988). Local computations with probabilities on graphical structures and their application to expert systems. Journal of the Royal Statistical Society Series B 50(2): 157–224.
Pearl, J. (1982). Reverend Bayes on inference engines: A distributed hierarchical approach. Proceedings of the AAAI National Conference on AI, pp. 133–136. Pittsburgh.
Pearl, J. (1987). Evidential reasoning using stochastic simulation of causal models. Artificial Intelligence 32(2): 245–258.
Pearl, J. (1988). Probabilistic Reasoning in Intelligent Systems. San Mateo, CA: Morgan Kaufmann.
Pearl, J. (1996). Causation, action, and counterfactuals. In Y. Shoham, Ed., Theoretical Aspects of Rationality and Knowledge: Proceedings of the Sixth Conference. San Francisco: Morgan Kaufmann, pp. 51–73.
Pearl, J., and T. Verma. (1991). A theory of inferred causation. In J. A. Allen, R. Fikes, and E. Sandewall, Eds., Principles of Knowledge Representation and Reasoning: Proceedings of the Second International Conference. San Mateo, CA: Morgan Kaufmann, pp. 441–452.
Shafer, G., and J. Pearl, Eds. (1990). Readings in Uncertain Reasoning. San Mateo, CA: Morgan Kaufmann.
Spirtes, P., C. Glymour, and R. Scheines. (1993). Causation, Prediction, and Search. New York: Springer.
Spiegelhalter, D. J., and S. L. Lauritzen. (1990). Sequential updating of conditional probabilities on directed graphical structures. Networks: An International Journal 20(5): 579–605.

Behavior-Based Robotics

Behavior-based robotics (BBR) bridges the fields of artificial intelligence, engineering, and cognitive science. The behavior-based approach is a methodology for designing autonomous agents and robots; it is a type of INTELLIGENT AGENT ARCHITECTURE. Architectures supply structure and impose constraints on the way robot control problems are solved. The behavior-based methodology imposes a general, biologically inspired, bottom-up philosophy, allowing for a certain freedom of interpretation. Its goal is to develop methods for controlling artificial systems (usually physical robots, but also simulated robots and other autonomous software agents) and to use robotics to model and better understand biological systems (usually animals, ranging from insects to humans).

Behavior-based robotics controllers consist of a collection of behaviors that achieve and/or maintain goals. For example, "avoid-obstacles" maintains the goal of preventing collisions; "go-home" achieves the goal of reaching some home destination. Behaviors are implemented as control laws (sometimes similar to those used in CONTROL THEORY), either in software or hardware, as a processing element or a procedure. Each behavior can take inputs from the robot's sensors (e.g., camera, ultrasound, infrared, tactile) and/or from other behaviors in the system, and send outputs to the robot's effectors (e.g., wheels, grippers, arm, speech) and/or to other behaviors. Thus, a behavior-based controller is a structured network of interacting behaviors.

BBR is founded on subsumption architecture (Brooks 1986) and other work in reactive robotics (RR). RR achieves rapid real-time responses by embedding the robot's controller in a collection of preprogrammed, concurrent condition-action rules with minimal internal state (e.g., "if bumped, stop," "if stopped, back up"; Brooks and Connell 1986; Agre and Chapman 1987). Subsumption architecture provides a layered approach to assembling reactive rules into complete control systems from the bottom up. Rules, and layers of rules, are added incrementally; lower layers can function independently of the higher ones, and higher ones utilize the outputs of the lower ones, but do not override them. For example, "avoid-collision" at the lowest level, and "move-to-light" at a higher level, when combined, result in a robust light-chasing behavior; the higher-level rule never overrides the lower-level one, thus guaranteeing collision avoidance.

While robust, such reactive systems are limited by their lack of internal state; they are incapable of using internal representations and learning new behaviors. Behavior-based systems overcome this limitation because their underlying unit of representation, behaviors, can store state. The way state is represented and distributed in BBR is one of the sources of its novelty. Information is not centralized or centrally manipulated; instead, various forms of distributed representations are used, ranging from static table structures and networks to active, procedural processes implemented within the behavior networks.

In contrast to RR and BBR, both of which are structured and developed bottom-up, PLANNING-based deliberative control systems are top-down, and require the agent/robot to perform a sequence of processing sense-plan-act steps (e.g., "combine the sensory data into a map of the world, then use the planner to find a path in the map, then send steps of the plan to the robot's wheels"; Giralt, Chatila, and Vaisset 1983; Moravec and Elfes 1985; Laird and Rosenbloom 1990). Hybrid systems attempt a compromise between bottom-up and top-down by employing a reactive system for low-level control and a planner for high-level decision making (Firby 1987; Georgeff and Lansky 1987; Arkin 1989; Payton 1990; Connell 1991). Often called "three-layer architectures," they separate the control system into three communicating but independent parts: (i) the planner, (ii) the reactive system, and (iii) the intermediate module, which reconciles the different time-scales and representations used by the other two and any conflicts between their outputs.

Behavior-based systems typically do not employ such a hierarchical division but are instead integrated through a homogeneous distributed representation. Like hybrid systems, they also provide both low-level control and high-level deliberation; the latter is performed by one or more distributed representations that compute over the other behaviors, often directly utilizing low-level behaviors and their outputs. The resulting system, built from the bottom up, does not divide into differently represented and independent components as in hybrid systems, but instead constitutes an integrated computational behavior network. The power, elegance, and complexity of behavior-based systems all stem from the ways their constituent behaviors are defined and used.

Consequently, the organizational methodology of behavior-based systems differs from other control methods in its approach to modularity, the way in which the system is organized and subdivided into modules. Behavior-based philosophy mandates that the behaviors be relatively simple, added to the system incrementally, and not executed in a serial fashion. Subsets of behaviors are executed concurrently so that the system can exploit parallelism, both in the speed of computation and in the resulting dynamics that arise within the system itself (from the interaction among the behaviors) and with the environment (from the interaction of the behaviors with the external world). Behaviors can be designed at a variety of abstraction levels. In general they are higher than the robot's atomic actions (i.e., typically above "go-forward-by-a-small-increment," "turn-by-a-small-angle"), and they extend in time and space. Some implemented behaviors include: "go-home," "find-object," "get-recharged," "avoid-the-light," "aggregate-with-group," "pick-up-object," "find-landmark," etc. Because behaviors can be defined at different levels of abstraction and can represent various types of information, they are difficult to define precisely, but are also a rich medium for innovative interpretations.

Deciding what behavior to execute at a particular point in time is called behavior arbitration, and is one of the central design challenges of BBR. For simplicity, most implemented systems use a built-in, fixed priority for behaviors. More flexible solutions, which can be less computationally efficient and harder to analyze, are commonly based on computing some function of the behavior activation levels, such as a voting or activation spreading scheme (Maes 1989; Payton et al. 1992). Behavior-based systems are typically designed so the effects of the behaviors largely interact in the environment rather than internally through the system, taking advantage of the richness of interaction dynamics by exploiting the properties of SITUATEDNESS/EMBEDDEDNESS. These dynamics are sometimes called emergent behaviors because they emerge from the interactions and are not internally specified by the robot's program. Therefore, the internal behavior structure of a behavior-based system need not necessarily mirror its externally manifested behavior. For example, a robot that flocks with other robots may not have a specific internal "flocking" behavior; instead, its interaction with the environment and other robots may result in flocking, although its only behaviors may be "avoid collisions," "stay close to the group," and "keep going" (Matarić 1997).

Behavior-based robots have demonstrated various standard robotic capabilities, including obstacle avoidance, navigation, terrain mapping, following, chasing/pursuit, object manipulation, task division and cooperation, and learning maps, navigation and walking. They have also demonstrated some novel applications like large-scale group behaviors, including flocking, foraging, and soccer playing, and modeling insect and even human behavior (Agha and Bekey 1997; Webb 1994; Asada et al. 1994; Brooks and Stein 1994). Application domains have included MOBILE ROBOTS, underwater vehicles, space robotics, as well as robots capable of MANIPULATION AND GRASPING, and some WALKING AND RUNNING MACHINES.

Variations and adaptations of MACHINE LEARNING, and in particular REINFORCEMENT LEARNING, have been effectively applied to behavior-based robots, which have demonstrated learning to walk (Maes and Brooks 1990), navigate (Matarić 1992; Millan 1994), communicate (Yanco and Stein 1993), divide tasks (Parker 1993; Matarić 1997), behave socially (Matarić 1994), and even identify opponents and score goals in robot soccer (Asada et al. 1994). Methods from ARTIFICIAL LIFE, EVOLUTIONARY COMPUTATION, GENETIC ALGORITHMS, FUZZY LOGIC, VISION AND LEARNING, MULTIAGENT SYSTEMS, and many others continue to be actively explored and applied to behavior-based robots as their role in animal modeling and practical applications continues to develop.

– Maja J. Matarić

References

Agha, A., and G. Bekey. (1997). Phylogenetic and ontogenetic learning in a colony of interacting robots. Autonomous Robots 4(1).
Agre, P., and D. Chapman. (1987). Pengi: an implementation of a theory of activity. Proceedings, Sixth National Conference of the American Association for Artificial Intelligence Conference. Seattle, WA, pp. 268–272.
Arkin, R. (1989). Towards the unification of navigational planning and reactive control. Proceedings, American Association for Artificial Intelligence Spring Symposium on Robot Navigation, Palo Alto, CA, pp. 1–5.
Asada, M., E. Uchibe, S. Noda, S. Tawaratsumida, and K. Hosoda. (1994). Coordination of multiple behaviors acquired by a vision-based reinforcement learning. Proceedings, IEEE/RSJ/GI International Conference on Intelligent Robots and Systems, Munich, Germany.
Brooks, R. (1986). A robust layered control system for a mobile robot. IEEE Journal of Robotics and Automation RA-2 (April), pp. 14–23.
Brooks, R., and J. Connell. (1986). Asynchronous distributed control system for a mobile robot. Proceedings, SPIE Intelligent Control and Adaptive Systems, Cambridge, MA, pp. 77–84.
Brooks, R., and L. Stein. (1994). Building brains for bodies. Autonomous Robots 1 (1): 7–25.
Connell, J. (1991). SSS: a hybrid architecture applied to robot navigation. Proceedings, International Conference on Robotics and Automation, Nice, France, pp. 2719–2724.
Firby, J. (1987). An investigation into reactive planning in complex domains. Proceedings, Sixth National Conference of the American Association for Artificial Intelligence Conference, Seattle, WA, pp. 202–206.
Georgeff, M., and A. Lansky. (1987). Reactive reasoning and planning. Proceedings, Sixth National Conference of the American Association for Artificial Intelligence Conference, Seattle, WA, pp. 677–682.
Giralt, G., R. Chatila, and M. Vaisset. (1983). An integrated navigation and motion control system for autonomous multisensory mobile robots. Proceedings, First International Symposium on Robotics Research, Cambridge, MA: MIT Press, pp. 191–214.
Laird, J., and P. Rosenbloom. (1990). An investigation into reactive planning in complex domains. Proceedings, Ninth National Conference of the American Association for Artificial Intelligence Conference, Cambridge, MA: MIT Press, pp. 1022–1029.
Maes, P. (1989). The dynamics of action selection. Proceedings, International Joint Conference on Artificial Intelligence, Detroit, MI, pp. 991–997.
Maes, P., and R. Brooks. (1990). Learning to coordinate behaviors. Proceedings, Ninth National Conference of the American Association for Artificial Intelligence Conference, Cambridge, MA: MIT Press, pp. 796–802.
Matarić, M. (1992). Integration of representation into goal-driven behavior-based robots. IEEE Transactions on Robotics and Automation 8 (3): 304–312.
Matarić, M. (1994). Learning to behave socially. In D. Cliff, P. Husbands, J-A. Meyer, and S. Wilson, Eds., Proceedings, From Animals to Animats 3, Third International Conference on Simulation of Adaptive Behavior. Cambridge, MA: MIT Press, pp. 453–462.
Matarić, M. (1997). Reinforcement learning in the multi-robot domain. Autonomous Robots 4 (1): 73–83.
Millan, J. (1994). Learning reactive sequences from basic reflexes. In D. Cliff, P. Husbands, J-A. Meyer, and S. Wilson, Eds., Proceedings, From Animals to Animats 3, Third International Conference on Simulation of Adaptive Behavior. Cambridge, MA: MIT Press, pp. 266–274.
Moravec, H., and A. Elfes. (1985). High resolution maps from wide angle sonar. Proceedings, IEEE International Conference on Robotics and Automation, St. Louis, MO.
Parker, L. (1993). Learning in cooperative robot teams. Proceedings, International Joint Conference on Artificial Intelligence, Workshop on Dynamically Interacting Robots, Chambery, France, pp. 12–23.
Payton, D. (1990). Internalized plans: a representation for action resources. In P. Maes, Ed., Robotics and Autonomous Systems, Special Issue on Designing Autonomous Agents: Theory and Practice from Biology to Engineering and Back 6 (1–2): 89–104.
Payton, D., D. Keirsey, D. Kimble, J. Krozel, and K. Rosenblatt. (1992). Do whatever works: a robust approach to fault-tolerant autonomous control. Journal of Applied Intelligence 3: 226–249.
Webb, B. (1994). Robotic experiments in cricket phonotaxis. Proceedings of the Third International Conference on the Simulation of Adaptive Behavior. Cambridge, MA: MIT Press.
Yanco, H., and L. Stein. (1993). An adaptive communication protocol for cooperating mobile robots. In D. Cliff, P. Husbands, J-A. Meyer, and S. Wilson, Eds., Proceedings, From Animals to Animats 3, Third International Conference on Simulation of Adaptive Behavior. Cambridge, MA: MIT Press, pp. 478–485.

Further Readings

Arkin, R. (1987). Motor schema based navigation for a mobile robot: an approach to programming by behavior. IEEE International Conference on Robotics and Automation. Raleigh, NC, pp. 264–271.
Arkin, R. (1990). Integrating behavioral, perceptual and world knowledge in reactive navigation. In P. Maes, Ed., Robotics and Autonomous Systems, Special Issue on Designing Autonomous Agents: Theory and Practice from Biology to Engineering and Back 6 (1–2): 105–122.
Asada, M., E. Uchibe, and K. Hosoda. (1995). Agents that learn from other competitive agents. Proceedings, Machine Learning Conference Workshop on Agents That Learn From Other Agents.
Beer, R., H. Chiel, and L. Sterling. (1990). A biological perspective on autonomous agent design. In P. Maes, Ed., Robotics and Autonomous Systems, Special Issue on Designing Autonomous Agents: Theory and Practice from Biology to Engineering and Back 6 (1–2): 169–186.
Brooks, R. (1990). Elephants don't play chess. In P. Maes, Ed., Robotics and Autonomous Systems, Special Issue on Designing Autonomous Agents: Theory and Practice from Biology to Engineering and Back 6 (1–2): 3–16.
Brooks, R. (1991a). Intelligence without representation. Artificial Intelligence 47: 139–160.
Brooks, R. (1991b). Intelligence without reason. Proceedings, International Joint Conference on Artificial Intelligence, Sydney, Australia, Cambridge, MA: MIT Press.
Connell, J. (1990). Minimalist Mobile Robotics: A Colony Architecture for an Artificial Creature. Boston: Academic Press.
Connell, J., and S. Mahadevan. (1993). Robot Learning. Kluwer Academic Publishers.
Floreano, D., and F. Mondada. (1996). Evolution of homing navigation in a real mobile robot. IEEE Transactions on Systems, Man, and Cybernetics. Los Alamitos, CA: IEEE Press.
Grefenstette, J., and A. Schultz. (1994). An evolutionary approach to learning in robots. Proceedings, Machine Learning Workshop on Robot Learning. New Brunswick, NJ.
Jones, J., and A. Flynn. (1993). Mobile Robots, Inspiration to Implementation. Wellesley, MA: A. K. Peters, Ltd.

Behaviorism 77

when it "ascertained the empirical correlation of the various sorts of thought or feeling with definite conditions of the brain" (1890: vi). His own primary interest in conscious mental experience notwithstanding, James did predict that "the data assumed by psychology, like those assumed by physics and the other natural sciences, must some time be overhauled" (1890: vi).

Kaelbling, L. (1993). Learning in Embedded Systems. Cambridge, MA: MIT Press.
Kaelbling, L., and S. Rosenschein. (1990). Action and planning in embedded agents. In P. Maes, Ed., Robotics and Autonomous Systems, Special Issue on Designing Autonomous Agents: Theory and Practice from Biology to Engineering and Back 6 (1–2): 35–48.
Maes, P. (1990). Situated agents can have goals. In P.
Maes, Ed., James did not, however, predict that only a few short years Robotics and Autonomous Systems, Special Issue on Designing after publication of his Principles, just such a major overhaul Autonomous Agents: Theory and Practice from Biology to would be in full swing. Nor did James foresee that the impact Engineering and Back 6 (1–2) of this overhaul would be so revolutionary and controversial Malcolm, C., and T. Smithers. (1990). Symbol grounding via a hybrid architecture in an autonomous assembly system. In P. nearly a century later. Behaviorism is the name given to this Maes, Ed., Robotics and Autonomous Systems, Special Issue on dramatic shift away from psychology as the science of men- Designing Autonomous Agents: Theory and Practice from Biol- tal life and toward being the science of overt action. ogy to Engineering and Back 6 (1–2): 145–168. John B. Watson is usually credited with beginning behav- Marjanovic, M., B. Scassellati, and M. Williamson. (1996). Self- iorism; surely, his 1913 paper, “Psychology as the Behavior- taught visually-guided pointing for a humanoid robot. In P. ist Views It,” made the case for behaviorism in a most Maes, M. Mataric, J-A. Meyer, J. Pollack, and S. Wilson, Eds., ´ dramatic and forceful manner. But other scholars paved the Proceedings, From Animals to Animats 4, Fourth International way for behaviorism. Among them, H. S. Jennings, Watson's Conference on Simulation of Adaptive Behavior. Cambridge, biologist colleague at Johns Hopkins University, more MA: MIT Press, pp. 35–44. methodically and less polemically advocated a behavioristic Mataric, M. (1990). Navigating with a rat brain: a neurobiologi- ´ cally-inspired model for robot spatial representation. In J-A. approach for psychology in his 1906 book, Behavior of the Meyer, and S. Wilson, Eds., Proceedings, From Animals to Ani- Lower Organisms. 
Jennings’s views on the science of psy- mats 1, First International Conference on Simulation of Adap- chology still serve as a fitting introduction to the premises tive Behavior. Cambridge, MA: MIT Press, pp. 169–175. and methods of behaviorism. Mataric, M. (1997). Behavior-based control: examples from navi- ´ As did Watson, Jennings studied the behavior of nonhu- gation, learning, and group behavior. In Hexmoor, Horswill, man animals. Interest in nonhuman animals or even human and Kortenkamp, Eds., Journal of Experimental and Theoreti- infants poses very real limits on our readiest means for cal Artificial Intelligence, Special Issue on Software Architec- understanding the psychological phenomena of thinking tures for Physical Agents 9 (2–3): 1997. and feeling: namely, INTROSPECTION, what Edward B. Nolfi, S., D. Floreano, O. Miglino, and F. Mondada. (1994). Now Titchener (1896) called the one distinctively psychological to evolve autonomous robots: different approaches in evolu- tionary robotics. In R. Brooks and P. Maes, Eds., Proceedings, method. Without verbal report, how can we ever claim to Artificial Life IV, the Fourth International Workshop on the Syn- have gained access to another organism's mental life? thesis and Simulation of Living Systems. Cambridge, MA: MIT Indeed, just because we ask other people to report to us their Press, pp. 190–197. private thoughts and feelings, why should we be sanguine Smithers, T. (1995). On quantitative performance measures of that they are either willing or able to do so? robot behaviour. In L. Steels, Ed., The Biology and Technology Jennings took a decidedly cautious stance concerning the of Intelligent Autonomous Agents. Cambridge, MA: MIT Press, private world of conscious thoughts and feelings. “The con- pp. 107–133. scious aspect of behavior is undoubtedly most interesting. Steels, L. (1994a). 
Emergent functionality of robot behavior But we are unable to deal directly with this by the methods through on-line evolution. In R. Brooks and P. Maes, Eds., Pro- of observation and experiment. . . . Assertions regarding ceedings, Artificial Life IV, the Fourth International Workshop on the Synthesis and Simulation of Living Systems. Cambridge, consciousness in animals, whether affirmative or negative, MA: MIT Press, pp. 8–14. are not susceptible of verification” (1906: v). Contrary to Steels, L. (1994b). The artificial life roots of artificial intelligence. the claims of their critics, most behaviorists, like Jennings, Artificial Life 1 (1). deny neither the existence nor the importance of CON- Williamson, M. (1996). Postural primitives: interactive behavior SCIOUSNESS; rather, they hold that private data cannot be the for a humanoid robot arm. In P. Maes, M. Mataric, J. A. Meyer, ´ subject of public science. J. Pollack, and S. Wilson, Eds., Proceedings, From Animals to Having judged that the introspective investigation of con- Animats 4, Fourth International Conference on Simulation of sciousness is an unworkable methodology for objective sci- Adaptive Behavior. Cambridge, MA: MIT Press, pp. 124–134. ence, Jennings offered a new alternative to a science of mental life—a science of overt action. “Apart from their Behaviorism relation to the problem of consciousness and its develop- ment, the objective processes in behavior are of the highest “Psychology is the Science of Mental Life, both of its phe- interest in themselves” (1906: v). nomena and their conditions. The phenomena are such Jennings noted that behavior has historically been treated things as we call feelings, desires, cognitions, reasonings, as the neglected stepsister of consciousness. The treatment decisions, and the like” (1890: 1). 
So said William JAMES in of behavior as subsidiary to the problem of consciousness his Principles of Psychology, perhaps the most important has tended to obscure the fact that in behavior we have the and widely cited textbook in the history of psychology. most marked and perhaps the most easily studied of the James believed that psychology would have finished its job organic processes. Jennings observed that “in behavior we 78 Behaviorism are dealing with actual objective processes (whether accom- years before Edward C. Tolman (1936) did so more promi- panied by consciousness or not), and we need a knowledge nently, Jennings urged the operationalization of psychologi- of the laws controlling them, of the same sort as our knowl- cal terms and phenomena so as to make their study edge of the laws of metabolism” (1906: v). Discovering completely objective and permit their exact experimental general laws of behavior—in both human and nonhuman investigation—even in nonhuman animals and human animals—with the methods of natural science is the aim of a infants. (Clark L. Hull 1952, B. F. Skinner 1945, and Ken- behavioristic psychology. neth W. Spence 1956 later developed behaviorism in very Jennings’s consideration of nonhuman animal behavior different ways to deal with ideation and thinking.) (what we now call COMPARATIVE PSYCHOLOGY) was a key Take, for example, one of James’s favorite psychological extension of psychological science, an extension that was notions—ATTENTION. For Jennings, attention is not a con- effectively precluded through introspective investigation but scious mental state. Rather, “at the basis of attention lies was made possible by behavioristic study. This extension objectively the phenomenon that the organism may react to was controversial because it had important implications for only one stimulus even though other stimuli are present our understanding of human behavior. 
“From a discussion which would, if acting alone, likewise produce a response” of the behavior of the lower organisms in objective terms, (1906: 330). The organism can then be said to attend to the compared with a discussion of the behavior of man in sub- particular stimulus to which it responds. Or, take what to jective terms, we get the impression of a complete disconti- many is the hallmark of mental life—choice or DECISION nuity between the two” (1906: 329). Jennings believed that MAKING. For Jennings, choice is not a conscious mental this dualistic view of human and nonhuman psychology process. Instead, “choice is a term based objectively on the offered centuries earlier by DESCARTES was stale and incor- fact that the organism accepts or reacts positively to some rect; a fresh and proper answer to the question of whether things, while it rejects or reacts negatively or not at all to humans differed fundamentally from all other animals others” (1906: 330). In these and many other cases, Jen- required examining their behavior from a common and nings explained that “we shall not attempt to take into con- objective vantage point. “Only by comparing the objective sideration the scholastic definitions of the terms used, but factors can we determine whether there is a continuity or a shall judge them merely from the objective phenomena on gulf between the behavior of lower and higher organisms which they are based” (1906: 329). (including man), for it is only these factors that we know” What then are the limits of a behavioristic approach to (1906: 329). psychological phenomena? This key question has not yet Based on that objective evidence, Jennings agreed with been answered, but it has been vigorously debated. Watson Charles DARWIN and his theory of EVOLUTION through natu- believed that the matter would eventually be decided by ral selection that “there is no difference in kind, but a com- experimental study. 
“As our methods become better devel- plete continuity between the behavior of lower and of higher oped it will be possible to undertake investigations of more organisms [including human beings]” (1906: 335). Indeed, and more complex forms of behavior. Problems which are many years of assiduous study convinced Jennings that, “if now laid aside will again become imperative, but they can Amoeba were a large animal, so as to come within the be viewed as they arise from a new angle and in more con- everyday experience of human beings, its behavior would at crete settings” (1913: 175). once call forth the attribution to it of states of pleasure and A case study for looking at psychological issues from a pain, of hunger, desire, and the like, on precisely the same new angle and in a concrete setting is recent research into basis as we attribute these things to the dog” (1906: 336), CATEGORIZATION and conceptualization by nonhuman ani- however problematical for an objective psychology these mals. Building on powerful experimental methods pio- anthropomorphic attributions of MOTIVATION and EMOTION neered by Skinner (1938) and his student Richard J. might be. Herrnstein (1990), my colleagues and I have trained pigeons Jennings’s exhortation for us to limit our consideration of to categorize complex visual stimuli such as colored photo- both human and nonhuman behavior to objective factors graphs and detailed line drawings into different classes, underscores the key imperative of behaviorism. “The ideal of ranging from basic-level CONCEPTS (like cats, flowers, cars, most scientific men is to explain behavior in terms of matter and chairs), to superordinate concepts (like mammals, vege- and energy, so that the introduction of psychic implications tables, vehicles, and furniture), to abstract concepts (like is considered superfluous” (1906: 329). Mentalism was to same versus different). 
In all three cases, the pigeons not play no part in this new psychological science of the twenti- only acquired the visual discriminations through reinforce- eth century, although it is at the core of the current, but argu- ment learning, but they also generalized those discrimina- ably (see Blumberg and Wasserman 1995) reactionary tions to completely novel stimuli (Wasserman 1995); such school of nonhuman animal behavior, COGNITIVE ETHOL- generalization is the hallmark of conceptualization. Addi- OGY, founded by the biologist Donald R. Griffin (1976). tional extensions of behavioristic methods and analyses Critics of behaviorism nevertheless argue that excluding have been made to visual IMAGERY (Rilling and Neiworth the realm of private experience from psychological science 1987) and to the reporting of interoceptive stimuli induced is misguided. Doesn’t a behavioristic account omit most if by the administration of drugs (Lubinski and Thompson not all of the truly interesting and important aspects of psy- 1993). Here too, pigeons were taught with purely behavioral chological functioning? No, said Jennings. What is advo- methods to engage in behaviors which, when performed by cated is simply an objective analysis of psychological people, are conventionally considered to be the product of processes. With remarkable sophistication and some thirty conscious mental states and processes. Behaviorists, like Behaviorism 79 Skinner, take a different tack and ask, Isn’t it more produc- Herrnstein, R. J. (1990). Levels of stimulus control: a functional approach. Cognition 37: 133–166. tive and parsimonious to attribute these behaviors to the Hull, C. L. (1952). A Behavior System. New Haven, CT: Yale Uni- contingencies of reinforcement (which can be specified and versity Press. experimentally manipulated) than to mental entities and James, W. (1890, 1955). The Principles of Psychology, vol. 1. New psychic machinations (which cannot)? York: Dover. 
Some firmly resist this maneuver and emphatically say Jennings, H. S. (1906/1976). Behavior of the Lower Organisms. no. Empirical demonstrations such as these have done little Bloomington: Indiana University Press. to convert behaviorism's most trenchant critics, such as Lubinski, D., and T. Thompson. (1993). Species and individual dif- Noam Chomsky (1959). These individuals argue that behav- ferences in communication based on private states. Behavioral iorism is formally unable to explain complex human behav- and Brain Sciences 16: 627–680. ior, especially LANGUAGE AND COMMUNICATION. Rilling, M. E., and J. J. Neiworth. (1987). Theoretical and method- ological considerations for the study of imagery in animals. These critics note, for instance, that human verbal behav- Learning and Motivation 18: 57–79. ior exhibits remarkable variability and temporal organiza- Skinner, B. F. (1938). The Behavior of Organisms. New York: tion. They contend that CREATIVITY and grammar are Appleton-Century-Crofts. properties of linguistic performance that are in principle Skinner, B. F. (1945). The operational analysis of psychological beyond behavioristic explanation, and they instead argue terms. Psychological Review 52: 270–277. that these properties of language uniquely implicate the Skinner, B. F. (1957). Verbal Behavior. New York: Appleton- operation of creative mental structures and processes. In Century-Crofts. response, behaviorists note that all behaviors—from the Spence, K. W. (1956). Behavior Theory and Conditioning. New simplest acts like button pressing to the most complex like Haven, CT: Yale University Press. reciting poetry—involve intricate and changing topogra- Terrace, H. S. (1993). The phylogeny and ontogeny of serial mem- ory: list learning by pigeons and monkeys. Psychological Sci- phies of performance. In fact, variability itself is a property ence 4: 162–169. of behavior that research (Eisenberger and Cameron 1996) Titchener, E. B. (1896). 
An Outline of Psychology. New York: has shown is modifiable through the systematic delivery of Macmillan. reinforcement and punishment, in much the same way as Tolman, E. C. (1936). Operational behaviorism and current trends other properties of behavior like frequency, amplitude, and in psychology. Paper included in the Proceedings of the duration are conditioned by reinforcement contingencies. Twenty–fifth Anniversary Celebration of the Inauguration of As to the temporal organization of behavior, even nonhu- Graduate Studies at the University of Southern California, man animals like pigeons and monkeys have been taught to 1936. Reprinted in E. C. Tolman, Ed., Behavior and Psycholog- recognize and to produce structured sequences of stimuli ical Man: Essays in Motivation and Learning. Berkeley: Uni- and responses (Terrace 1993; Weisman et al. 1980). Such versity of California Press, 1961/1966. Wasserman, E. A. (1995). The conceptual abilities of pigeons. complex performances were again the result of elementary American Scientist 83: 246–255. LEARNING processes brought about by familiar CONDITION- Watson, J. B. (1913). Psychology as the behaviorist views it. Psy- ING techniques. chological Review 20: 158–177. More famously and directly, Skinner offered a behavior- Weisman, R. G., E. A. Wasserman, P. W. D. Dodd, and M. B. istic account of human language in his 1957 book, Verbal Larew. (1980). Representation and retention of two-event Behavior. sequences in pigeons. Journal of Experimental Psychology: Many theorists therefore conclude that behaviorism is the Animal Behavior Processes 6: 312–325. strongest alternative to a mentalistic account of human and nonhuman behavior. Far from being run out of business by Further Readings the premature proclamations of their mentalistic critics, Cook, R. G. (1993). The experimental analysis of cognition in ani- behaviorists have steadfastly proceeded with the task of mals. Psychological Science 4: 174–178. 
experimentally analyzing many of the most complex and Darwin, C. (1871/1920). The Descent of Man; And Selection in vexing problems of behavior using the most effective and Relation to Sex. 2nd ed. New York: D. Appleton and Com- current tools of natural science. pany. See also CONDITIONING AND THE BRAIN; ETHOLOGY; Griffin, D. R. (1992). Animal Minds. Chicago: University of Chi- FUNCTIONALISM; LEARNING; NATIVISM, HISTORY OF cago Press. Honig, W. K., and J. G. Fetterman, Eds. (1992). Cognitive Aspects —Edward Wasserman of Stimulus Control. Hillsdale, NJ: Erlbaum. Hull, C. L. (1943). Principles of Behavior. New York: Appleton- Century-Crofts. References Hulse, S. H., H. Fowler, and W. K. Honig, Eds. (1978). Cognitive Blumberg, M. S., and E. A. Wasserman. (1995). Animal mind and Processes in Animal Behavior. Hillsdale, NJ: Erlbaum. the argument from design. American Psychologist 50: 133–144. Kennedy, J. S. (1992). The New Anthropomorphism. Cambridge: Chomsky, N. (1959). Review of B. F. Skinner’s Verbal Behavior. Cambridge University Press. Language 35: 26–58. Lashley, K. S. (1923). The behaviorist interpretation of conscious- Eisenberger, R., and J. Cameron. (1996). Detrimental effects of ness, II. Psychological Review 30: 329–353. reward: reality or myth? American Psychologist 51: 1153–1166. Mackenzie, B. D. (1977). Behaviourism and the Limits of Scien- Griffin, D. R. (1976). The Question of Animal Awareness: Evolu- tific Method. Atlantic Highlands, NJ: Humanities Press. tionary Continuity of Mental Experience. New York: The Rock- Mackintosh, N. J., Ed. (1994). Animal Learning and Cognition. efeller University Press. San Diego, CA: Academic Press. 80 Belief Networks previous behavioral findings suggesting that second lan- Nisbett, R. E., and T. D. Wilson. (1977). Telling more than we can know: verbal reports on mental processes. Psychological guage learning is better in those who learn their second lan- Review 84: 231–259. guage early. 
Mclaughlin and Osterhout (1997) found that Pavlov, I. P. (1927). Conditioned Reflexes. Oxford: Oxford Univer- college students learning French progressively improve sity Press. from chance to near-native performance on lexical decision Rachlin, H. (1992). Teleological behaviorism. American Psycholo- (i.e., deciding if a letter string is a word or not); however, gist 47: 1371–1382. electrophysiological indices revealed sensitivity to French Richelle, M. N. (1993). B. F. Skinner: A Reappraisal. Hillsdale, words after only a few weeks of instruction. An increased NJ: Erlbaum. N400 (a waveform that indexes lexical-semantic process- Ristau, C. A., Ed. (1991). Cognitive Ethology: The Minds of Other ing) for words preceded by semantically unrelated words Animals. Hillsdale, NJ: Erlbaum. (coffee-dog) was found as the number of years of exposure Roitblat, H. L., T. G. Bever, and H. S. Terrace, eds. (1984). Animal Cognition. Hillsdale, NJ: Erlbaum. to French increased, but it never approached the levels seen Skinner, B. F. (1976). About Behaviorism. New York: Vintage. in native French speakers. Weber-Fox and Neville (1996) Skinner, B. F. (1977). Why I am not a cognitive psychologist. have found differences in the N400 to semantic violations, Behaviorism 5: 1–10. but only for those who learned a second language after the Sober, E. (1983). Mentalism and behaviorism in comparative psy- age of eleven. Changes in ERPs to grammatical violations, chology. In D. W. Rajecki, Ed., Comparing Behavior: Studying however, appeared even for those who learned their second Man Studying Animals. Hillsdale, NJ: Erlbaum, pp. 113–142. language before the age of four. Perani et al. (1996), using Spear, N. E., J. S. Miller, and J. A. Jagielo. (1990). Animal mem- POSITRON EMISSION TOMOGRAPHY (a measure of localized ory and learning. Annual Review of Psychology 41: 169–211. brain activity), have found that listening to passages in a Wasserman, E. A. (1993). 
Comparative cognition: beginning the first language results in an activation of areas that is not second century of the study of animal intelligence. Psychologi- cal Bulletin 113: 211–228. apparent in the second language for late second language Wasserman, E. A. (1997). Animal cognition: past, present and learners (e.g., increased activation in the left and right tem- future. Journal of Experimental Psychology: Animal Behavior poral pole, the left inferior frontal gyrus, and the left inferior Processes 23: 123–135. parietal lobe). Thus age of acquisition has an effect on elec- Wasserman, E. A., and R. R. Miller. (1997). What’s elementary trophysiological measures of brain activity as well as on the about associative learning? Annual Review of Psychology 48: neuroanatomical areas that are involved in second language 573–607. processing. Weiskrantz, L., Ed. (1985). Animal Intelligence. New York: Oxford Does a bilingual speaker represent each language in dif- University Press. ferent areas of the brain? Researchers have long wondered Zuriff, G. E. (1985). Behaviorism: A Conceptual Reconstruction. whether cognitive functions are processed by separate areas New York: Columbia University Press. of the brain (see CORTICAL LOCALIZATION, HISTORY OF). A similar question has been asked with respect to the cortical Belief Networks localization of the two languages in bilingual speakers. One way to answer this question is to look at the effects of brain See BAYESIAN NETWORKS; PROBABILISTIC REASONING lesions on the processing of a bilingual's two languages. Brain lesions that affect one language and not the other would lead to the conclusion that languages are represented Bilingualism and the Brain in different areas of the brain. Indeed, there is evidence of different degrees of recovery in each language after a stroke In recent years, there has been growing interest in the cogni- (Junque, Vendrell and Vendrell 1995; Paradis 1977). tive neuroscience of bilingualism. 
The two central questions Extreme cases have shown postoperative impairment in one in this literature have been: (1) Does a bilingual speaker rep- language with spontaneous recovery after eight months resent each language in different areas of the brain? (2) (Paradis and Goldblum 1989). A more recent case has been What effect does age of second language acquisition have used to suggest that there is a clear neuroanatomical dissoci- on brain representation? These questions have been consid- ation between the languages (Gomez-Tortosa et al. 1995). ered by using electrophysiological and functional neuro- Others, however, suggest that there are a number of other imaging measures as well as by looking at bilinguals who explanations for these data (see Paradis 1996 and Hines suffer strokes affecting the areas responsible for language 1996 for further discussion). processing in the brain. We will begin by considering the The notion that bilinguals’ two languages are represented effects of age of acquisition before considering the localiza- in overlapping brain areas has also been supported with tion of the first and second language in the brain. other methodologies. Ojemann and Whitaker (1978) found What effects does age of second language acquisition that electrical stimulation of certain areas in the cortex inter- have on brain representation? Researchers in cognitive sci- rupted naming in both languages, whereas stimulation of ence have considered whether there is a critical period for other areas interrupted naming in only one language. More learning a language (see also LANGUAGE ACQUISITION). recent work using measures that look at activation as a mea- This topic is also of interest to those learning a second lan- sure of blood flow have come to similar conclusions. Klein guage. Specifically, investigators have inquired about the et al. (1994), using PET, found that naming pictures in a differences between early and late second language learners. second language vs. 
naming pictures in a first language Recent work using event-related potentials (ERP) supports resulted in activation in the putamen, a subcortical area that Binding by Neural Synchrony 81 has been associated with phonological processing. Other References studies have found that bilinguals show activity in left fron- Gomez-Tortosa, E., E. Martin, M. Gaviria, F. Charbel, and J. Aus- tal areas of the brain for semantic and phonological analyses man. (1995). Selective deficit of one language in a bilingual of words in both their languages (Klein et al. 1995; Wagner patient following surgery in the left perisylvian area. Brain and et al. 1996). Taken together these findings suggest that Language 48: 320–325. whereas naming in L2 involves activation in areas that are Hernandez, A. E., A. Martinez, E. C. Wong, L. A. Frank, and R. B. not involved in L1, lexical and semantic judgments of words Buxton. (1997). Neuroanatomical correlates of single- and activate mostly overlapping areas of the brain. Although dual-language picture naming in Spanish–English bilinguals. there are some dissociations when surface tasks such as Poster presented at the fourth annual meeting of the Cognitive naming are used, these dissociations disappear when seman- Neuroscience Society, Boston, MA. Hines, T. M. (1996). Failure to demonstrate selective deficit in the tic tasks are used. native language following surgery in the left perisylvian area. Having two linguistic systems that overlap presents an Brain and Language 54: 168–169. interesting challenge for theories of bilingual language pro- Junque, C., P. Vendrell, and J. Vendrell. (1995). Differential cessing. If these two languages are located on overlapping impairments and specific phenomena in 50 Catalan–Spanish tissue, how do bilinguals manage to keep these languages bilingual aphasic patients. In M. Paradis, Ed., Aspects of Bilin- from constantly interfering with each other? A recent study gual Aphasia. Oxford: Pergamon. 
by Hernandez et al. (1997) was designed to look at this issue Klein, D., B. Milner, R. J. Zatorre, E. Meyer, and A. C. Evans. using functional MAGNETIC RESONANCE IMAGING (fMRI) (1995). The neural substrates underlying word generation: a for Spanish–English bilinguals. Participants were asked to bilingual functional-imaging study. Proceedings of the National name a picture in their first language, second language, or to Academy of Sciences of the United States of America 92: 2899– 2903. alternate between each language on successive trials. Klein, D., R. J. Zatorre, B. Milner, E. Meyer, and A. C Evans. Results revealed slower reaction times and an increase in the (1994). Left putanimal activation when speaking a second lan- number of cross-language errors in the alternating condition guage: evidence from PET. Neuroreport 5: 2295–2297. relative to the single-language condition (Kohnert-Rice and Kohnert-Rice, K. and A. E. Hernandez. (Forthcoming). Lexical Hernandez forthcoming). In the fMRI study, there was no retrieval and interference in Spanish–English bilinguals. difference when comparing activation for naming in the first McLaughlin, J., and L. Osterhout. (1997). Event-related potentials and second language. However, activation in the prefrontal reflect lexical acquisition in second language learners. Poster cortex increased significantly when participants were asked presented at the fourth annual meeting of the Cognitive Neuro- to alternate between languages. Thus it appears that the left science Society, Boston, MA. prefrontal cortex may also act to reduce the amount of inter- Ojemann, G. A., and A. A. Whitaker. (1978). The bilingual brain. Archives of Neurology 35: 409–412. ference between languages (as indexed by slower reaction Paradis, M. (1977). Bilingualism and aphasia. In H. Whitaker and times and increased cross-language errors; see also WORK- H. A. Whitaker, Eds., Studies in Neurolinguistics, vol. 3. New ING MEMORY, NEURAL BASIS OF). 
York: Academic Press, pp. 65–121. Languages can be represented across syntactic, phono- Paradis, M. (1996). Selective deficit in one language is not a dem- logical, orthographic, semantic, pragmatic, and DISCOURSE onstration of different anatomical representation: comments on dimensions. These distinctions can vary depending on the Gomez-Tortosa et al. (1995). Brain and Language 54: 170–173. two languages. For example, Chinese and English are very Paradis, M., and M. C. Goldblum. (1989). Selected crossed aphasia different orthographically and phonologically. However, in a trilingual aphasic patient followed by reciprocal antago- some aspects of SYNTAX are very similar (e.g., the lack of nism. Brain and Language 36: 62–75. morphological markers and the use of word order to indicate Perani, D., S. Dehaene, F. Grassi, L. Cohen, S. Cappa, E. Dupoux, F. Fazio, and J. Mehler. (1996). Brain processing of native and the agent of a sentence). Contrast this with Spanish and foreign languages. Neuroreport 7: 2439–2444. English, which are more similar orthographically but are Wagner, A. D., J. Illes, J. E. Desmond, C. J. Lee, G. H. Glover, and very different in syntax in that the former uses a very large J. D. E. Gabrieli. (1996). A functional MRI study of semantic number of morphological markers. Despite the progress that processing in bilinguals. NeuroImage 3: S465. has been made in addressing the relationship between bilin- Webber-Fox, C., and H. J. Neville. (1996). Maturational con- gualism and brain representation, and although strides have straints on functional specializations for language processing: been made in the PSYCHOLINGUISTICS and cognitive neuro- ERP and behavioral evidence in bilingual speakers. Journal of science of bilingualism, much work remains to be done. Cognitive Neuroscience 8: 231–256. This research will necessarily involve behavior and the brain. 
Clearly the issue of bilingual brain bases involves Further Readings both a rich multidimensional information space as well as a rich cerebral space. Understanding how the former maps Paradis, M. (1995). Aspects of Bilingual Aphasia. Oxford: Perga- mon. onto the latter is a question that should keep researchers occupied into the next century and beyond. See also ELECTROPHYSIOLOGY, ELECTRIC AND MAG- Binding by Neural Synchrony NETIC EVOKED FIELDS; GRAMMAR, NEURAL BASIS OF; INNATENESS OF LANGUAGE; LANGUAGE, NEURAL BASIS OF; Neuronal systems have to solve immensely complex combi- NEURAL PLASTICITY; NATURAL LANGUAGE PROCESSING natorial problems and require efficient binding mechanisms —Arturo E. Hernandez and Elizabeth Bates in order to generate representations of perceptual objects 82 Binding by Neural Synchrony chrony. All three mechanisms enhance the relative impact of and movements. In the context of cognitive functions, com- the grouped responses at the next higher processing level. binatorial problems arise from the fact that perceptual Selecting responses by modulating discharge rates is com- objects are defined by a unique constellation of features, the mon in labeled line coding where a particular cell always diversity of possible constellations being virtually unlimited signals the same content. However, this strategy may not (cf. BINDING PROBLEM). Combinatorial problems of similar always be suited for the distinction of assemblies because it magnitude have to be solved for the acquisition and execu- introduces ambiguities, reduces processing speed, and tion of motor acts. Although the elementary components of causes superposition problems (von der Malsburg 1981; motor acts, the movements of individual muscle fibers, are Singer et al. 1997). 
Ambiguities could arise because dis- limited in number, the spatial and temporal diversity of charge rates of feature-selective cells vary over a wide range movements that can be composed by combining the elemen- as a function of the match between stimulus and receptive tary components in ever-changing constellations is again field properties; these modulations of response amplitude virtually infinite. In order to establish neuronal representa- would not be distinguishable from those signalling the relat- tions of perceptual objects and movements, the manifold edness of responses. Processing speed would be reduced relations among elementary sensory features and movement because rate coded assemblies need to be maintained for components have to be encoded in neural responses. This some time in order to be distinguishable. Finally, superposi- requires binding mechanisms that can cope efficiently with tion problems arise, because rate coded assemblies cannot combinatorial complexity. Brains have acquired an extraor- overlap in time within the same processing stage. If they did, dinary competence to solve such combinatorial problems, it would be impossible to distinguish which of the enhanced and it appears that this competence is a result of the evolu- responses belong to which assembly. Simultaneous mainte- tion of the CEREBRAL CORTEX. nance of different assemblies over perceptual time scales is In the primary visual cortex of mammals, relations among required, however, to represent composite objects. the responses of retinal ganglion cells are evaluated and rep- Both the ambiguities and the temporal constraints can be resented by having the output of selected arrays of ganglion overcome if the selection and labeling of responses is cells converge in diverse combinations onto individual corti- achieved through synchronization of individual discharges cal neurons. Distributed signals are bound together by selec- (Gray et al. 1989; Singer and Gray 1995). 
Synchronicity can tive convergence of feed forward connections (Hubel and be adjusted independently of rates, and so the signature of Wiesel 1962). This strategy is iterated in prestriate cortical relatedness, if expressed through synchronization, is inde- areas in order to generate neurons that detect and represent pendent of rate fluctuations. Moreover, synchronization more complex constellations of features including whole enhances only the salience of those discharges that are pre- perceptual objects. cisely synchronized and generate coincident synaptic poten- However, this strategy of binding features together by tials in target cells at the subsequent processing stage. recombining input connections in ever-changing variations Hence the selected event is the individual spike or a brief and representing relations explicitly by responses of special- burst of spikes. Thus, the rate at which different assemblies ized cells results in a combinatorial explosion of the number can follow one another within the same neuronal network of required binding units. It has been proposed, therefore, without getting confounded is much higher than with rate that the cerebral cortex uses a second, complementary strat- coding. It is only limited by the duration of the interval over egy, commonly called assembly coding, that permits utiliza- which synaptic potentials summate effectively. If this inter- tion of the same set of neurons for the representation of val is in the range of 10 or 20 ms, several different assem- different relations (Hebb 1949). Here, a particular constella- blies can alternate within preceptually relevant time tion of features is represented by the joint and coordinated windows. activity of a dynamically associated ensemble of cells, each If synchronization serves as a selection and binding of which represents explicitly only one of the more elemen- mechanism, neurons must be sensitive to coincident input. 
tary features that characterize a particular perceptual object. Moreover, synchronization must occur rapidly and show a Different objects can then be represented by recombining relation to perceptual phenomena. neurons tuned to more elementary features in various con- Although the issue of coincidence detection is still contro- stellations (assemblies). For assembly coding, two con- versial (König, Engel, and Singer 1996; Shadlen and News- straints need to be met. First, a selection mechanism is ome 1994), evidence is increasing that neurons can evaluate required that permits dynamic, context dependent associa- temporal relations with precision among incoming activity tion of neurons into distinct, functionally coherent assem- (see e.g., Carr 1993). That cortical networks can handle tem- blies. Second, grouped responses must get labeled so that porally structured activity with high precision and low disper- they can be distinguished by subsequent processing stages as sion follows from the abundant evidence on the oscillatory components of one coherent representation and do not get patterning and precise synchronization of neuronal responses confounded with other unrelated responses. Tagging in the γ-frequency range (Singer and Gray 1995; König, responses as related is equivalent with raising their salience Engel, and Singer 1996). Synchronization at such high fre- jointly and selectively, because this assures that they are pro- quencies is only possible if integration time constants are cessed and evaluated together at the subsequent processing short. Precise synchronization over large distances is usually stage. This can be achieved in three ways. First, nongrouped associated with an oscillatory patterning of responses in the responses can be inhibited; second, the amplitude of the β- and γ-frequency range, suggesting a causal relation selected responses can be enhanced; and third, the selected (König, Engel, and Singer 1995). 
This oscillatory patterning cells can be made to discharge in precise temporal syn- Binding by Neural Synchrony 83 from the amblyopic eye, and by impairing binding, it could is associated with strong inhibitory interactions (Traub et al. reduce visual acuity and cause crowding. 1996), raising the possibility that the oscillations contribute to Another close correlation between response synchroniza- the shortening of integration time constants. tion and perception has been found in experiments on binoc- Simulations with spiking neurons reveal that networks of ular rivalry (Fries et al. 1997a). A highly significant appropriately coupled units can undergo very rapid transi- correlation exists between changes in the strength of tions from uncorrelated to synchronized states (Deppisch et response synchronization in primary visual cortex and the al. 1993; Gerstner 1996). Rapid transitions from indepen- outcome of rivalry. Cells mediating responses of the eye that dent to synchronized firing are also observed in natural net- won in interocular competition increased the synchronicity works. In visual centers, it is not uncommon that neurons of their responses upon presentation of the rival stimulus to engage in synchronous activity, often with additional oscil- the other, losing eye, while the reverse was true for cells latory patterning, at the very same time they increase their driven by the eye that became suppressed. discharge rate in response to the light stimulus (Neuen- These results support the hypothesis that precise tempo- schwander and Singer 1996; Gray et al. 1992). One mecha- ral relations between the discharges of spatially distributed nism is coordinated spontaneous activity that acts like a neurons matter in cortical processing and that synchroniza- dynamic filter and causes a virtually instantaneous synchro- tion may be exploited to jointly raise the salience of the nization of the very first discharges of responses Fries et al. 
responses selected for further processing, that is, for the 1997b). The spatio-temporal patterns of these spontaneous dynamic binding of distributed responses into coherent fluctuations of excitability reflect the architecture and the assemblies. actual functional state of intracortical association connec- The example of rivalry also illustrates how synchroniza- tions. Thus, grouping by synchronization can be extremely tion and rate modulation depend on each other. The signals fast and still occur as a function of both the prewired associ- from the suppressed eye failed to induce tracking EYE MOVE- ational dispositions and the current functional state of the cortical network. MENTS, indicating that the vigorous but poorly synchronized Evidence indicates that the probability and strength of responses in primary visual areas eventually failed to drive response synchronization reflects elementary Gestalt criteria the neurons responsible for the execution of eye movements. such as continuity, proximity, similarity in the orientation Thus, changes of synchronicity result in changes of response domain, colinearity, and common fate (Gray et al. 1989; amplitudes at subsequent processing stages. This convertibil- Engel, König, and Singer 1991; Engel et al. 1991; Freiwald, ity provides the option to use both coding strategies in paral- Kreiter, and Singer 1995; Kreiter and Singer 1996). Most lel in order to encode complementary information. importantly, the magnitude of synchronization exceeds that See also HIGH-LEVEL VISION; MID-LEVEL VISION; OCULO- expected from stimulus-induced rate covariations of MOTOR CONTROL responses, indicating that it results from internal coordina- —Wolf Singer tion of spike timing. Moreover, synchronization probability does not simply reflect anatomical connectivity but changes References in a context-dependent way (Gray et al. 1989; Engel, König, and Singer 1991; Freiwald, Kreiter, and Singer 1995; Kreiter Carr, C. E. (1993). 
Processing of temporal information in the brain. and Singer 1996), indicating that it is the result of a dynamic Annu. Rev. Neurosci. 16: 223–243. and context-dependent selection and grouping process. Most Deppisch, J., H.-U. Bauer, T. B. Schillen, P. König, K. of the early experiments on response synchronization have Pawelzik, and T. Geisel. (1993). Alternating oscillatory and stochastic states in a network of spiking neurons. Network 4: been performed in anesthetized animals, but more recent evi- 243–257. dence from cats and monkeys indicates that highly precise, Engel, A. K., P. König, and W. Singer. (1991). Direct physiological internally generated synchrony occurs also in the awake evidence for scene segmentation by temporal coding. Proc. brain, exhibits similar sensitivity to context (Kreiter and Natl. Acad. Sci. USA 88: 9136–9140. Singer 1996; Fries et al. 1997a; Gray and Viana Di Prisco Engel, A. K., A. K. Kreiter, P. König, and W. Singer. (1991). Syn- 1997), and is especially pronounced when the EEG is desyn- chronization of oscillatory neuronal responses between striate chronized (Munk et al. 1996) and the animals are attentive and extrastriate visual cortical areas of the cat. Proc. Natl. (Roelfsema et al. 1997). Direct relations between response Acad. Sci. USA 88: 6048–6052. synchronization and perception have been found in cats who Freiwald, W. A., A. K. Kreiter, and W. Singer. (1995). Stimulus suffered from strabismic amblyopia, a developmental dependent intercolumnar synchronization of single unit responses in cat area 17. Neuroreport 6: 2348–2352. impairment of vision associated with suppression of the Fries, P., P. R. Roelfsema, A. K. Engel, P. König, and W. Singer. amblyopic eye, reduced visual acuity, and disturbed percep- (1997a). Synchronization of oscillatory responses in visual cor- tual grouping (crowding) in this eye. Quite unexpectedly, the tex correlates with perception in interocular rivalry. Proc. Natl. 
discharge rates of individual neurons in the primary visual Acad. Sci. USA 94: 12699–12704. cortex fail to reflect these deficits (see Roelfsema et al. 1994 Fries, P., P. R. Roelfsema, W. Singer, and A. K. Engel. (1997b). for references). The only significant correlate of amblyopia Correlated variations of response latencies due to synchronous is the drastically reduced ability of neurons driven by the subthreshold membrane potential fluctuations in cat striate cor- amblyopic eye to synchronize their responses (Roelfsema et tex. Soc. Neurosci. Abstr. 23: 1266. al. 1994), and this accounts well for the perceptual deficits: Gerstner, W. (1996). Rapid phase locking in systems of pulse- by reducing the salience of responses, disturbed synchroni- coupled oscillators with delays. Phys. Rev. Lett. 76: 1755– 1758. zation could be responsible for the suppression of signals 84 Binding by Neural Synchrony Gray, C. M., A. K. Engel, P. König, and W. Singer. (1992). Syn- Braitenberg, V. (1978). Cell assemblies in the cerebral cortex. In chronization of oscillatory neuronal responses in cat striate cor- R. Heim and G. Palm, Eds., Architectonics of the Cerebral tex—temporal properties. Visual Neurosci. 8: 337–347. Cortex. Lecture Notes in Biomathematics 21, Theoretical Gray, C. M., P. König, A. K. Engel, and W. Singer. (1989). Oscilla- Approaches in Complex Systems. Springer-Verlag, pp. 171– tory responses in cat visual cortex exhibit inter-columnar syn- 188. chronization which reflects global stimulus properties. Nature Edelman, G. M. (1987). Neural Darwinism: The Theory of Neu- 338: 334–337. ronal Group Selection. New York: Basic Books. Gray, C. M., and G. Viana Di Prisco. (1997). Stimulus-dependent Engel, A. K., P. König, A. K. Kreiter, T. B. Schillen, and W. Singer. neuronal oscillations and local synchronization in striate cortex (1992). Temporal coding in the visual cortex: new vistas on of the alert cat. J. Neurosci. 17: 3239–3253. integration in the nervous system. 
Trends Neurosci.15: 218– Hebb, D. O. (1949). The Organization of Behavior. Wiley. 226. Hubel, D. H., and T. N. Wiesel. (1962). Receptive fields, binocular Frien, A., R. Eckhorn, R. Bauer, T. Woelbern, and H. Kehr. interaction and functional architecture in the cat’s visual cortex. (1994). Stimulus-specific fast oscillations at zero phase J. Physiol. (Lond.) 160: 106–154. between visual areas V1 and V2 of awake monkey. Neurore- König, P., A. K. Engel, and W. Singer. (1995). Relation between port 5: 2273–2277. oscillatory activity and long-range synchronization in cat visual Gerstein, G. L., and P. M. Gochin. (1992). Neuronal population cortex. Proc. Natl. Acad. Sci. USA 92: 290–294. coding and the elephant. In A. Aertsen and V. Braitenberg, König, P., A. K. Engel, and W. Singer. (1996). Integrator or coinci- Eds., Information Processing in the Cortex, Experiments and dence detector? The role of the cortical neuron revisited. Trends Theory. Springer-Verlag, pp. 139–173. Neurosci. 19: 130–137. Gerstner, W., R. Kempter, J. L. van Hemmen, and H. Wagner. Kreiter, A. K., and W. Singer. (1996). Stimulus-dependent syn- (1996). A neuronal learning rule for sub-millisecond temporal chronization of neuronal responses in the visual cortex of coding. Nature 383: 76–78. awake macaque monkey. J. Neurosci. 16: 2381–2396. Gerstner, W., and J. L. van Hemmen. (1993). Coherence and inco- Munk, M. H. J., P. R. Roelfsema, P. König, A. K. Engel, and W. herence in a globally coupled ensemble of pulse-emitting units. Singer. (1996). Role of reticular activation in the modulation of Phys. Rev. Lett. 7: 312–315. intracortical synchronization. Science 272: 271–274. Hopfield, J. J., and A. V. M. Hertz. (1995). Rapid local synchroni- Neuenschwander, S., and W. Singer. (1996). Long-range synchro- zation of action potentials: toward computation with coupled nization of oscillatory light responses in the cat retina and lat- integrate-and-fire neurons. Proc. Natl. Acad. Sci. 
USA 92: eral geniculate nucleus. Nature 379: 728–733. 6655–6662. Roelfsema, P. R., A. K. Engel, P. König, and W. Singer. (1997). Jagadeesh, B., H. S. Wheat, and D. Ferster. (1993). Linearity of Visuomotor integration is associated with zero time-lag syn- summation of synaptic potentials underlying direction selectiv- chronization among cortical areas. Nature 385: 157–161. ity in simple cells of the cat visual cortex. Science 262: 1901– Roelfsema, P. R., P. König, A. K. Engel, R. Sireteanu, and W. 1904. Singer. (1994). Reduced synchronization in the visual cortex Löwel, S., and W. Singer. (1992). Selection of intrinsic horizontal of cats with strabismic amblyopia. Eur. J. Neurosci. 6: 1645– connections in the visual cortex by correlated neuronal activity. Science 255: 209–212. 1655. Morgan, M. J., and E. Castet. (1995). Stereoscopic depth percep- Shadlen, M. N., and W. T. Newsome. (1994). Noise, neural codes tion at high velocities. Nature 378: 380–383. and cortical organization. Curr. Opin. Neurobiol. 4: 569–579. Palm, G. (1990). Cell assemblies as a guideline for brain research. Singer, W., A. K. Engel, A. K. Kreiter, M. H. J. Munk, S. Neuen- Concepts Neurosci. 1: 133–147. schwander, and P. R. Roelfsema. (1997). Neuronal assemblies: Phillips, W. A., and W. Singer. (1997). In search of common foun- necessity, signature and detectability. Trends Cognitive Sci. dations for cortical computation. Behav. Brain Sci. 20(4): 657– 1(7): 252–261. 722. Singer, W., and C. M. Gray. (1995). Visual feature integration and Schmidt, K., R. Goebel, S. Löwel, and W. Singer. (1997). The per- the temporal correlation hypothesis. Annu. Rev. Neurosci. 18: ceptual grouping criterion of colinearity is reflected by 555–586. anisotropies of connections in the primary visual cortex. Europ. Traub, R. D., M. A. Whittington, I. M. Stanford, and J. G. R. Jef- J. Neurosci. 9: 1083–1089. ferys. (1996). A mechanism for generation of long-range syn- Singer, W. (1993). 
Synchronization of cortical activity and its puta- chronous fast oscillations in the cortex. Nature 383: 621–624. tive role in information processing and learning. Annu. Rev. von der Malsburg, C. (1981). The correlation theory of brain func- Physiol. 55: 349–374. tion. Internal Report 81-2 Max-Planck-Institute for Biophysi- Singer, W. (1995). Development and plasticity of cortical process- cal Chemistry. Reprinted 1994, in E. Domany, J. L. van ing architectures. Science 270: 758–764. Hemmen and K. Schulten, Eds., Models of Neural Networks II. Singer, W. (1996). Neuronal synchronization: a solution to the Springer-Verlag, pp. 95–119. binding problem? In R. Llinas and P.S. Churchland, Eds., The Mind-Brain Continuum. Sensory Processes. Cambridge: MIT Further Readings Press, pp. 100–130. Singer, W., A. K. Engel, A. K. Kreiter, M. H. J. Munk, S. Neuen- Abeles, M. (1991). Corticonics. Cambridge: Cambridge University schwander, and P. R. Roelfsema. (1997). Neuronal assemblies: Press. necessity, signature and detectability. Trends in Cognitive Sci- Abeles, M., Y. Prut, H. Bergman, and E. Vaadia. (1994). Synchro- ences 1 (7): 252–261. nization in neuronal transmission and its importance for infor- Softky, W. R. (1995). Simple codes versus efficient codes. Curr. mation processing. Prog. Brain Res. 102: 395–404. Opin. Neurobiol. 5: 239–247. Barlow, H. B. (1972). Single units and sensation: a neuron doctrine Tallon-Baudry, C., O. Bertrand, C. Delpuech, and J. Pernier. for perceptual psychology? Perception 1: 371–394. (1996). Stimulus specificity of phase-locked and non-phase- Bauer, H.-U., and K. Pawelzik. (1993). Alternating oscillatory and locked 40 Hz visual responses in human. J. Neurosci. 16: 4240– stochastic dynamics in a model for a neuronal assembly. Phys- 4249. ica D. 69: 380–393. Binding Problem 85 evidence for this type of binding in biological nervous sys- Binding Problem tems (see König and Engel 1995). 
A more important limita- tion of dynamic binding is that it is impractical as a basis for binding in long-term MEMORY. For example, we may Binding is the problem of representing conjunctions of properties. It is a very general problem that applies to all remember where we parked our car last Tuesday, but it is types of KNOWLEDGE REPRESENTATION, from the most basic unlikely that the neurons representing our car have been fir- perceptual representations to the most complex cognitive ing in synchrony with those representing our parking space representations. For example, to visually detect a vertical continuously since then. (The memory might be coded by, red line among vertical blue lines and diagonal red lines, say, synaptic links between those neurons, and those links one must visually bind each line's color to its orientation may have been created at the time we parked the car, but (see Treisman and Gelade 1980). Similarly, to understand such links do not constitute dynamic bindings in the sense the statement, “John believes that Mary's anger toward Bill discussed here; see Holyoak and Hummel forthcoming.) A stems from Bill's failure to keep their appointment,” one third limitation of dynamic binding is that it requires more must bind John to the agent role of believes, and the struc- ATTENTION and WORKING MEMORY than static binding (see ture Bill's failure to keep their appointment to the patient Hummel and Holyoak 1997; Stankiewicz, Hummel and role of stems from (see THEMATIC ROLES). Binding lies at Cooper 1998; Treisman and Gelade 1980; Treisman and the heart of the capacity for symbolic representation (cf. Schmidt 1982). Although there is no theoretical limit on the Fodor and Pylyshyn 1988; Hummel and Holyoak 1997). number of conjunctive units (i.e., static bindings) that may A binding may be either static or dynamic. 
A static bind- be active at a given time, there are likely to be strong limits ing is a representational unit (such as a symbol or a node in on the number of distinct tags available for dynamic bind- a neural network) that stands for a specific conjunction of ing. In the case of synchrony, for example, only a finite properties. For example, a neuron that responds to vertical number of groups of neurons can be active and mutually out red lines at location x, y in the visual field represents a static of synchrony with one another. Attention may serve, in part, binding of vertical, red, and location x, y. Variants of this to control the allocation of this finite dynamic binding approach have been proposed in which bindings are coded resource (see Hummel and Stankiewicz 1996). as patterns of activation distributed over sets of units (rather To the extent that a process exploits dynamic binding, it than the activity of a single unit; e.g., Smolensky 1990). will profit from the isomorphism between its representa- Although this approach to binding appears very different tions and the represented structures, but it will be demand- from the localist (one-unit-one-binding) approach, the two ing of processing resources (attention and working are equivalent in all important respects. In both cases, bind- memory); to the extent that it binds properties statically, it ing is carried in the units themselves, so different bindings will be free to operate in parallel with other processes (i.e., of the same properties are represented by separate units. In a demanding few resources), but the resulting representations static binding, the capacity to represent how elements are will not be isomorphic with the represented structures. bound together trades off against the capacity to represent These properties of static and dynamic binding have impor- the elements themselves (see Holyoak and Hummel forth- tant implications for human perception and cognition. For coming). 
In an extreme case, the units coding, say, red diag- example, these (and other) considerations led Hummel and onal lines may not overlap at all with those representing red Stankiewicz (1996) to predict that attended object images vertical lines. will visually prime both themselves and their left-right Dynamic binding represents conjunctions of properties reflections, whereas ignored images will prime themselves as bindings of units in the representation. That is, represen- but not their reflections. In brief, the reason is that dynamic tational units are tagged with regard to whether they are binding (of features into object parts and object parts to spa- bound together or not. For example, let red be represented tial relations) is necessary to generate a left-right invariant by unit R, vertical by V, and diagonal by D, and let us denote structural description from an object's image (Hummel and a binding with the tag “+.” A red diagonal would be repre- Biederman 1992), and attention is necessary for dynamic sented as R + D and a red vertical as R + V. Dynamic bind- binding (Treisman and Gelade 1980); attention should ing permits a given unit (here, R) to participate in multiple therefore be necessary for left-right invariant structural bindings, and as a result (unlike static binding), it permits a description. Stankiewicz, Hummel, and Cooper (1998) representation to be isomorphic with the structure it repre- tested this prediction and the results were exactly as pre- sents (see Holyoak and Hummel forthcoming). dicted. Apparently, the human visual system uses both static Dynamic binding permits greater representational flexi- and dynamic codes for binding in the representation of bility than static binding, but it also has a number of proper- object shape, and these separate codes manifest themselves, ties that limit its usefulness. 
First, it is not obvious how to among other ways, as differing patterns of priming for do dynamic binding in a neural (or connectionist) network. attended and ignored object images. Similar tradeoffs The most popular proposed binding tag is based on tempo- between the strengths of static and dynamic binding are also ral synchrony: if two units are bound, then they fire in syn- apparent in aspects of human memory and thinking (cf. chrony with one another; otherwise they fire out of Hummel and Holyoak 1997). synchrony (cf. Gray and Singer 1989; Hummel and Bieder- See also BINDING BY NEURAL SYNCHRONY; BINDING man 1992; Hummel and Holyoak 1997; Milner 1974; Shas- THEORY; CONNECTIONISM, PHILOSOPHICAL ISSUES tri and Ajjanagadde 1993; von der Malsburg 1981). — John Hummel Although controversial (see Tovee and Rolls 1992), there is 86 Binding Theory expression (she, herself), and a potential antecedent (Lucie References or Lili). Fodor, J. A., and Z. W. Pylyshyn. (1988). Connectionism and cog- (1) a. Lucie thought that Lili hurt her. nitive architecture: a critical analysis. In S. Pinker and J. b. Lucie thought that Lili hurt herself. Mehler, Eds., Connections and Symbols. Cambridge, MA: MIT Press, pp. 3–71 c. *Lucie thought that herself hurt Lili. Gray, C. M. and W. Singer. (1989). Stimulus specific neuronal The two anaphoric expressions have different anaphora oscillations in orientation columns of cat visual cortex. Pro- options: In (1a), only Lucie can be the antecedent; in (1b), ceedings of the National Academy of Sciences USA 86: 1698– only Lili; in (1c), neither can. This pattern is universal. 1702. Holyoak, K. J., and J. E. Hummel. (forthcoming). The proper treat- All languages have the two anaphoric types in (2), though ment of symbols in connectionism. In E. Dietrich and A. Mark- not all have both anaphors. English does not have an SE man, Eds., Cognitive Dynamics: Conceptual Change in anaphor; the Dravidian languages of India do not have a Humans and Machines. 
Cambridge, MA: MIT Press.
Hummel, J. E., and I. Biederman. (1992). Dynamic binding in a neural network for shape recognition. Psychological Review 99: 480–517.
Hummel, J. E., and K. J. Holyoak. (1997). Distributed representations of structure: a theory of analogical access and mapping. Psychological Review 104: 427–466.
Hummel, J. E., and B. J. Stankiewicz. (1996). An architecture for rapid, hierarchical structural description. In T. Inui and J. McClelland, Eds., Attention and Performance XVI: Information Integration in Perception and Communication. Cambridge, MA: MIT Press, pp. 93–121.
König, P., and A. K. Engel. (1995). Correlated firing in sensory-motor systems. Current Opinion in Neurobiology 5: 511–519.
Milner, P. M. (1974). A model for visual shape recognition. Psychological Review 81: 521–535.
Shastri, L., and V. Ajjanagadde. (1993). From simple associations to systematic reasoning: a connectionist representation of rules, variables and dynamic bindings. Behavioral and Brain Sciences 16: 417–494.
Smolensky, P. (1990). Tensor product variable binding and the representation of symbolic structures in connectionist systems. Artificial Intelligence 46: 159–216.
Stankiewicz, B. J., J. E. Hummel, and E. E. Cooper. (1998). The role of attention in priming for left–right reflections of object images: evidence for a dual representation of object shape. Journal of Experimental Psychology: Human Perception and Performance 24: 732–744.
Tovee, M. J., and E. T. Rolls. (1992). Oscillatory activity is not evident in the primate visual cortex with static stimuli. NeuroReport 3: 369–372.
Treisman, A., and G. Gelade. (1980). A feature integration theory of attention. Cognitive Psychology 12: 97–136.
Treisman, A. M., and H. Schmidt. (1982). Illusory conjunctions in the perception of objects. Cognitive Psychology 14: 107–141.
von der Malsburg, C. (1981). The correlation theory of brain function. Internal Report 81–2. Göttingen, Germany: Department of Neurobiology, Max-Planck-Institute for Biophysical Chemistry.

Further Readings

Gallistel, C. R. (1990). The Organization of Learning. Cambridge, MA: MIT Press.

Binding Theory

Binding theory is the branch of linguistic theory that explains the behavior of sentence-internal anaphora, which is labeled “bound anaphora” (see ANAPHORA). To illustrate the problem, the sentences in (1) each contain an anaphoric

SELF anaphor; Germanic and many other languages have both.

(2) Types of anaphoric expressions
Pronouns: (she, her)
Anaphors:
a. complex SELF anaphors (herself)
b. SE (Simplex Expression) anaphors (zich, in Dutch)

The core restrictions on binding are most commonly believed to be purely syntactic. It is assumed that bound anaphora is possible only when the antecedent C-Commands the anaphoric expression. (Node A C-Commands node B iff the first branching node dominating A also dominates B; Reinhart 1976.) In (1b), Lili C-Commands herself, but in the illicit (1c), it does not.

The central problem, however, is the different distribution of the two anaphoric types. It was discovered in the seventies (Chomsky 1973) that the two anaphora types correspond to the two types of syntactic movement described below.

(3) WH-movement: Who_i did you suggest that we invite t_i?
(4) NP-movement
a. Felix_i was invited t_i.
b. Felix_i seems [t_i happy].

NP-movement is much more local than WH-MOVEMENT. Chomsky's empirical generalization rests on observing the relations between the moved NP and the trace left in its original position: in the syntactic domain in which a moved NP can bind its trace, an NP can bind an anaphor, but it cannot bind a pronoun, as illustrated in (5) and (6). Where an anaphor cannot be bound, NP movement is excluded as well, as in (7).

(5) a. Felix_i was invited t_i
b. Felix_i invited himself_i
c. *Felix_i invited him_i
(6) a. Felix_i was heard [t_i singing]
b. Felix_i heard [himself_i sing]
c. Felix_i hoorde [zich_i zingen] (Dutch)
d. *Felix_i heard [him_i sing]
(7) a. *Lucie_i believes that we should elect herself_i
b. *Lucie_i is believed that we should elect t_i

In the early implementations of binding theory (Chomsky 1981), this was captured by defining NP-traces as anaphors. Thus, the restrictions on NP-movement were believed to follow from the binding conditions. Skipping the technical definition of a local domain, these are given in (8), where “bound” means coindexed with a C-Commanding NP.

(8) Binding conditions
Condition A: An anaphor must be bound in its local domain.
Condition B: A pronoun must be free in its local domain.

(5c) and (6d) violate condition B. (7a, b) and (1c) violate condition A. The others violate neither, hence are permitted.

Later developments in SYNTAX enabled a fuller understanding of what this generalization follows from. A crucial difference between WH-traces and NP-traces is that NP-traces cannot carry case. The conditions in (8) alone cannot explain why this should be so; what is required is an examination of the concept “argument.” An argument of some predicative head P is any constituent realizing a grammatical function of P (thematic role, case, or grammatical subject). However, arguments can be more complex objects than just a single NP. In the passive sentence (5a), there is, in fact, just one argument, with two links. Arguments, then, need to be defined as chains: roughly, an A(rgument)-chain is a sequence of (one or more) coindexed links satisfying C-Command, in a local domain (skipping, again, the definition of the local domain, which requires that there are no “barriers” between any of the links).

If A-chains count as just one syntactic argument, they cannot contain two fully independent links. Specifically, coindexation that forms an A-chain must satisfy (9).

(9) The A-chain condition:
An A-chain must contain exactly one link that carries structural case (at the head of the chain).

Condition (9) is clearly satisfied in (5a) and (6a), where the trace gets no case. Turning to anaphoric expressions, Reinhart and Reuland (1993) argue that while pronouns are fully Case-marked arguments, anaphors, like NP-traces, are Case-defective. Consequently, it turns out that the binding conditions in (8) are just entailments of (9) (Fox 1993; Reinhart and Reuland 1993). If a pronoun is bound in the local domain, as in (5c) and (6d), an A-chain is formed. But since the chain contains two Case-marked links, (9) rules this out, as did condition B of (8). In all the other examples in (5) and (6), the A-chains satisfy (9), because they are tailed by a caseless link (NP-trace or anaphor). If an anaphor is not bound in the local domain, it forms an A-chain of its own. For example, in (7a) Lucie and herself are two distinct A-chains (i.e., two arguments, rather than one). The second violates (9), because it does not contain even one case. Hence, (9) filters out the derivation, as did condition A of (8). Condition A, then, is just a reflex of the requirement that arguments carry case, while condition B is the requirement that they do not carry more than one case, both currently stated in (9).

Recall that only arguments are required to have case. So (9) does not prevent an anaphor from occurring unbound in a nonargument position. For example, the only difference between (7) and (10) is that the anaphor in (10) is embedded in an argument, but is not an argument itself.

(7) Lucie_i believes that we should elect herself_i.
(10) Lucie_i believes that we should elect Max and herself_i.

Anaphors that are not part of a chain are commonly labeled “logophoric,” and the question when they are preferred over pronouns is dependent on DISCOURSE (rather than syntax) conditions (Pollard and Sag 1992; Reinhart and Reuland 1993).

There is, however, an aspect of bound local anaphora that is not covered by (8) or (9). Regarding case, SE and SELF anaphors are alike. Nevertheless, while both can occur in (6c), repeated in (11), SE is excluded in (12), which does not follow from (9). The difference is that in (12) a reflexive predicate is formed, because the anaphor and Max are co-arguments. But in (11) the anaphor is the subject of the embedded predicate. The same contrast is found in many languages.

(11) Max_i hoorde [zich_i/zichzelf_i zingen] (Dutch)
(12) a. Max_i hoorde zich_i.
b. Max_i hoorde zichzelf_i. (Max heard himself.)

Reinhart and Reuland argue that, universally, the process of reflexivization requires morphological licensing. Thus, another principle is active here:

(13) Reflexivity Condition:
A reflexive predicate must be reflexive-marked.

A predicate can be reflexive-marked either on the argument, with a SELF anaphor, or on the predicate. (In the Dravidian language Kannada, the reflexive morpheme kol is used on the verb.) Because zich is not a reflexive-marker, (12a) violates (13).

See also INDEXICALS AND DEMONSTRATIVES; GENERATIVE GRAMMAR; QUANTIFIERS; SEMANTICS; SYNTAX–SEMANTICS INTERFACE

—Tanya Reinhart

References

Chomsky, N. (1973). Conditions on transformations. In S. Anderson and P. Kiparsky, Eds., A Festschrift for Morris Halle. New York: Holt, Rinehart and Winston, pp. 232–286.
Chomsky, N. (1981). Lectures on Government and Binding. Dordrecht: Foris.
Fox, D. (1993). Chain and binding—a modification of Reinhart and Reuland’s ‘Reflexivity.’ Ms., MIT, Cambridge, MA.
Pollard, C., and I. Sag. (1992). Anaphors in English and the scope of the binding theory. Linguistic Inquiry 23: 261–305.
Reinhart, T. (1976). The Syntactic Domain of Anaphora. Ph.D. diss., MIT, Cambridge, MA. Distributed by MIT Working Papers in Linguistics.
Reinhart, T., and E. Reuland. (1993). Reflexivity. Linguistic Inquiry 24: 657–720.

Further Readings

Barss, A. (1986). Chains and Anaphoric Dependence. Ph.D. diss., MIT, Cambridge, MA.
Chomsky, N. (1986). Barriers. Cambridge, MA: MIT Press.
Chomsky, N. (1986). Knowledge of Language: Its Nature, Origin, and Use. New York: Praeger.
Everaert, M. (1991). The Syntax of Reflexivization. Dordrecht: Foris.
Hellan, L. (1988). Anaphora in Norwegian and the Theory of Grammar. Dordrecht: Foris.
Jayaseelan, K. A. (1996). Anaphors as pronouns. Studia Linguistica 50: 207–255.
Koster, J., and E. Reuland, Eds. (1991). Long-Distance Anaphora. Cambridge: Cambridge University Press.
Lasnik, H. (1989). Essays on Anaphora. Dordrecht: Kluwer.
Manzini, R., and K. Wexler. (1987). Parameters, binding theory and learnability. Linguistic Inquiry 18: 413–444.
Sigurjonsdottir, S., and N. Hyams. (1992). Reflexivization and logophoricity: evidence from the acquisition of Icelandic. Language Acquisition 2 (4): 359–413.
Williams, E. (1987). Implicit arguments, the binding theory and control. Natural Language and Linguistic Theory 5: 151–180.

Blindsight

In 1905, the Swiss neurologist L. Bard demonstrated residual visual functions, in particular an ability to locate a source of light, in cortically blind patients. The phenomenon was termed blindsight by Weiskrantz and colleagues (1974) and has been extensively studied both in human patients and in monkeys with lesions of the primary visual cortex (V1, striate cortex). The cortical blindness that results from the visual cortex’s destruction or deafferentation is complete if the lesion destroys V1 in both hemispheres. The more common partial blindness (a field defect) always affects the contralesional visual hemifield. Its extent (“quadrantanopia,” “hemianopia”), position (“to the left”), and density (“relative,” “absolute”) are perimetrically assessed. Density refers to the degree to which conscious vision is lost: in a relative defect, conscious vision is reduced and qualitatively altered; often, only fast-moving, high-contrast stimuli are seen (Riddoch 1917). In an absolute defect, no conscious vision remains.

Cortical blindness differs from blindness caused by complete destruction of the eye, the retina, or the optic nerve: the latter lesions destroy the visual input into the brain, while destruction of the striate cortex spares the retinofugal pathways that do not project (exclusively or at all) to this structure. These pathways form the extra-geniculo-striate cortical visual system that survives the effects of the V1 lesion and the ensuing degeneration of the lateral geniculate nucleus and the partial degeneration of the retinal ganglion cell layer. Physiological recordings in monkeys and functional neuroimaging in patients have shown that this system, which includes extrastriate visual cortical areas, remains visually responsive following inactivation or destruction of V1 (Bullier, Girard, and Salin 1993; Stoerig et al. 1997).

The discovery of residual visual functions that were demonstrable in patients who consistently claimed not to see the stimuli they nevertheless responded to (Pöppel, Frost, and Held 1973; Richards 1973; Weiskrantz et al. 1974) was met with a surprise that bordered on disbelief. It seemed inconceivable that human vision could be blind, nonphenomenal, and not introspectable. At the same time, the remaining visual responsivity of extensive parts of the visual system renders remaining visual functions likely to the point where one wonders how to explain that the subjects are blind.

These puzzling residual functions that have increasingly attracted attention from philosophers (e.g., Nelkin 1996; see consciousness) include neuroendocrine and reflexive responses that can even be demonstrated in unconscious (comatose) patients. In contrast, the nonreflexive responses that are the hallmark of blindsight are only found in conscious patients with cortical visual field defects. They have been uncovered with two types of approach that circumvent the blindness the patients experience. The first approach requires the patient to respond to a stimulus presented in the normal visual field, for instance by pressing a response key or by describing the stimulus. In part of the trials, unknown to the patient, the blind field is additionally stimulated. If the additional stimulus significantly alters the reaction time to the seen stimulus (Marzi et al. 1986), or if it alters its appearance, for instance by inducing perceptual completion (Torjussen 1976), implicit processing of the unseen stimulus has been demonstrated. The second type of approach requires the patients to respond directly to stimulation of the blind field. Commonly, forced-choice guessing paradigms are used, and the patients are asked to guess where a stimulus has been presented, whether one has been presented, or which one of a small number of possible stimuli has been presented. Saccadic and manual localization, detection, and discrimination of stimuli differing in dimensions ranging from speed and direction of motion to contrast, size, flux, spatial frequency, orientation, disparity, and wavelength have been demonstrated in this fashion (see Stoerig and Cowey 1997 for review). Whether a patient’s performance is at chance level, moderately significant, or close to perfect depends on many variables. Among others they include (a) the stimulus properties: changes in on- and off-set characteristics, size, wavelength, adaptation level, and speed can all cause significant changes in performance (Barbur, Harlow, and Weiskrantz 1994); (b) the stimulus position: when the stimulus is stabilized using an eye-tracking device, at least in some patients stimuli are detectable at some positions and not at others (Fendrich, Wessinger, and Gazzaniga 1992); (c) the response: a spontaneous grasping response may yield better discriminability than a verbal one (Perenin and Rossetti 1996); (d) the training: performance in identical conditions may improve dramatically with practice (Stoerig and Cowey 1997); (e) the lesion: although a larger lesion does not simply imply less residual function (Sprague 1966), evidence from hemidecorticated patients indicates that at least the direct responses require extrastriate visual cortical mediation (King et al. 1996).

Monkeys with striate cortical ablation show very similar residual visual responses. In both humans and monkeys, compared to the corresponding retinal position in the normal hemifield, the residual sensitivity is reduced by 0.4–1.5 log units (Stoerig and Cowey 1997). It is important to note that detection based on straylight, determined with the stimulus positioned on the optic disc of normal observers or in the field defects of patients who are asked to respond by indicating whether they can notice light emanating from this area, requires stimulus intensities 2–3 log units above those needed in the normal field. Blindsight is thus considerably more sensitive and cannot be explained as an artifact of light scattered into the normal visual field (Stoerig and Cowey 1991). The relatively small loss in sensitivity that distinguishes blindsight from normal vision is remarkable in light of the patients’ professed experience of blindness. Interestingly, hemianopic monkeys, when given the chance to indicate “no stimulus” in a signal detection paradigm, responded to stimuli they detected perfectly in a localization task as if they could not see them (Cowey and Stoerig 1995). This indicates that it may not just be the patients who deny seeing the stimuli and claim that they are only guessing, but that both species have blindsight: nonreflexive visual functions in response to stimuli that are not consciously seen.

That the visual functions that remain in absolute cortical blindness are indeed blind is one of the most intriguing aspects of the phenomenon. Like other implicit processes that have been described in patients with amnesia, achromatopsia, or prosopagnosia, they may help us understand which neuronal processes and structures mediate implicit as opposed to consciously represented processes. As ipsilesional as well as contralesional extrastriate cortical responsivity to visual stimulation remains in patients and monkeys with blindsight, it appears insufficient to generate the latter (Bullier, Girard, and Salin 1993; Stoerig et al. 1997). This hypothesis gains further support from a recent functional magnetic resonance imaging study that compared within the same patient with a relative hemianopia the activation patterns elicited with a consciously perceived fast moving stimulus and a slow moving one that the patient could only detect in an unaware mode: in both modes, extrastriate visual cortical areas were activated (Sahraie et al. 1997). Further exploration along these lines may help pin down the neuronal substrate(s) of conscious vision, and studies of what can and cannot be done on the basis of blind vision alone can throw some light on the function as well as the nature of conscious representations.

See also IMPLICIT VS. EXPLICIT MEMORY; QUALIA; SENSATIONS

—Petra Stoerig

References

Barbur, J. L., A. J. Harlow, and L. Weiskrantz. (1994). Spatial and temporal response properties of residual vision in a case of hemianopia. Phil. Trans. R. Soc. Lond. B 343: 157–166.
Bard, L. (1905). De la persistance des sensations lumineuses dans le champ aveugle des hemianopsiques. La Semaine Medicale 22: 253–255.
Bullier, J., P. Girard, and P.-A. Salin. (1993). The role of area 17 in the transfer of information to extrastriate visual cortex. In A. Peters and K. S. Rockland, Eds., Cerebral Cortex. New York: Plenum Press, pp. 301–330.
Cowey, A., and P. Stoerig. (1995). Blindsight in Monkeys. Nature 373: 247–249.
Fendrich, R., C. M. Wessinger, and M. S. Gazzaniga. (1992). Residual vision in a scotoma. Implications for blindsight. Science 258: 1489–1491.
King, S. M., P. Azzopardi, A. Cowey, J. Oxbury, and S. Oxbury. (1996). The role of light scatter in the residual visual sensitivity of patients with complete cerebral hemispherectomy. Vis. Neurosci. 13: 1–13.
Marzi, C. A., G. Tassinari, S. Aglioti, and L. Lutzemberger. (1986). Spatial summation across the vertical meridian in hemianopics: a test of blindsight. Neuropsychologia 30: 783–795.
Nelkin, N. (1996). Consciousness and the Origins of Thought. Cambridge: Cambridge University Press.
Perenin, M. T., and Y. Rossetti. (1996). Grasping without form discrimination in a hemianopic field. NeuroReport 7: 793–797.
Pöppel, E., D. Frost, and R. Held. (1973). Residual visual function after brain wounds involving the central visual pathways in man. Nature 243: 295–296.
Richards, W. (1973). Visual processing in scotomata. Exp. Brain Res. 17: 333–347.
Riddoch, G. (1917). Dissociation of visual perceptions due to occipital injuries, with especial reference to appreciation of movement. Brain 40: 15–57.
Sahraie, A., L. Weiskrantz, J. L. Barbur, A. Simmons, S. C. R. Williams, and M. J. Brammer. (1997). Pattern of neuronal activity associated with conscious and unconscious processing of visual signals. Proc. Natl. Acad. Sci. USA 94: 9406–9411.
Sprague, J. M. (1966). Interaction of cortex and superior colliculus in mediation of visually guided behavior in the cat. Science 153: 1544–1547.
Stoerig, P., and A. Cowey. (1991). Increment-threshold spectral sensitivity in blindsight. Brain 114: 1487–1512.
Stoerig, P., and A. Cowey. (1997). Blindsight in man and monkey. Brain 120: 120–145.
Stoerig, P., R. Goebel, L. Muckli, H. Hacker, and W. Singer. (1997). On the functional neuroanatomy of blindsight. Soc. Neurosci. Abs. 27: 845.
Torjussen, T. (1976). Residual function in cortically blind hemifields. Scand. J. Psychol. 17: 320–322.
Weiskrantz, L., E. K. Warrington, M. D. Sanders, and J. Marshall. (1974). Visual capacity in the hemianopic field following a restricted cortical ablation. Brain 97: 709–728.

Further Readings

Barbur, J. L., K. H. Ruddock, and V. A. Waterfield. (1980). Human visual responses in the absence of the geniculo-calcarine projection. Brain 103: 905–928.
Barbur, J. L., J. D. Watson, R. S. J. Frackowiak, and S. Zeki. (1993). Conscious visual perception without V1. Brain 116: 1293–1302.
Blythe, I. M., C. Kennard, and K. H. Ruddock. (1987). Residual vision in patients with retrogeniculate lesions of the visual pathways. Brain 110: 887–905.
Corbetta, M., C. A. Marzi, G. Tassinari, and S. Aglioti. (1990). Effectiveness of different task paradigms in revealing blindsight. Brain 113: 603–616.
Cowey, A., P. Stoerig, and V. H. Perry. (1989). Transneuronal retrograde degeneration of retinal ganglion cells after damage to striate cortex in macaque monkeys: selective loss of P(β) cells. Neurosci. 29: 65–80.
Czeisler, C. A., T. L. Shanahan, E. B. Klerman, H. Martens, D. J. Brotman, J. S. Emens, T. Klein, and J. F. Rizzo III. (1995). Suppression of melatonin secretion in some blind patients by exposure to bright light. N. Engl. J. Med. 332: 6–11.
Dineen, J., A. Hendrickson, and E. G. Keating. (1982). Alterations of retinal inputs following striate cortex removal in adult monkey. Exp. Brain Res. 47: 446–456.
Hackley, S. A., and L. N. Johnson. (1996). Distinct early and late subcomponents of the photic blink reflex. I. Response characteristics in patients with retrogeniculate lesions. Psychophysiol. 33: 239–251.
Heywood, C. A., A. Cowey, and F. Newcombe. (1991). Chromatic discrimination in a cortically colour blind observer. Eur. J. Neurosci. 3: 802–812.
Holmes, G. (1918). Disturbances of vision by cerebral lesions. Brit. J. Ophthalmol. 2: 353–384.
Humphrey, N. K. (1974). Vision in a monkey without striate cortex: a case study. Perception 3: 241–255.
Humphrey, N. K. (1992). A History of the Mind. New York: Simon and Schuster.
Keane, J. R. (1979). Blinking to sudden illumination. A brain stem reflex present in neocortical death. Arch. Neurol. 36: 52–53.
Klüver, H. (1941). Visual functions after removal of the occipital lobes. J. Psychol. 11: 23–45.
Mohler, C. W., and R. H. Wurtz. (1977). Role of striate cortex and superior colliculus in the guidance of saccadic eye movements in monkeys. J. Neurophysiol. 40: 74–94.
Paillard, J., F. Michel, and G. Stelmach. (1983). Localization without content. A tactile analogue of ‘blind sight’. Arch. Neurol. 40: 548–551.
Pasik, P., and T. Pasik. (1982). Visual functions in monkeys after total removal of visual cerebral cortex. Contributions to Sensory Physiology 7: 147–200.
Perenin, M. T., and M. Jeannerod. (1978). Visual function within the hemianopic field following early cerebral hemidecortication in man. I. Spatial localization. Neuropsychologia 16: 1–13.
Pöppel, E. (1986). Long-range colour-generating interactions across the retina. Nature 320: 523–525.
Riddoch, G. (1917). Dissociation of visual perceptions due to occipital injuries, with especial reference to appreciation of movement. Brain 40: 15–57.
Rodman, H. R., C. G. Gross, and T. D. Albright. (1989). Afferent basis of visual response properties in area MT of the macaque. I. Effects of striate cortex removal. J. Neurosci. 9: 2033–2050.
Rodman, H. R., C. G. Gross, and T. D. Albright. (1990). Afferent basis of visual response properties in area MT of the macaque. II. Effects of superior colliculus removal. J. Neurosci. 10: 1154–1164.
Sanders, M. D., E. K. Warrington, J. Marshall, and L. Weiskrantz. (1974). “Blindsight”: vision in a field defect. Lancet 20 April, pp. 707–708.
Schacter, D. L. (1987). Implicit memory. History and current status. J. Exp. Psychol.: Learn. Memory Cogn. 13: 501–518.
Stoerig, P. (1996). Varieties of vision: from blind responses to conscious recognition. Trends Neurosci. 19: 401–406.
Stoerig, P., and A. Cowey. (1989). Wavelength sensitivity in blindsight. Nature 342: 916–918.
Stoerig, P., and A. Cowey. (1992). Wavelength discrimination in blindsight. Brain 115: 425–444.
Stoerig, P., M. Hübner, and E. Pöppel. (1985). Signal detection analysis of residual vision in a field defect due to a post-geniculate lesion. Neuropsychologia 23: 589–599.
Stoerig, P., J. Faubert, M. Ptito, V. Diaconu, and A. Ptito. (1996). No blindsight following hemidecortication in human subjects? NeuroReport 7: 1990–1994.
van Buren, J. M. (1963). Trans-synaptic retrograde degeneration in the visual system of primates. J. Neurol. Neurosurg. Psychiatry 34: 140–147.
Weiskrantz, L. (1986). Blindsight: A Case Study and Implications. Oxford: Oxford University Press.
Weiskrantz, L. (1990). Outlooks for blindsight: explicit methods for implicit processes. Proc. R. Soc. Lond. B 239: 247–278.
Weiskrantz, L., J. L. Barbur, and A. Sahraie. (1995). Parameters affecting conscious versus unconscious visual discrimination in a patient with damage to the visual cortex (V1). Proc. Natl. Acad. Sci. USA 92: 6122–6126.
Weller, R. E., and J. H. Kaas. (1989). Parameters affecting the loss of ganglion cells of the retina following ablations of striate cortex in primates. Vis. Neurosci. 3: 327–342.

Bloomfield, Leonard

Leonard Bloomfield (1887–1949) is, together with Edward Sapir, one of the two most prominent American linguists of the first half of the twentieth century. His book Language (Bloomfield 1933) was the standard introduction to linguistics for thirty years. Together with his students, particularly Bernard Bloch, Zellig Harris, and Charles Hockett, Bloomfield established the school of thought that has come to be known as American structural linguistics, which dominated the field until the rise of GENERATIVE GRAMMAR in the 1960s.

Throughout his career, Bloomfield was concerned with developing a general and comprehensive theory of language. His first formulation (Bloomfield 1914) embedded that theory within the conceptualist framework of Wilhelm Wundt. In the early 1920s, however, Bloomfield abandoned that in favor of a variety of BEHAVIORISM in which the theory of language took center stage: “The terminology in which at present we try to speak of human affairs—. . . ‘consciousness,’ ‘mind,’ ‘perception,’ ‘ideas,’ and so on— . . . will be discarded . . . and will be replaced . . . by terms in linguistics . . . . Non-linguists . . . constantly forget that a speaker is making noise, and credit him, instead, with the possession of impalpable ‘ideas.’ It remains for the linguist to show, in detail, that the speaker has no ‘ideas’ and that the noise is sufficient” (Bloomfield 1936: 322, 325; page numbers for Bloomfield’s articles refer to their reprintings in Hockett 1970).

In repudiating the existence of all mentalist constructs, Bloomfield also repudiated the classical view that the structure of language reflects the structure of thought. For Bloomfield, the structure of language was the central object of linguistic study, and hence of cognitive science, had that term been popular in his day.

Bloomfield maintained that all linguistic structure could be determined by the application of analytic procedures starting with the smallest units that combine sound (or “vocal features”) and meaning (or “stimulus-reaction features”), called morphemes (Bloomfield 1926: 130). Having shown how to identify morphemes, Bloomfield went on to show how to identify both smaller units (i.e., phonemes, defined as minimum units of “distinctive” vocal features) and larger ones (words, phrases, and sentences).

Bloomfield developed rich theories of both MORPHOLOGY and SYNTAX, much of which was carried over more or less intact into generative grammar. In morphology, Bloomfield paid careful attention to phonological alternations of various sorts, which led to the development of the modern theory of morphophonemics (see especially Bloomfield 1939). In syntax, he laid the foundations of the theory of constituent structure, including the rudiments of X-BAR THEORY (Bloomfield 1933: 194–195). Bloomfield generated so much enthusiasm for syntactic analysis that his students felt that they were doing syntax for the first time in the history of linguistics (Hockett 1968: 31).

Bloomfield did not develop his theory of semantics to the same extent as he did his theories of PHONOLOGY, morphology, and syntax, contenting himself primarily with naming the semantic contributions of various types of linguistic units. For example, he called the semantic properties of morphemes “sememes,” those of grammatical forms “episememes,” etc. (Bloomfield 1933: 162, 166). Bloomfield contended that whereas the phonological properties of morphemes are analyzable into parts (namely phonemes), sememes are unanalyzable: “There is nothing in the structure of morphemes like wolf, fox, and dog to tell us the relation between their meanings; this is a problem for the zoölogist” (1933: 162). Toward the end of the heyday of American structural linguistics, however, this view was repudiated (Goodenough 1956; Lounsbury 1956), and the claim that there are submorphemic units of meaning was incorporated into early theories of generative grammar (Katz and Fodor 1963).

Bloomfield knew that for a behaviorist theory of meaning such as his own to be successful, it would have to account for the semantic properties of nonreferential linguistic forms such as the English words not and and. Bloomfield was aware of the difficulty of this task. His attempt at defining the word not is particularly revealing. After initially defining it as “the linguistic inhibitor [emphasis his] in our speech-community,” he went on to write: “The utterance, in a phrase, of the word not produces a phrase such that simultaneous parallel response to both this phrase and the parallel phrase without not cannot be made” (Bloomfield 1935: 312). In short, what Bloomfield is attempting to do here is to reduce the logical law of contradiction to a statement about possible stimulus-response pairs.

However, such a reduction is not possible. No semantic theory that contains the law of contradiction as one of its principles is expressible in behaviorist terms. Ultimately, American structural linguistics failed not for its inadequacies in phonology, morphology, and syntax, but because behaviorism does not provide an adequate basis for the development of a semantic theory for natural languages.

See also DISTINCTIVE FEATURES; FUNCTIONAL ROLE SEMANTICS; MEANING; SAUSSURE

—D. Terence Langendoen

References

Bloomfield, L. (1914). An Introduction to the Study of Language. New York: Henry Holt.
Bloomfield, L. (1926). A set of postulates for the science of language. Language 2: 153–164. Reprinted in Hockett 1970, pp. 128–138.
Bloomfield, L. (1933). Language. New York: Henry Holt.
Bloomfield, L. (1935). Linguistic aspects of science. Philosophy of Science 2: 499–517. Reprinted in Hockett 1970, pp. 307–321.
Bloomfield, L. (1936). Language or ideas? Language 12: 89–95. Reprinted in Hockett 1970, pp. 322–328.
Bloomfield, L. (1939). Menomini morphophonemics. Travaux du Cercle Linguistique de Prague 8: 105–115. Reprinted in Hockett 1970, pp. 351–362.
Goodenough, W. (1956). Componential analysis and the study of meaning. Language 32: 195–216.
Hockett, C. F. (1968). The State of the Art. The Hague: Mouton.
Hockett, C. F., Ed. (1970). A Leonard Bloomfield Anthology. Bloomington: Indiana University Press.
Katz, J. J., and J. A. Fodor. (1963). The structure of a semantic theory. Language 39: 170–210.
Lounsbury, F. (1956). A semantic analysis of Pawnee kinship usage. Language 32: 158–194.

Further Readings

Harris, Z. S. (1951). Methods in Structural Linguistics. Chicago: University of Chicago Press.
Joos, M., Ed. (1967). Readings in Linguistics I. Chicago: University of Chicago Press.
Matthews, P. H. (1993). Grammatical Theory in the United States from Bloomfield to Chomsky. Cambridge: Cambridge University Press.

Boas, Franz

Franz Boas (1858–1942) was the single most influential anthropologist in North America in the twentieth century. He immigrated to the United States from Germany in the 1880s, taught briefly at Clark University, then in 1896 took a position at Columbia University, where he remained for the rest of his career. He was trained originally in physics and geography, but by the time he came to this country his interests had already turned to anthropology.

He was a controversial figure almost from the start, in part because of his debates with the cultural evolutionists about the course of human history (see CULTURAL EVOLUTION). According to the evolutionists, the pattern of history is one of progress, whereby societies develop through stages of savagery, barbarism, and eventually civilization. In this view, progress is guided by human reason, and societies differ because some have achieved higher degrees of rationality and therefore have produced more perfect institutions than others. According to Boas, however, the dominant process of change is culture borrowing or diffusion. All societies invent only a small fraction of their cultural inventory, for they acquire most of their cultural material from other peoples nearby. The process of diffusion is a result not of reason but of historical accident, and each culture is a unique amalgamation of traits and has a unique historical past.

Boas’s concept of culture changed radically in the context of these ideas about history, for he came to view culture as a body of patterns that people learn through interactions with the members of their society. People adhere to such patterns as hunting practices and marriage rules not because they recognize that these help to improve their lives, as the evolutionists thought, but because the members of society absorb the cultural forms of their social milieu. By this view, these historically variable patterns largely govern human behavior and thus are the most important component of the human character. Furthermore, most of culture is emotionally grounded and beyond the level of conscious awareness. Whereas the evolutionists assumed that people are consciously oriented by patterns of rationality and that reason itself is universal and not local (although different societies exhibit different degrees of it), from Boas’s perspective people are oriented by a body of cultural patterns of which they are largely unaware. These include such features as linguistic rules, values, and assumptions about reality (see LANGUAGE AND CULTURE). These patterns are emotionally grounded in that people become attached to the ways of life they have learned and adhere to them regardless of rational or practical considerations.

Boas’s thinking also had significant implications for the concept of race. People behave the way they do not because of differences in racial intelligence, but because of the cultural patterns they have learned through enculturation. Boas was an outspoken proponent of racial equality, and publication of his book The Mind of Primitive Man in 1911 was a major event in the development of modern racial thought. Furthermore, Boas’s culture concept had important relativistic implications (see CULTURAL RELATIVISM). He proposed that values are historically conditioned, in the same way as pottery styles and marriage patterns, and consequently the standards that a person uses in judging other societies reflect the perspective that he or she has learned. Boas and his students developed a strong skepticism toward cross-cultural value judgments.

Boas’s work was epistemologically innovative, and he elaborated an important version of cognitive relativism (see RATIONALISM VS. EMPIRICISM). In his view, human beings experience the world through such forms as linguistic patterns and cultural beliefs, and like all other aspects of culture these are influenced by the vicissitudes of history. Consequently, people experience the world differently according to the cultures in which they are raised. For example, the linguistic rules that a person learns have the capacity to lead that individual to mis-hear speech sounds that he or she is not accustomed to hearing, while the same person has no difficulty hearing minute differences between other speech sounds that are part of his or her native tongue. Thus this segment of experience is comprehended through a complex of unconscious linguistic forms, and speakers of different languages hear these sounds differently.

Yet in important respects Boas was not a relativist. For instance, while he argued that the speakers of different languages hear the same speech sounds differently, he also assumed that the trained linguist may discover this happening, for, with effort, it is possible to learn to hear sounds as they truly are. In a sense, the linguist is able to experience speech sounds outside of his or her own linguistic framework, and to avoid the cognitive distortions produced by culture. Boas held similar views about science.

Boas, F., Ed. (1938). General Anthropology. Boston: Heath.
Codere, H., Ed. (1966). Kwakiutl Ethnography. Chicago: University of Chicago Press.
Goldschmidt, W. (1959). The Anthropology of Franz Boas: Essays on the Centennial of His Birth. Menasha, WI: American Anthropological Association.
Hatch, E. (1973). Theories of Man and Culture. New York: Columbia University Press.
Holder, P. (1911, 1966). Introduction. In F. Boas, Handbook of American Indian Languages. Lincoln: University of Nebraska Press.
Jacknis, I. (1985). Franz Boas and exhibits: on the limitations of the museum method of anthropology. In G. W. Stocking, Jr., Ed., Objects and Others: Essays on Museums and Material Culture. Madison: University of Wisconsin Press.
Lowie, R. H. (1917, 1966). Culture and Ethnology. New York: Basic Books.
Rohner, R. P., and E. C. Rohner. (1969). The Ethnography of Franz Boas. Chicago: University of Chicago Press.
Stocking, G. W., Jr. (1968). Race, Culture, and Evolution: Essays in the History of Anthropology. New York: The Free Press.
Stocking, G. W., Jr. (1974). The Shaping of American Anthropology: A Franz Boas Reader. New York: Basic Books.
Stocking, G. W., Jr. (1992). The Ethnographer's Magic and Other Essays in the History of Anthropology. Madison: University of Wisconsin Press.
Stocking, G. W., Jr. (1996). Volksgeist as Method and Ethic: Essays on Boasian Anthropology and the German Anthropological Tradition. Madison: University of Wisconsin Press.

Boltzmann Machines

See RECURRENT NETWORKS

Bounded Rationality

Bounded rationality is rationality as exhibited by decision makers of limited abilities. The ideal of RATIONAL DECISION MAKING formalized in RATIONAL CHOICE THEORY, UTILITY THEORY, and the FOUNDATIONS OF PROBABILITY requires choosing so as to maximize a measure of expected
While real- utility that reflects a complete and consistent preference ity is experienced through cultural beliefs, it is possible to order and probability measure over all possible contingen- move outside of those beliefs into a sphere of objective neu- cies. This requirement appears too strong to permit accurate trality, or a space that is culture-free, in doing scientific description of the behavior of realistic individual agents research. Thus Boas’s anthropological theory contained a studied in economics, psychology, and artificial intelli- version of cognitive relativism at one level but rejected it at gence. Because rationality notions pervade approaches to another. Relativism applies when human beings think and so many other issues, finding more accurate theories of perceive in terms of their learned, cultural frameworks, but bounded rationality constitutes a central problem of these it is possible for cognitive processes to operate outside of fields. Prospects appear poor for finding a single “right” those frameworks as well. theory of bounded rationality due to the many different See also CULTURAL VARIATION; HUMAN UNIVERSALS; ways of weakening the ideal requirements, some formal SAPIR impossibility and tradeoff theorems, and the rich variety of —Elvin Hatch psychological types observable in people, each with differ- ent strengths and limitations in reasoning abilities. Russell References and Norvig’s 1995 textbook provides a comprehensive sur- vey of the roles of rationality and bounded rationality Boas, F. (1911). The Mind of Primitive Man. New York: Mac- notions in artificial intelligence. Cherniak 1986 provides a millan. philosophical introduction to the subject. Simon 1982 dis- Boas, F. (1928). Anthropology and Modern Life. New York: Norton. cusses numerous topics in economics; see Conlisk 1996 for Boas, F. (1940). Race, Language and Culture. New York: Mac- a broad economic survey. millan. 
Studies in ECONOMICS AND COGNITIVE SCIENCE and of human DECISION MAKING document cases in which everyday and expert decision makers do not live up to the rational ideal (Kahneman, Slovic, and TVERSKY 1982; Machina 1987). The ideal maximization of expected utility implies a comprehensiveness at odds with observed failures to consider alternatives outside those suggested by the current situation. The ideal probability and utility distributions imply a degree of LOGICAL OMNISCIENCE that conflicts with observed inconsistencies in beliefs and valuations and with the frequent need to invent rationalizations and preferences to cover formerly unconceived circumstances. The theory of BAYESIAN LEARNING or conditionalization, commonly taken as the theory of belief change or learning appropriate to rational agents, conflicts with observed difficulties in assimilating new information, especially the resistance to changing cognitive habits.

Reconciling the ideal theory with views of decision makers as performing computations also poses problems. Conducting the required optimizations at human rates using standard computational mechanisms, or indeed any physical system, seems impossible to some. The seemingly enormous information content of the required probability and utility distributions may make computational representations infeasible, even using BAYESIAN NETWORKS or other relatively efficient representations.

The search for realistic theories of rational behavior began by relaxing optimality requirements. Simon (1955) formulated the theory of “satisficing,” in which decision makers seek only to find alternatives that are satisfactory in the sense of meeting some threshold or “aspiration level” of utility. A more general exploration of the idea of meeting specific conditions rather than unbounded optimizations also stimulated work on PROBLEM SOLVING, which replaces expected utility maximization with acting to satisfy sets of goals, each of which may be achieved or not. Simon (1976) also emphasized the distinction between “substantive” and “procedural” rationality, concerning, respectively, rationality of the result and of the process by which the result was obtained, setting procedural rationality as a more feasible aim than substantive rationality. Good (1952, 1971) urged a related distinction in which “Type 1” rationality consists of the ordinary ideal notion, and “Type 2” rationality consists of making ideal decisions taking into account the cost of deliberation. The Simon and Good distinctions informed work in artificial intelligence on control of reasoning (Dean 1991), including explicit deliberation about the conduct of reasoning (Doyle 1980), economic decisions about reasoning (Horvitz 1987, Russell 1991), and iterative approximation schemes or “anytime algorithms” (Horvitz 1987, Dean and Boddy 1988) in which optimization attempts are repeated with increasing amounts of time, so as to provide an informed estimate of the optimal choice no matter when deliberation is terminated. Although reasoning about the course of reasoning may appear problematic, it can be organized to avoid crippling circularities (see METAREASONING), and admits theoretical reductions to nonreflective reasoning (Lipman 1991). One may also relax optimality by adjusting the scope of optimization as well as the process. Savage (1972) observed the practical need to formulate decisions in terms of “small worlds” abstracting the key elements, thus removing the most detailed alternatives from optimizations. The related “selective rationality” of Leibenstein (1980) and “bounded optimality” of Horvitz (1987) and Russell and Subramanian (1995) treat limitations stemming from optimization over circumscribed sets of alternatives.

Lessening informational requirements constitutes one important form of procedural rationality. Goal-directed problem solving and small world formulations do this directly by basing actions on highly incomplete preferences and probabilities. The extreme incompleteness of information represented by these approaches can prevent effective action, however, thus requiring means for filling in critical gaps in reasonable ways, including various JUDGMENT HEURISTICS based on representativeness or other factors (Kahneman, Slovic, and TVERSKY 1982). Assessing the expected value of information forms one general approach to filling these gaps. In this approach, one estimates the change in utility of the decision that would stem from filling specific information gaps, and then acts to fill the gaps offering the largest expected gains. These assessments may be made of policies as well as of specific actions. Applied to policies about how to reason, such assessments form a basis for the nonmonotonic or default reasoning methods appearing in virtually all practical inference systems (formalized as various NONMONOTONIC LOGICS and theories of belief revision) that fill routine gaps in rational and plausible ways. Even when expected deliberative utility motivates use of a nonmonotonic rule for adopting or abandoning assumptions, such rules typically do not involve probabilistic or preferential information directly, though they admit natural interpretations as either statements of extremely high probability (infinitesimally close to 1), in effect licensing reasoning about magnitudes of probabilities without requiring quantitative comparisons, or as expressions of preferences over beliefs and other mental states of the agent, in effect treating reasoning as seeking mental states that are Pareto optimal with respect to the rules (Doyle 1994). Nonmonotonic reasoning methods also augment BAYESIAN LEARNING (conditionalization) with direct changes of mind that suggest “conservative” approaches to reasoning that work through incremental adaptation to small changes, an approach seemingly more suited to exhibiting procedural rationality than the full and direct incorporation of new information called for by standard conditionalization.

Formal analogs of Arrow’s impossibility theorem for social choice problems and multiattribute UTILITY THEORY limit the procedural rationality of approaches based on piecemeal representations of probability and preference information (Doyle and Wellman 1991). As such representations dominate practicable approaches, one expects any automatic method for handling inconsistencies amidst the probability and preference information to misbehave in some situations.

See also GAME THEORY; HEURISTIC SEARCH; LOGIC; RATIONAL AGENCY; STATISTICAL LEARNING THEORY; UNCERTAINTY

—Jon Doyle

References

Cherniak, C. (1986). Minimal Rationality. Cambridge, MA: MIT Press.
Conlisk, J. (1996). Why bounded rationality? Journal of Economic Literature 34: 669–700.
Dean, T. (1991). Decision-theoretic control of inference for time-critical applications. International Journal of Intelligent Systems 6 (4): 417–441.
Dean, T., and M. Boddy. (1988). An analysis of time-dependent planning. Proceedings of the Seventh National Conference on Artificial Intelligence, pp. 49–54.
Doyle, J. (1980). A model for deliberation, action, and introspection. Technical Report AI-TR 58. Cambridge, MA: MIT Artificial Intelligence Laboratory.
Doyle, J. (1994). Reasoned assumptions and rational psychology. Fundamenta Informaticae 20 (1–3): 35–73.
Doyle, J., and M. P. Wellman. (1991). Impediments to universal preference-based default theories. Artificial Intelligence 49 (1–3): 97–128.
Good, I. J. (1952). Rational decisions. Journal of the Royal Statistical Society B, 14: 107–114.
Good, I. J. (1971). The probabilistic explication of information, evidence, surprise, causality, explanation, and utility. In V. P. Godambe and D. A. Sprott, Eds., Foundations of Statistical Inference. Toronto: Holt, Rinehart, and Winston, pp. 108–127.
Horvitz, E. J. (1987). Reasoning about beliefs and actions under computational resource constraints. Proceedings of the Third AAAI Workshop on Uncertainty in Artificial Intelligence, pp. 429–444.
Kahneman, D., P. Slovic, and A. Tversky, Eds. (1982). Judgment under Uncertainty: Heuristics and Biases. Cambridge: Cambridge University Press.
Leibenstein, H. (1980). Beyond Economic Man: A New Foundation for Microeconomics. 2nd ed. Cambridge, MA: Harvard University Press.
Lipman, B. L. (1991). How to decide how to decide how to. . . : modeling limited rationality. Econometrica 59 (4): 1105–1125.
Machina, M. J. (1987). Choice under uncertainty: problems solved and unsolved. Journal of Economic Perspectives 1 (1): 121–154.
Russell, S. J. (1991). Do the Right Thing: Studies in Limited Rationality. Cambridge, MA: MIT Press.
Russell, S. J., and P. Norvig. (1995). Artificial Intelligence: A Modern Approach. Englewood Cliffs, NJ: Prentice-Hall.
Russell, S. J., and D. Subramanian. (1995). Provably bounded-optimal agents. Journal of Artificial Intelligence Research 2: 575–609.
Savage, L. J. (1972). The Foundations of Statistics. Second edition. New York: Dover Publications.
Simon, H. A. (1955). A behavioral model of rational choice. Quarterly Journal of Economics 69: 99–118.
Simon, H. A. (1976). From substantive to procedural rationality. In S. J. Latsis, Ed., Method and Appraisal in Economics. Cambridge: Cambridge University Press, pp. 129–148.
Simon, H. A. (1982). Models of Bounded Rationality: Behavioral Economics and Business Organization, vol. 2. Cambridge, MA: MIT Press.

Brain Mapping

See INTRODUCTION: NEUROSCIENCES; COMPUTATIONAL NEUROANATOMY; MAGNETIC RESONANCE IMAGING; POSITRON EMISSION TOMOGRAPHY

Brentano, Franz

Franz Brentano (1838–1917), German philosopher and psychologist, taught in the University of Vienna from 1874 to 1894. He is the author of Psychology from an Empirical Standpoint (first published in 1874), and is principally remembered for his formulation of the so-called Brentano thesis or doctrine of intentionality, according to which what is characteristic of mental phenomena is their INTENTIONALITY or the “mental inexistence of an object.”

For Brentano, intentionality is to be understood in psychological (or in what might today be called methodologically solipsistic) terms. To say that a mental act is “directed toward an object” is to make an assertion about the interior structure or representational content of the act. Brentano’s primary aim is to provide a taxonomy of the different kinds of basic constituents of mental life and of the different kinds of relations between them. Unlike more recent cognitive psychologists, Brentano takes as his main instrument in analyzing these basic constituents and relations not logic but a sophisticated ontological theory of part and whole, or “mereology.” Where standard mereology is extensional, however, treating parts and wholes by analogy with Venn diagrams, Brentano’s mereology is enriched by topological elements (Brentano 1987) and by a theory of the different sorts of dependence relations connecting parts together into unitary wholes of different sorts. A theory of “mereotopology” along these lines was first formalized by Husserl in 1901 in the third of his Logical Investigations (1970), and its application by Husserl to the categories of language led to the development of CATEGORIAL GRAMMAR in the work of Leśniewski and in Ajdukiewicz (1967).

The overarching context of all Brentano’s writings is the psychology and ontology of Aristotle. Aristotle conceived perception and thought as processes whereby the mind abstracts sensory and intelligible forms from external substances. Impressed by the successes of corpuscularism in physics, Brentano had grown sceptical of the existence of any external substances corresponding to our everyday cognitive contents. He thus suspended belief in external substances but retained the Aristotelian view of cognition as a process of combining and separating forms within the mind. It is in these terms that we are to understand his view that “[e]very mental phenomenon includes something as object within itself” (1973).

Brentano distinguishes three sorts of ways in which a subject may be conscious of an object:

1. In presentation. Here the subject is conscious of the object or object-form, and has it before his mind, without taking up any position with regard to it, whether in sensory experience or via concepts.
2. In judgment. Here there is added to presentation one of two diametrically opposed modes of relating cognitively to the object: modes of acceptance and rejection or of belief and disbelief. Perception, for Brentano, is a combination of sensory presentation and positive judgment.
3. In phenomena of interest. Here there is added to presentation one of two diametrically opposed modes of relating conatively to the object: modes of positive and negative interest or of “love” and “hate.” Judgment and interest are analogous in that there is a notion of correctness applying to each: the correctness of a judgment (its truth) serves as the objective basis of logic, the correctness of love and hate as the objective basis of ethics.

Brentano’s theory of part and whole is presented in his Descriptive Psychology (1995). Many of the parts of consciousness are “separable” in the sense that one part can continue to exist even though another part has ceased to exist. Such separability may be either reciprocal—as in a case of simultaneous seeing and hearing—or one-sided—as in the relation of presentation and judgment, or of presentation and desire: a judgment or desire cannot as a matter of necessity exist without some underlying presentation of the object desired or believed to exist.

The relation of one-sided separability imposes upon consciousness a hierarchical order, with ultimate or fundamental acts, acts having no further separable parts, constituting the ground floor. Such basic elements are for Brentano always acts of sensation. Even among basic acts, however, we can still in a certain sense speak of further parts. Thus in a sensation of a blue patch we can distinguish a color determination and a spatial determination as “distinctional parts” that mutually pervade each other. Another sort of distinctional part is illustrated by considering what the sensation of a blue patch and the sensation of a yellow patch share in common: they share, Brentano holds, the form of coloredness as a logical part. Brentano’s account of the range of different sorts of distinctional parts of cognitive phenomena, and especially of the tree structure hierarchies manifested by different families of logical parts, covers some of the ground surveyed by later studies of “ontological knowledge” (Keil 1979).

Brentano’s students included not only Sigmund Freud and T. G. Masaryk, but also Edmund Husserl, Alexius Meinong, Christian von Ehrenfels, and Carl Stumpf. Each went on to establish schools of importance for the development of different branches of cognitive science within this century (Smith 1994). Husserl’s disciples founded the so-called phenomenological movement; Meinong founded the Graz School of “Gegenstandstheorie” (ontology without existence assumptions); and it was students of Ehrenfels and Stumpf in Prague and Berlin who founded the school of GESTALT PSYCHOLOGY (the term “Gestalt” having been first used by Ehrenfels as a technical term of psychology in 1890; see Ehrenfels 1988; MacNamara and Boudewijnse 1995). The representatives of each of these schools and movements attempted to transform Brentano’s psychological doctrine of intentionality into an ontological theory of how cognizing subjects are directed toward objects in the world. The influence of Brentano’s philosophical ideas is alive today in the work of analytic philosophers such as Roderick Chisholm (1982), and it has been especially prominent in twentieth-century Polish logic and philosophy (Woleński and Simons 1989).

See also CONSCIOUSNESS; JAMES, WILLIAM; MENTAL REPRESENTATION

—Barry Smith

References

Ajdukiewicz, K. (1967). Syntactic connexion. In S. McCall, Ed., Polish Logic 1920–1939. Oxford: Clarendon Press, pp. 207–231.
Brentano, F. (1973). Psychology from an Empirical Standpoint. London: Routledge and Kegan Paul.
Brentano, F. (1987). Philosophical Investigations on Space, Time and the Continuum. London: Croom Helm.
Brentano, F. (1995). Descriptive Psychology. London: Routledge.
Chisholm, R. M. (1982). Brentano and Meinong Studies. Amsterdam: Rodopi.
Ehrenfels, C. von (1988). On “Gestalt Qualities.” In B. Smith, Ed., Foundations of Gestalt Theory. Munich and Vienna: Philosophia, pp. 82–117.
Husserl, E. (1970). Logical Investigations. London: Routledge and Kegan Paul.
Keil, F. (1979). Semantic and Conceptual Development. An Ontological Perspective. Cambridge, MA: Harvard University Press.
MacNamara, J., and G.-J. Boudewijnse. (1995). Brentano’s influence on Ehrenfels’s theory of perceptual gestalts. Journal for the Theory of Social Behaviour 25: 401–418.
Smith, B. (1994). Austrian Philosophy: The Legacy of Franz Brentano. Chicago: Open Court.
Woleński, J., and P. M. Simons. (1989). De Veritate: Austro-Polish contributions to the theory of truth from Brentano to Tarski. In K. Szaniawski, Ed., The Vienna Circle and the Lvov-Warsaw School. Dordrecht: Kluwer, pp. 391–442.

Broadbent, Donald E.

After years of behaviorist denial of mental terms, DONALD HEBB (1949) admonished: “We all know that attention and set exist, so we had better get the skeleton out of the closet and see what can be done with it.” Donald E. Broadbent (1926–1993), more than anyone, deserves the credit for shedding scientific light on this “skeleton.” Through his own empirical contributions and his careful analyses of the findings of others, Broadbent demonstrated that experimental psychology could reveal the nature of cognitive processes. In his hands, an information processing approach to understanding ATTENTION, perception, MEMORY, and performance was exceptionally illuminating, and helped initiate and fuel the paradigm shift known as the “cognitive revolution.”

Broadbent joined the Royal Air Force in 1944. Noting equipment poorly matched to the human pilot, the importance of practice, and the possibility of measuring individual differences, he envisaged a career in psychology. Under the leadership of Sir Frederic BARTLETT, the Psychology Department at Cambridge University was engaged in solving precisely the kind of real world problems that excited Broadbent. Upon graduation in 1949, Broadbent went straight into research as a staff member at the Medical Research Council’s Applied Psychology Unit (APU) in Cambridge.

Broadbent’s early research and thinking were strongly influenced by the APU’s first two directors, Bartlett and Kenneth Craik. Bartlett (1932) emphasized that people were active, constructivist processors; Craik (1943) pioneered the use of engineering concepts (cybernetics) to explain human performance. At the time, communication theory (Shannon 1948), with its information metric and concept of a communication channel with a limited capacity for information transmission, was being applied to psychological phenomena. Broadbent induced the key principles of his “filter” theory from three basic findings on selective listening (e.g., Cherry 1953): (1) People are strictly limited in their ability to deal with multiple messages (sources of sensory information); (2) the ability to focus on one message while ignoring another, irrelevant one is greatly improved if the messages differ in a simple physical property such as location or pitch; and, (3) the consequence of focusing on one message is that the content of the ignored message is unreportable (though simple physical properties can be picked up).

These and other findings, and the theoretical framework Broadbent induced from them, were described in his first major work, Perception and Communication (1958). The information processing architecture of Broadbent’s famous filter theory (fig. 1) quickly became the most influential model of human cognitive activity cast in information processing terms. Thirteen years later, when Broadbent published his second major work, Decision and Stress (1971), it was shown how the 1958 model needed modification: new mechanisms for selection (pigeon-holing and categorizing) that operated later in the processing sequence were added to filtering and an emphasis on the statistical nature of evidence accumulation and decision making was incorporated.

Figure 1. Broadbent’s (1958) filter theory asserts that there exists a limited capacity stage of perception (P-system), that this stage is preceded by parallel analysis of simple stimulus features, and that access to the P-system is controlled by a selective filter. Short-term and long-term (store of conditional probabilities of past events) memory systems were postulated and integrated into the information processing system. (This figure is modeled on Broadbent 1958: 297.)

In light of their growing popularity it might be suggested that artificial NEURAL NETWORK models will replace the information processing approach. While acknowledging the value of models cashed out in neural network terms, Broadbent (1985) points out (cf. MARR 1982) that such models are at a different level of analysis (implementation) than information processing models (computation); the appropriate level depends on the nature of the problem to be solved. Considering the problem of designing human-machine systems for data-rich environments, Moray suggests an enduring role for models like Broadbent’s: “Whatever the deep structure of attention may be, its surface performance is, in the vast majority of cases, well described by a single, limited capacity channel, which is switched discretely among the various inputs” (1993: 113).

Donald Broadbent was attracted to psychology because he believed that the application of psychological principles could benefit people. It is fitting, then, that in 1958 he was selected to direct the APU, which he did for sixteen years. During this time the APU—already widely respected—would become one of the world’s preeminent facilities for pure and applied psychological research. After this period of steadfast administrative service, Broadbent—who never held an academic appointment—stayed on the Medical Research Council’s scientific staff while moving to Oxford. Although Broadbent is, and will continue to be, best known for his empirical and theoretical work on attention, his endorsement of applied psychology never waned. Thus, in contrast to an emphasis on the cognitive “hardware” implicit in his filter theory, Broadbent’s belief in the importance of cognitive “software” (task variables and individual differences in strategy selection) led him to predict: “In the long run, psychology will, like computer science, become an ever-expanding exploration of the merits and disadvantages of alternative cognitive strategies” (1980: 69).

Broadbent (1980) was genuinely ambivalent about what he called “academic psychology,” and although he recognized the importance of theory, unlike many of his contemporaries, he rejected the hypothetico-deductive approach as inefficient, advocating instead experiments whose results could discriminate between classes of theory (1958: 306) and generate a solid, empirical foundation (Broadbent 1973). In light of these attitudes, it is somewhat ironic that Broadbent would have such a great impact on academic psychology (wherein his theoretical language helped foster the cognitive revolution) and that his theory of attention would become the benchmark against which all subsequent theories are compared.

See also ATTENTION AND THE HUMAN BRAIN; BEHAVIORISM; INFORMATION THEORY; SIGNAL DETECTION THEORY

—Raymond M. Klein

References

Bartlett, F. C. (1932). Remembering. Cambridge: Cambridge University Press.
Broadbent, D. E. (1958). Perception and Communication. Oxford: Pergamon.
Broadbent, D. E. (1971). Decision and Stress. London: Academic Press.
Broadbent, D. E. (1973). In Defense of Empirical Psychology. London: Methuen.
Broadbent, D. E. (1980). Donald E. Broadbent. In G. Lindzey, Ed., A History of Psychology in Autobiography, vol. 7. San Francisco: W. H. Freeman, pp. 39–73.
Broadbent, D. E. (1985). A question of levels: comment on McClelland and Rumelhart. Journal of Experimental Psychology: General 114: 189–192.
Cherry, E. C. (1953). Some experiments on the recognition of speech, with one and with two ears. Journal of the Acoustical Society of America 26: 554–559.
Craik, K. J. W. (1943). The Nature of Explanation. Cambridge: Cambridge University Press.
Hebb, D. O. (1949). The Organization of Behavior: A Neuropsychological Theory. New York: Wiley.
Marr, D. (1982). Vision. San Francisco: Freeman.
Moray, N. (1993). Designing for attention. In A. Baddeley and L. Weiskrantz, Eds., Attention: Selection, Awareness and Control: A Tribute to Donald Broadbent. New York: Oxford University Press.
Shannon, C. E. (1948). A mathematical theory of communication. Bell System Technical Journal 27: 379–423, 623–656.

Further Readings

Baddeley, A., and L. Weiskrantz, Eds. (1993). Attention: Selection, Awareness and Control—A Tribute to Donald Broadbent. New York: Oxford University Press.
Broadbent, D. E. (1961). Behavior. London: Eyre and Spottiswoode.
Broadbent, D. E. (1977). Levels, hierarchies and the locus of control. Quarterly Journal of Experimental Psychology 32: 109–118.
Duncan, J. (1996). Information and uncertainty in a cumulative science of behavior: 25 years after Broadbent’s Decision and Stress. American Journal of Psychology 109: 617–625.
Klein, R. M. (1996). Attention: yesterday, today and tomorrow. Review of Attention: Selection, Awareness and Control—A Tribute to Donald Broadbent. American Journal of Psychology 109: 139–150.
Posner, M. I. (1972). After the revolution . . . What? Review of Decision and Stress. Contemporary Psychology 17: 185–187.
Posner, M. I. (1987). Foreword. In D. E. Broadbent, Perception and Communication. Oxford University Press, pp. v–xi.

Broca, Paul

During the 1850s Paul Broca (1824–1880) became an important and respected member of the French scientific establishment, sufficient to overcome noteworthy political obstacles and to found the Société d’Anthropologie in 1859 and remain its secretary until his death (Schiller 1979). In a series of papers published between 1861 and 1866 employing the clinico-pathological correlation technique to analyze a loss of speech (aphemia), Broca persuaded a majority of his colleagues that there was a relatively circumscribed center, located in the posterior and inferior convolutions of the left frontal lobe, that was responsible for speech (langage articulé). His conclusions have been enshrined in the eponyms Broca’s Area and Broca’s APHASIA. Whether or not Broca’s conclusions constituted a scientific discovery, and whether or not he merits priority in this matter, has been debated ever since (Moutier 1908; Souques 1928; Schiller 1979; Joynt and Benton 1964; Young 1970; Whitaker and Selnes 1975; Henderson 1986; Cubelli and Montagna 1994; Eling 1994; Whitaker 1996). What is not in doubt is that cognitive neuroscience irrevocably changed after the publication of Broca’s papers; the cortical localization of language, and by implication other cognitive functions, was now a serious, testable scientific hypothesis.

Broca’s sources of knowledge about brain, intelligence, and language functions included François Leuret and Louis P. Gratiolet (1839–1857) and Gratiolet (1854), in which the history of cerebral localization was well described. Gratiolet, a member of the anthropology society, argued from comparative anatomy the importance of the frontal lobes. Bouillaud, who had been influenced by Gall, argued for language localization in the frontal lobe on clinical evidence. Broca knew Bouillaud personally, had been to his house, and even had considered studying internal medicine with him (Schiller 1979: 172). In early 1861 meetings of the anthropology society, Auburtin led a discussion on the ques-

Broca participated in that debate (1861a), and his April 1861 report noted the relevance of case Leborgne, alias “tan” (1861b). He chose the older, more prestigious anatomical society as the venue for publishing this case (1861c), stating his belief in cerebral localization in the convolutions, outlining his views regarding regional structural differences of the convolutions (prefiguring cytoarchitectonics) and suggesting that Leborgne’s left frontal lesion and loss of speech furnished evidence in support of these views.

Broca’s second case of loss of speech, patient Lelong, was also published in the same bulletin (1861d); he expressed surprise that the lesion was in the same place as in the previous case—left posterior frontal lobe—and again noted the compatibility with the theory of localization.

The role of the left versus the right hemisphere in language officially arose in 1863 when Gustave Dax deposited for review a report that his father, Marc Dax, had presented to the Montpellier medical society in 1836. In this report the clinico-pathological correlations of forty cases of aphasia suggested that language function resided in the left hemisphere. While the existence of the 1836 Marc Dax mémoire is not absolutely proven, the version written by his son Gustave Dax existed on 24 March 1863, when it was deposited for review (it was published in 1865). The record also shows that Broca’s own publication suggesting the special role of the left hemisphere did not appear until 2 April 1863. The priority issue, Dax or Broca, is ably discussed by Schiller (1979), Joynt and Benton (1964), and Cubelli and Montagna (1994); what is not in dispute is that by 1865 the lateralization of language had become an empirical question.

Broca’s work was one more part of the ongoing debate on cerebral localization initiated by Gall and Bouillaud in the early nineteenth century; “What Broca seems to have contributed was a demonstration of this localization at a time when the scientific community was prepared to take the issue seriously” (Young 1970: 134–135). Less often appreciated is the fact that every component of Broca’s analysis had been published before, between 1824 and 1849. In 1824, Alexander Hood, an obscure Scottish general practitioner who believed in phrenological doctrine, published a case of what would later be called Broca’s Aphasia. Hood distinguished between the motor control of the vocal tract musculature, speech output control, and lexical-semantic representation, albeit not in those terms, and assigned each a different left frontal lobe locus. Bouillaud (1825), discussing cases presented earlier by François Lallemand, clearly presented classic clinico-pathological correlation techniques as applied to expressive language. Marc Dax observed (1836/1863) that Lallemand’s case histories documented that aphasia-producing lesions were in the left hemisphere. The historical question is to explain why Paul Broca in the 1860s was suddenly able to focus neuroscience on brain localization and lateralization of language.

One must acknowledge that Gall’s craniology (Spurzheim’s phrenology) had stigmatized research on cerebral localization. The doctrine of (brain) symmetry, persuasively articulated by Xavier Bichat at the beginning of the century, posed a major theoretical hurdle.
Jean-Pierre Flou- tion of localizing mental functions to distinct parts of the rens (1794–1867), an influential member of France’s scien- brain, specifically on localizing speech to the frontal lobe. tific establishment, opposed the cortical localization of 98 Cajal, Santiago Ramón y cognitive functions. Finally, language was considered as Eling, P. (1994). Paul Broca (1824–1880). In P. Eling, Ed., Reader in the History of Aphasia. Amsterdam: John Benjamins, pp. verbal expression, as speech or as an output function prima- 29–58. rily motoric in nature. What we today recognize as language Gratiolet, P. (1854). Mémoire sur les Plis Cérébraux de l'Homme et comprehension fell under the rubric of intelligence or gen- des Primates. Paris: Bertrand. eral intellectual functions. Much of the clinical evidence Henderson, V. W. (1986). Paul Broca’s less heralded contributions that had been marshaled against Bouillaud came from to aphasia research. Archives of Neurology 43 (6): 609–612. patients with posterior left hemisphere lesions who mani- Hood, A. (1824). Case 4th—28 July 1824 (Mr. Hood’s cases of fested aphasia; with no theoretical construct of language injuries of the brain). The Phrenological Journal and Miscel- comprehension such data could only be interpreted as lany 2: 82–94. counter-evidence. The same arguments were offered against Joynt, R. J., and A. L. Benton. (1964). The memoir of Marc Dax Broca, of course. It was the work of Theodor Meynert on aphasia. Neurology 14: 851–854. (1867), Henry Charlton Bastian (1869), and finally Carl Leuret, F., and P. Gratiolet. (1839–1857). Anatomie Comparée du Wernicke (1874) on disorders of comprehension that com- Système Nerveux Considéré dans Ses Rapports avec l'Intelli- gence. 2 tomes. Paris: Didot. pleted the model of language localization, thus setting the Meynert, T. v. (1886). Ein Fall von Sprachstorung, anatomisch stage for the development of modern neurolinguistics and begründet. Medizinische Jahrbücher. 
Redigiert von C. Braun, creating a historical niche for Paul Broca. A. Duchek, L. Schlager. XII. Band der Zeitschrift der K. K. See also CORTICAL LOCALIZATION, HISTORY OF; HEMI- Gesellchaft der Arzte in Wien. 22. Jahr: 152–189. SPHERIC SPECIALIZATION; LANGUAGE, NEURAL BASIS OF Moutier, F. (1908). L'Aphasie de Broca. Paris: Steinheil. Schiller, Fr. (1979). Paul Broca. Founder of French Anthropology, —Harry A. Whitaker Explorer of the Brain. Berkeley: University of California Press. Souques, A. (1928). Quelques Cas d’Anarthrie de Pierre Marie. References Aperçu historique sur la localisation du langage. Revue Neu- rologique 2: 319–368. Bastian, C. (1869). On the various forms of loss of speech in cere- Wernicke, C. (1874). Der aphasische Symptomenkomplex: Eine bral disease. British and Foreign Medical and Surgical Review, psychologische Studie auf anatomischer Basis. Breslau: Max January, p. 209, April, p. 470. Cohn und Weigert. Bouillaud, J. B. (1825). Recherches cliniques propres à démontrer Whitaker, H. A., and O. A. Selnes. (1975). Broca’s area: a problem que la perte de la parole correspond à la lésion des lobules in brain-language relationships. Linguistics 154/155: 91–103. antérieurs du cerveau, et à confirmer l’opinion de M. Gall, sur Whitaker, H. A. (1996). Historical antecedents to Geschwind. In le siège de l’organe du langage articulé. Archives Générales de S.C. Schachter and O. Devinsky, Eds., Behavioral Neurology Médecine tome VIII: 25–45. and the Legacy of Norman Geschwind. New York: Lippincott- Broca, P. (1861a). Sur le principe des localisations cérébrales. Bul- letin de la Société d'Anthropologie tome II: 190–204. Raven, pp. 63–69. Broca, P. (1861b). Perte de la parole, ramollissement chronique et Young, R. M. (1970). Mind, Brain and Adaptation in the Nine- destruction partielle du lobe antérieur gauche. [Sur le siège de teenth Century. Cerebral Localization and its Biological Con- text from Gall to Ferrier. Oxford: Clarendon Press. 
la faculté du langage.] Bulletin de la Société d'Anthropologie tome II: 235–238. Broca, P. (1861c). Remarques sur le siège de la faculté du langage Cajal, Santiago Ramón y articulé, suivies d’une observation d’aphémie. Bulletin de la Société Anatomique tome XXXVI: 330–357. Broca, P. (1861d). Nouvelle observation d’aphémie produite par Santiago Ramón y Cajal (1852–1934) was one of the most une lésion de la moitié postérieure des deuxième et troisième outstanding neuroscientists of all time. He was born in circonvolution frontales gauches. Bulletin de la Société Petil-la de Aragón, a small village in the north of Spain. He Anatomique tome XXXVI: 398–407. studied medicine in the Faculty of Medicine in Zaragoza. Broca, P. (1863). Localisations des fonctions cérébrales. Siège de In 1883, Cajal was appointed in 1892 as chair of Descrip- la faculté du langage articulé. Bulletin de la Société d'Anthro- tive and General Anatomy at the University of Valencia. In pologie tome IV: 200–208. 1887, he moved to the University of Barcelona, where he Broca, P. (1865). Du siège de la faculté du langage articulé dans was appointed to the chair of Histology and Pathological l’hémisphère gauche du cerveau. Bulletin de la Société Anatomy. At the University of Madrid, where he remained d'Anthropologie tome VI: 377–393. until retirement, he was appointed to the chair of Histology Broca, P. (1866). Sur la faculté générale du langage, dans ses rap- ports avec la faculté du langage articulé. Bulletin de la Société and Pathological Anatomy. Dr. Cajal received numerous d'Anthropologie deuxième série, tome I: 377–382. prizes, honorary degrees, and distinctions, among the most Cubelli, R., and C. G. Montagna. (1994). A reappraisal of the con- important being the 1906 Nobel Prize for physiology or troversy of Dax and Broca. Journal of the History of the Neuro- medicine. To describe the work of Cajal is rather a difficult sciences 3: 1–12. 
task, because, unlike other great scientists, he is not known Dax, G. (1836/1863). Observations tendant à prouver la coinci- for one discovery only, but for his many and important con- dence constante des dérangements de la parole avec une tributions to our knowledge of the organization of the ner- lésion de I’hémisphère gauche du cerveau. C. R. hebdoma- vous system. Those readers interested in his life should daire des séances Académie Science tome LXI (23 mars): 534. consult his autobiography (Cajal 1917), where there is also Dax, M. (1865). Lésions de la moitié gauche de I’encéphale coin- a brief description of his main discoveries and theoretical cidant avec l’oubli des signes de la pensée. Gazette hebdoma- ideas. daire médicale deuxième série, tome II: 259–262. Case-Based Reasoning and Analogy 99 phase, Cajal also published some important papers on the The detailed study of the nervous system began in the structure of the RETINA and optic centers of invertebrates. middle of the last century. Before Cajal’s discoveries, very little was known about the neuronal elements of the nervous Interestingly, Golgi, as well as most neurologists, neu- system, and knowledge about the connections between its roanatomists, and neurohistologists of his time, was a fer- different parts was purely speculative. The origin of nerve vent believer in the reticular theory of nerve continuity. fibers was a mystery, and it was speculated that they arose However, for Cajal the neuron doctrine was crystal clear. from the gray matter independently of the nerve cells (neu- Microphotography was not well developed at that time, and rons). This lack of knowledge was due mainly to the fact virtually the only way to illustrate observations was by that appropriate methods for visualizing neurons were not means of drawings, which were open to skepticism (DeFe- available; the early methods of staining only permitted the lipe and Jones 1992). 
Some of Cajal’s drawings were con- visualization of neuronal cell bodies, a small portion of their sidered artistic interpretations rather than accurate copies of proximal processes, and some isolated and rather poorly his preparations. Nevertheless, examination of Cajal’s prep- stained fibers. It was in 1873 that the method of Camillo arations, housed in the Cajal Museum at the Cajal Institute, GOLGI (1843–1926) appeared; for the first time, neurons proves the exactness of his drawings (DeFelipe and Jones were readily observed in histological preparations with all 1988, 1992). Although Cajal had the same microscopes and their parts: soma, dendrites, and axon. Furthermore, Golgi- produced similar histological preparations with comparable stained cells displayed the finest morphological details with quality of staining as the majority of the neurohistologists of an extraordinary elegance, which led to the characterization his time, he saw differently than they did. This was the and classification of neurons, as well as to the study of their genius of Cajal. possible connections. In 1906 Golgi was awarded the Nobel See also CORTICAL LOCALIZATION, HISTORY OF; NEURON Prize for physiology or medicine for discovering this tech- —Javier DeFelipe nique. Cajal shared the Nobel Prize with Golgi in the same year, for his masterful interpretations of his preparations in which he applied the method of Golgi. References Cajal was not introduced to a scientific career under the Cajal, S. R. (1909, 1911). Histologie du Système Nerveux de direction of any scientist, as then usually occurred with l’Homme et des Vertébrés, L. Azoulay, trans. Paris: Maloine. most scientists, but rather he became a prominent neurohis- Translated into English as Histology of the Nervous System of tologist on his own. The career of Cajal can be divided into Man and Vertebrates (N. Swanson and L.W. Swanson, trans.). three major phases (DeFelipe and Jones 1991). New York: Oxford University Press, 1995. 
The first phase extended from the beginning in 1877 until Cajal, S. R. (1913–1914). Estudios sobre la Degeneración y 1887, when he was introduced to Golgi’s method. During Regeneración del Sistema Nervioso. Madrid: Moya. Translated this period he published a variety of histological and micro- into English as Degeneration and Regeneration of the Nervous biological studies, but they were of little significance. System (R. M. May, tran. and Ed.). London: Oxford University Press, 1928. Reprinted and edited with additional translations The second phase (1887–1903) was characterized by by J. DeFelipe and E. G. Jones (1991), Cajal’s Degeneration very productive research, in which he exploited the Golgi and Regeneration of the Nervous System. New York: Oxford method in order to describe in detail almost every part of the University Press. central nervous system. These descriptions were so accurate Cajal, S. R. (1917). Recuerdos de mi Vida, Vol. 2: Historia de mi that his classic book Histologie (Cajal 1909, 1911), in Labor Científica. Madrid: Moya. Translated into English as which these studies are summarized, is still a reference book Recollections of My Life (E. H. Craigie and J. Cano, trans.). in all neuroscience laboratories. Also, during the first few Philadelphia: American Philosophical Society, 1937. Reprinted, years of this second phase, Cajal found much evidence in Cambridge, MA: MIT Press, 1989. favor of the neuron doctrine, which contrasted with the DeFelipe, J., and E. G. Jones. (1988) Cajal on the Cerebral Cortex. other more commonly accepted reticular theory. The neuron New York: Oxford University Press. DeFelipe, J., and E. G. Jones. (1991). Cajal’s Degeneration and doctrine, the fundamental organizational and functional Regeneration of the Nervous System. New York: Oxford Uni- principle of the nervous system, states that the neuron is the versity Press. anatomical, physiological, genetic, and metabolic unit of the DeFelipe, J., and E. G. Jones. (1992). 
Santiago Ramón y Cajal and nervous system, whereas for the reticular theory the nervous methods in neurohistology. Trends in Neuroscience 15: 237– system consists of a diffuse nerve network formed by the 246. anastomosing branches of nerve cell processes (either both Jones, E. G. (1994). The neuron doctrine. Journal of History of dendritic and axonal, or only axonal), with the cell somata Neuroscience 3: 3–20. having mostly a nourishing role (for review, see Shepherd Shepherd, G. M. (1991). Foundations of the Neuron Doctrine. New 1991; Jones 1994). York: Oxford University Press The third phase of Cajal’s career began in 1903, with his discovery of the reduced silver nitrate method, and ended Case-Based Reasoning and Analogy with his death in 1934; this period was devoted mainly to the investigation of traumatic degeneration and regeneration of the nervous system. He published numerous scientific Case-based reasoning (CBR) refers to a style of designing a papers about this subject that were of great relevance, and system so that thought and action in a given situation are which were summarized in another classic book, Degenera- guided by a single distinctive prior case (precedent, proto- tion and Regeneration (Cajal 1913–1914). During this type, exemplar, or episode). Historically and philosophically, 100 Case-Based Reasoning and Analogy CBR exists as a reaction to rule-based reasoning: in CBR, of software systems and that takes its main themes and the emphasis is on the case, not the rule. inspiration from psychology (e.g., Rosch and Lloyd 1978). CBR works with a set of past cases, a case base. CBR Roger Schank (1982) imposes the view that case-based rea- seeks to determine a “source case” relevant to a given “target soning mimics human MEMORY. He refers to cases as case”. 
All CBR systems separate their reasoning into two “memories,” retrieval of cases as “remindings,” and repre- stages: (1) finding the appropriate source case (retrieving); sentation of cases as “memory organization.” Systems that and (2) determining the appropriate conclusions in the target owe their origin to this school of thought are considerable in case (revising/reusing). All CBR systems must have some scope and ability. There are case-based planners, case-based way of augmenting their case base or learning new cases, problem-solvers, case-based diagnosticians, case-based even if this simply involves appending to a list of stored financial consultants, case-based approaches to learning cases. Retrieval is described as finding the “most similar” (including EXPLANATION-BASED LEARNING), and case- past case or the “nearest neighbor”; this just begs the ques- based illuminations of search. tion of what is the appropriate similarity or distance metric. Research in this style tends to taxonomize issues and To reason about an Italian automobile, consider past approaches. There are qualitative and quantitative metrics of examples of automobiles. If the most similar retrieved case similarity; there are approaches that seek to understand the is a European automobile, this is a better source of informa- causality underlying a case, and approaches that do not. The tion than a past example of an American automobile, all literature contains both a rich conceptual cartography and things being equal. some of the most accessible polemics on the importance of A set of cases might be viewed as a corpus from which a nonstatistical approach to the logging of past cases. rules could potentially be gleaned, but the appropriate gen- A good example of a case-based system is Katia Sycara’s eralizations of which have not yet been performed. In this PERSUADER, which reasons about labor-management negoti- view, CBR is a postponement of INDUCTION. The advantage ations. 
It uses past agreements between similarly situated of the raw cases, in this view, is that revision of the rule base parties to suggest proposals that might succeed in the cur- can better be performed because the original cases remain rent negotiation. Past and present situations are compared available; they have not been discarded in favor of the rules on features such as wage rates and market competitiveness, that summarize them. and include structural models of how changing one feature To guide deliberation in a situation, a case-based rea- affects another. Past agreements can be transformed accord- soner represents and transforms the rationale of a precedent ing to the differences between past and present, possibly or the etiology of a prior case. By hypothesis, a single case including numerically scaling the size of settlements. suffices for guidance if it is the appropriate case and it is Case-based reasoning moved to the center of AI when transformed properly. In contrast, rule-based reasoners (e.g., the logical issues of postponing rule formation were sepa- EXPERT SYSTEMS and DEDUCTIVE REASONING) apply a rule rated from the psychological issues of stuctural ANALOGY. to a situation with no transformation. Stuart Russell (1989) defined precisely the “logical problem In both rule-based and case-based reasoning, managing of analogy” so that it could be studied with precision in the the interaction of multiple sources of guidance is crucial. In philosophical tradition. Russell proposed that there were CBR, different cases can suggest conflicting conclusions; in relations between predicates, called “determinations,” that rule-based reasoning, several rules might conflict. In one, would permit a single co-occurrence of P and Q to lead to choose a case; in the other, choose a rule. Nonmonotonic rea- the rule “if P(x) then Q(x).” Thus, a person’s nationality soning is fundamentally concerned with both kinds of choice. 
determines that person’s language, but does not determine In practice, the separation of CBR from other forms of marital status. The logical formulation showed clearly what reasoning is imperfect. An interplay of rules and cases is would be needed to formally justify analogy. Analogy is unavoidable. A case can almost always be viewed as a com- either presumptuous (thus, fallible, defeasible, or otherwise pact representation of a set of rules. CBR is just one form of susceptible to discount and revision), or else it brings extensional programming (other examples are PATTERN knowledge to bear that permits a single case to skew an entire (statistical) reference class. Like Nelson Goodman’s RECOGNITION AND FEEDFORWARD NETWORKS, MACHINE LEARNING, and statistical learning) though CBR performs (1972) paradox of “grue,” which raises the question of justi- its generalizations on-line, while others preprocess their fied projection with many cases, “determinations” raise the generalizations. question of justified projection from a single case. Nevertheless, CBR is a distinctly different paradigm. The Kevin Ashley (1990) showed how cases are used in legal emphasis is on the unique properties of each case, not the reasoning. Cases and precedents are fundamental to philos- statistical properties of numerous cases. CBR differs from ophy of law; AI and law have been equally concerned with induction because induction derives its power from the the proper modeling of cases. Ashley noted that some fea- aggregation of cases, from the attempt to represent what tures describing cases are inherently proplaintiff or pro– tends to make one case like or unlike another. CBR derives defendant. Understanding this distinction permits deeper its power from the attempt to represent what suffices to comparisons of similarity. CBR appears in moral and legal make one case like or unlike another. 
CBR emphasizes the philosophy under the name “casuistry.” structural aspects of theory formation, not the statistical Earlier researchers defended CBR by citing contempo- aspects of data. rary psychology. Ashley and Russell connected CBR to Case-based reasoning is usually associated with work immense literatures that were historically concerned with that has been called “scruffy”: work that aims at the design the significance of the single case. Current work on CBR Categorial Grammar 101 continues to revolve around these two foci: psychology- Riesbeck, C., and R. Schank. (1989). Inside Case-Based Reason- ing. Erlbaum. inspired themes for systems design, and the precise under- Skalak, D., and E. Rissland. (1992). Arguments and cases: an inev- standing of reasoning with the single case. itable intertwining. Artificial Intelligence and Law 1. See also PROBABILITY, FOUNDATIONS OF; SIMILARITY; Sunstein, C. (1996). Legal Reasoning and Political Conflict. STATISTICAL LEARNING THEORY Oxford. Winston, P. (1980). Learning and reasoning by analogy. Comm. —Ronald Loui ACM 23. References Categorial Grammar Ashley, K. (1990). Modeling Legal Arguments: Reasoning with Cases and Hypotheticals. Cambridge, MA: MIT Press. The term Categorial Grammar (CG) refers to a group of the- Goodman, N. (1972). Problems and Projects. Indianapolis: Bobbs- ories of natural language syntax and semantics in which the Merrill. main responsibility for defining syntactic form is borne by Rosch, E., and B. Lloyd, Eds. (1978). Cognition and Categoriza- the LEXICON. CG is therefore one of the oldest and purest tion. Erlbaum. examples of a class of lexicalized theories of grammar that Russell, S. (1989). The Use of Knowledge in Analogy and Induc- also includes HEAD-DRIVEN PHRASE STRUCTURE GRAMMAR, tion. London: Pitman. Schank, R. (1982). Dynamic Memory: A Theory of Reminding and LEXICAL FUNCTIONAL GRAMMAR, Tree-adjoining grammar, Learning in Computers and People. 
Cambridge: Cambridge Montague grammar, RELATIONAL GRAMMAR, and certain University Press. recent versions of the Chomskean theory. The various modern versions of CG are characterized by a Further Readings much freer notion of derivational syntactic structure than is assumed under most other formal or generative theories of Berman, D., and C. Hafner. (1991). Incorporating procedural con- grammar. All forms of CG also follow Montague (1974) in text into a model of case-based legal reasoning. Proc. Intl. sharing a strong commitment to the Principle of COMPOSI- Conf. on AI and Law Oxford. TIONALITY—that is, to the assumption that syntax and inter- Berman, D., and C. Hafner. (1993). Representing teleological con- pretation are homomorphically related and may be derived in cepts in case-based legal reasoning: the missing link. Proc. Intl. tandem. Significant contributions have been made by Cate- Conf. on AI and Law Oxford. gorial Grammarians to the study of SEMANTICS, SYNTAX, Branting, K. (1991). Explanations with rules and structured cases. MORPHOLOGY, intonational phonology, COMPUTATIONAL International Journal of Man-Machine Studies 34. Burstein, M. (1985). Learning by Reasoning from Multiple Analo- LINGUISTICS, and human SENTENCE PROCESSING. gies. Ph.D. diss., Yale University. There have been two styles of formalizing the grammar of Carbonell, J. (1981). A computational model of analogical prob- natural languages since the problem was first articulated in lem-solving. Proc. IJCAI Vancouver. the 1950s. Chomsky (1957) and much subsequent work in Cohen, M., and E. Nagel. (1934). An Introduction to Logic and GENERATIVE GRAMMAR begins by capturing the basic facts Scientific Method. Harcourt Brace. of English constituent order exemplified in (1) in a Context- Davies, T., and S. Russell. (1987). A logical approach to reasoning free Phrase Structure grammar (CFPSG) or system of rewrite by analogy. Proc. IJCAI Milan. 
rules or “productions” like (2), which have their origin in DeJong, G. (1981). Generalizations based on explanations. Proc. early work in recursion theory by Post, among others. IJCAI Vancouver. Gardner, A. (1987). An AI Approach to Legal Reasoning. Cam- (1) Dexter likes Warren. bridge, MA: MIT Press. Gentner, D. (1983). Structure mapping: a theoretical framework (2) S → NP VP for analogy. Cognitive Science 7. VP → TV NP Hammond, K. (1990). Case-based planning: a framework for plan- TV → { likes, sees, . . . } ning from experience. Cognitive Science 14. Hesse, M. (1966). Models and Analogies in Science. University of Categorial Grammar (CG), together with its close cousin Notre Dame Press. Dependency Grammar (which also originated in the 1950s, Keynes, J. (1957 [1908]). A Treatise on Probability. MacMillan. in work by Tesnière), stems from an alternative approach to Kolodner, J. (1993). Case-Based Reasoning. Morgan Kaufman. the context-free grammar pioneered by Bar-Hillel (1953) Koton, P. (1988). Using Experience in Learning and Problem Solv- and Lambeck (1958), with earlier antecedents in Ajduk- ing. Ph.D. diss., MIT. iewicz (1935) and still earlier work by Husserl and Russell Leake, D., Ed. (1996). Case-Based Reasoning: Experiences, Les- sons, and Future Directions. Cambridge, MA: MIT Press. in category theory and the theory of types. Categorial Gram- Loui, R. (1989). Analogical reasoning, defeasible reasoning, and mars capture the same information by associating a func- the reference class. Proc. Knowledge Representation and Rea- tional type or category with all grammatical entities. For soning Toronto. example, all transitive verbs are associated via the lexicon Loui, R., and J. Norman. (1995). Rationales and argument moves. with a category that can be written as follows: Artificial Intelligence and Law 3. Mitchell, T., R. Keller, and S. Kedar-Cabelli. (1986). Explanation- (3) likes := (S\NP)/NP based generalization: a unifying view. 
The notation here is the “result leftmost” notation according to which α/β and α\β represent functions from β into α, where the slash determines that the argument β is respectively to the right (/) or to the left (\) of the functor. Thus the transitive verb (3) is a functor over NPs to its right, yielding predicates, or functors over NPs to the left, which in turn yield S. (There are several other notations for categorial grammars, including the widely used “result on top” notation of Lambek 1958 and much subsequent work, according to which the above category is written (NP\S)/NP. The advantage of the present notation for cognitive scientists is that semantic type can be read in a consistent left-right order, regardless of directionality.)

In “pure” context-free CG, categories can combine via two general function application rules, which in the present notation are written as in (4), to yield derivations, written as in (5a), in which underlines indexed with right and left arrows indicate the application of the two rules.

(4) Functional application
    a. X/Y  Y  ⇒  X
    b. Y  X\Y  ⇒  X

(5) a. Dexter     likes      Warren
         NP     (S\NP)/NP      NP
                ------------------>
                       S\NP
        --------------------------<
                     S

    b. Dexter     likes      Warren
         NP         V          NP
                ------------------
                        VP
        --------------------------
                     S

Such derivations are equivalent to traditional trees like (5b) in CFPSG. However, diagrams like (5a) should be thought of as derivations, delivering a compositional interpretation directly, rather than a purely syntactic structure. The identification of derivation with interpretation becomes important when we consider the extensions of CG that take it beyond weak equivalence with CFPSG.

A central problem for any theory of grammar is to capture the fact that elements of sentences that belong together at the level of semantics or interpretation may be separated by much intervening material in sentences, the most obvious example in English arising from the relative clause construction. All theories of grammar respond to this problem by adding something such as the transformationalists’ WH-MOVEMENT, GPSG feature-passing, ATN HOLD registers, or whatever to a context-free core. Usually, such additions increase automata-theoretic power. To the extent that the constructions involved seem to be quite severely constrained, and that certain kinds of long-range dependencies seem to be universally prohibited, there is clearly some explanatory value in keeping such power to a minimum.

All of the generalizations of categorial grammar respond to this problem by adding various type-driven combinatory operators to pure CG. The many different proposals for how to do this fall under two quite distinct approaches. The first, rule-based, approach, pioneered by Lyons (1968), Bach (1976), and Dowty (1979), among other linguists, and by Lewis (1970) and Geach (1972), among philosophical logicians, starts from the pure CG of Bar-Hillel, and adds rules corresponding to simple operations over categories, such as “wrap” (or commutation of arguments), “type-raising” (which resembles the application of traditional nominative, accusative, etc. case to NPs, etc.), and functional composition. One possible derivation of a complex relative clause comes out as follows in one fairly typical version, “Combinatory” Categorial Grammar (CCG), discussed at length by the present author (see “Further Reading”), in which type-raising and composition are for historical reasons indicated by T and B, respectively.

(6) a woman whom Dexter thinks that Warren likes

    whom    := (N\N)/(S/NP)
    Dexter  := S/(S\NP)        (T)
    thinks  := (S\NP)/S'
    that    := S'/S
    Warren  := S/(S\NP)        (T)
    likes   := (S\NP)/NP

    Dexter thinks                          ⇒  S/S'        (B)
    Dexter thinks that                     ⇒  S/S         (B)
    Dexter thinks that Warren              ⇒  S/(S\NP)    (B)
    Dexter thinks that Warren likes        ⇒  S/NP        (B)
    whom Dexter thinks that Warren likes   ⇒  N\N         (>)

Notice that this analysis bears no resemblance to a traditional right-branching clause structure modified by structure-preserving movement transformations.

The alternative, deductive, style of Categorial Grammar, pioneered by van Benthem (1986) and Moortgat (1988), takes as its starting point Lambek’s syntactic calculus. The Lambek system embodies a view of the categorial slash as a form of logical implication for which a number of axioms or inference rules define a proof theory. (For example, functional application corresponds to the familiar classical rule of modus ponens under this view.) A number of further axioms give rise to a deductive calculus in which many but not all of the rules deployed by the alternative rule-based generalizations of CG are theorems. For example, the derivation (6) corresponds to a proof in the Lambek calculus using type-raising and composition as lemmas.

The differences between these approaches make themselves felt when the grammars in question are extended beyond the weak context-free power of the Lambek calculus and the combinatory rules that are theorems thereof, as they must be to capture natural language in an explanatory fashion. The problem is that almost any addition of axioms corresponding to the non-Lambek combinatory rules that have been proposed in the rule-based framework causes a collapse of the calculus into “permutation completeness”, that is, into a grammar that accepts all permutations of the words of any sentence it accepts. This forces the advocates of the Lambek calculus into the “multimodal” systems involving many distinct slashes encoding multiple notions of implication (Morrill 1994), and forces the advocates of rule-based systems to impose type restrictions on their rules. (Nevertheless, Joshi, Vijay-Shanker, and Weir 1991 show that certain rule-based CGs remain of low automata-theoretic power.)

These two styles of CG are reviewed and compared at length by Moortgat (1988) (with a deductive bias), and Wood (1993) (with a rule-based bias) (see Further Readings). To some extent the same biases are respectively exhibited in the selection made in two important collections of papers edited by Buszkowski, Marciszewski, and van Benthem (1988) and Oehrle, Bach, and Wheeler (1988) (see Further Readings), which include several of the papers cited here.

The differences are less important for the present purpose than the fact that all of these theories have the effect of engendering derivational structures that are much freer than traditional surface structures, while nonetheless guaranteeing that the nonstandard derivations deliver the same semantic interpretation as the standard ones. For example, because all of these theories allow the residue of relativization Dexter thinks that Warren likes in example (6) to be a derivational constituent of type S/NP, they also all allow a nonstandard analysis of the canonical sentence Dexter thinks that Warren likes these flowers in terms of an identi-

mar to the processor to simplify the problem of explaining the availability to human sentence processors of semantic interpretations for fragments like the flowers sent for, as evidenced by the effect of this content in (b) below in eliminating the “garden-path” effect of the ambiguity in (a), discussed by Crain and Steedman (1985) and Altman (1988).

(10) a. The doctor sent for the patient died.
     b. The flowers sent for the patient died.
cally derived constituent followed by an object NP: All of these phenomena imply that the extra structural (7) [[Dexter thinks that Warren likes]S/NP [these flow- ambiguity engendered by generalized categorial grammars ers]NP]S is not “spurious,” but a property of competence grammar itself. This is a surprising property, because it seems to flout all See also MINIMALISM; PROSODY AND INTONATION; SYN- received opinion concerning the surface constituency of TAX-SEMANTICS INTERFACE English sentences, suggesting that a structure in which objects—even embedded ones—dominate subjects is as —Mark Steedman valid as the standard one in which subjects dominate objects. The implication is that the BINDING THEORY (which References must explain such facts as that in every language in the world you can say the equivalent of Warren and Dexter Ajdukiewicz, K. (1935). Die syntaktische Konnexitat. In S. shave each other but not Each other shave Dexter and War- McCall, Ed., Polish Logic 1920–1939. Oxford University Press, ren) must be regarded as a property of semantic interpreta- pp. 207–231. Translated from Studia Philosophica 1: 1–27. tion or LOGICAL FORM rather than of surface structure as Altmann, G. (1988). Ambiguity, parsing strategies, and computa- such (cf. Dowty 1979; Szabolcsi 1989; Chierchia 1988; tional models. Language and Cognitive Processes 3: 73–98. Hepple 1990; Jacobson 1992). Bach, E. (1976). An extension of classical transformational gram- These proposals also imply that there are many semanti- mar. In Problems in Linguistic Metatheory: Proceedings of the 1976 Conference at Michigan State University 183–224. cally equivalent surface derivations for every traditional Bar-Hillel, Y. (1953). A quasi-arithmetical notation for syntactic one, a problem that is sometimes misleadingly referred to as description. Language 29: 47–58. “spurious ambiguity,” and which appears to make parsing Carpenter, R. (1997). Type-Logical Semantics. Cambridge, MA: more laborious. 
However, this problem can be eliminated MIT Press. using standard chart-parsing techniques with an equivalence Chierchia, G. (1988). Aspects of a categorial theory of binding. In check on Logical Forms associated with constituents, as R. T. Oehrle, E. Bach, and D. Wheeler, Eds., Categorial Gram- proposed by Karttunen (1989) and other advocates of unifi- mars and Natural Language Structures. (Proceedings of the cation-based computational realizations of CG—see Car- Conference on Categorial Grammar, Tucson, AZ, June 1985.) penter 1997 for a review. Dordrecht: Reidel, pp. 153–98. Flexible or combinatory Categorial Grammars of all Chomsky, N. (1957). Syntactic Structures. The Hague: Mouton. Crain, S., and M. Steedman. (1985). On not being led up the gar- kinds have real advantages for capturing a number of phe- den path: the use of context by the psychological parser. In L. nomena that are problematic for more traditional theories of Kartunnen, D. Dowty, and A. Zwicky, Eds., Natural Language grammar. For example, as soon as the analysis in (7) is Parsing: Psychological, Computational and Theoretical Per- admitted, we explain why similar fragments can behave like spectives. ACL Studies in Natural Language Processing. Cam- constituents for purposes of coordination: bridge: Cambridge University Press, pp. 320–358. Dowty, D. (1979). Dative movement and Thomason’s extensions (8) [[I dislike]S/NP, but [Dexter thinks that Warren likes] S/NP of Montague Grammar. In S. Davis and M. Mithun, Eds., Lin- [these flowers]NP]S guistics, Philosophy, and Montague Grammar. Austin: Univer- sity of Texas Press. (Other even more spectacular coordinating nonstandard Dowty, D. (1988). Type-raising, functional composition, and non- fragments are discussed in Dowty 1988.) constituent coordination. In R. T. Oehrle, E. Bach, and D. 
We also explain why intonation seems similarly able to Wheeler, Eds., Categorial Grammars and Natural Language treat such fragments as phrasal units in examples like the Structures. Proceedings of the Conference on Categorial Gram- following, in which % marks an intonational boundary or mar, Tucson, AZ, June 1985. Dordrecht: Reidel, pp. 153–198. break, and capitalization indicates STRESS (cf. Oehrle et al. Geach, P. (1972). A program for syntax. In D. Davidson and G. 1985; Prevost 1995): Harman, Eds., Semantics of Natural Language. Dordrecht: Reidel, pp. 483–497. (9) [Q:] I know who YOU like, but who does DEXTER Hepple, M. (1990). The Grammar and Processing of Order and like? Dependency: A Categorial Aproach. Ph.D. diss., University of [A:] [DEXTER likes]S/NP % [WARREN]NP Edinburgh. Jacobson, P. (1992). The lexical entailment theory of control and Moreover, the availability of semantic interpretations for the tough construction. In I. Sag and A. Szabolcsi, Eds., Lexical such nonstandard constituents appears under certain plausi- Matters. Chicago: CSLI/Chicago University Press, pp. 269– ble assumptions about the relation of the competence gram- 300. 104 Categorization reviews). A powerful but controversial idea is that SIMILAR- Joshi, A., K. Vijay-Shanker, and D. Weir. (1991). The convergence of mildly context-sensitive formalisms. In P. Sells, S. Shieber, ITY is an organizing principle. Within this framework, there and T. Wasow, Eds., Processing of Linguistic Structure. Cam- are important distinctions concerning just how similarity bridge MA: MIT Press, pp. 31–81. operates, but we will not be concerned with them here (see Karttunen, L. (1989). Radical lexicalism. In M. R. Baltin and A. S. Medin 1989 for a review). Simply stated, this view suggests Kroch, Eds., Alternative Conceptions of Phrase Structure. Chi- that we put things in the same categories because they are cago: University of Chicago Press. similar to each other. 
A robin and a hawk (both birds) seem Lambek, J. (1958). The mathematics of sentence structure. Ameri- obviously more similar than a robin and an elephant (not a can Mathematical Monthly 65: 154–170. bird); elephants are not birds because they are not suffi- Lewis, D. (1970). General semantics. Synthese 22: 18–67. ciently similar to them. A natural consequence of this simi- Lyons, J. (1968). Introduction to Theoretical Linguistics. Cam- larity view is that the world is organized for us and our bridge: Cambridge University Press. Montague, R. (1974). Formal philosophy: Papers of Richard Mon- categories map onto this reality (e.g., Rosch and Mervis tague, Richmond H. Thomason, Ed. New Haven: Yale Univer- 1975). sity Press. Why is this notion that categories are defined by some Moortgat, M. (1988). Categorial Investigations. Ph.D. diss., Uni- “objective” similarity controversial? The main criticism has versiteit van Amsterdam. (Published by Foris, 1989). been that the notion of similarity is too unconstrained to be Morrill, G. (1994). Type-Logical Grammar. Dordrecht: Kluwer. useful as an explanatory principle (Goodman 1972; Murphy Oehrle, R. T. (1988). Multidimensional compositional functions as and Medin 1985). Similarity is usually defined in terms of a basis for grammatical analysis. In R. T. Oehrle, E. Bach, and shared properties, but Goodman argued that any two things D. Wheeler, Eds., Categorial Grammars and Natural Language share an unlimited number of properties (e.g., robins and Structures. (Proceedings of the Conference on Categorial elephants can move, weigh more than an ounce, weigh more Grammar, Tucson, AZ, June 1985). Dordrecht: Reidel, pp. 349–390. than two ounces, take up space, can be thought about, etc.). Prevost, S. (1995). A Semantics of Contrast and Information Struc- Given this apparent flexibility, it may be that we see things ture for Specifying Intonation in Spoken Language Generation. 
as similar because they belong to the same category and not Ph.D. diss., University of Pennsylvania, Philadelphia, PA. vice versa. That is, maybe we can explain similarity in terms Szabolcsi, A. (1989). Bound variables in syntax: are there any? In of categories. R. Bartsch, J. van Benthem, and P. van Emde Boas, Eds., An alternative to the similarity view of categorization is Semantics and Contextual Expression. Dordrecht: Foris, pp. that theories provide conceptual coherence (Carey 1985; 295–318. Keil 1989; Medin 1989; Rips 1989; Hirschfeld and Gelman van Benthem, J. (1986). Essays in Logical Semantics. Dordrecht: 1994). The theory-based explanation of categorization is Reidel. consistent with the idea that CONCEPTS are comprised of Further Readings features or properties. By concept, we mean the mental rep- resentation of a category that presumably includes more Buszkowski, W., W. Marciszewski, and J. van Benthem, Eds. than procedures for identifying or classifying. These expla- (1988). Categorial Grammar. Amsterdam: John Benjamins. nations go beyond similarity models in arguing that under- Moortgat, M. (1997). Categorial type logics. In J. van Benthem lying principles (often causal) determine which features are and A. ter Meulen, Eds., Handbook of Logic and Language. relevant and how they might be interrelated (Komatsu 1992; Amsterdam: North Holland, pp. 93–177. see also Billman and Knutson 1996). Oehrle, R., E. Bach, and D. Wheeler. (1988). Categorial Gram- In current cognitive science theorizing, similarity has a mars and Natural Language Structures. Dordrecht: Reidel. role to play but a limited one that, in many respects, changes Steedman, M. (1996). Surface Structure and Interpretation. Lin- guistic Inquiry Monograph 30. Cambridge MA: MIT Press. its character. Researchers who focus on similarity (e.g., Wood, M. M. (1993). Categorial Grammar. Routledge. 
Nosofsky 1988) use models of selective feature weighting such that similarity is, in part, a byproduct of category learn- ing. Other researchers derive a role for similarity from an Categorization analysis of how categories might be used to satisfy human goals such as in drawing inferences (e.g., Anderson 1991). Categorization, the process by which distinct entities are Finally, investigators who argue that categories are orga- treated as equivalent, is one of the most fundamental and nized around knowledge structures (e.g., Wisniewski and pervasive cognitive activities. It is fundamental because cat- Medin 1994) allow theories to determine the very notion of egorization permits us to understand and make predictions what a feature is. about objects and events in our world. People (necessarily) Is there a single set of principles that applies to all catego- make use of only the tiniest fraction of the possible categori- ries? Evidence suggests that there may be important differ- zation schemes, but even a modest-sized set of entities can ences among them. First of all, a great deal of attention has be grouped in a limitless number of ways. Therefore, a fun- been directed to the hierarchical component of categories. damental question is why we have the categories we have Objects can be categorized at different levels of abstraction; and not others. Further, what do our categorization schemes for example, your pet Fido can be categorized as a living allow us to do that other schemes would not? thing, an animal, a mammal, a dog, or a poodle. Work by There has been a plethora of work on the structure of cat- Eleanor Rosch and her colleagues (Rosch et al. 
1976; see egories, mostly examining natural object categories (see Berlin, Breedlove, and Raven 1973 for related work in Smith and Medin 1981; Rips 1990; Komatsu 1992 for anthropology) has shown that one level in this hierarchy, Categorization 105 dubbed the “basic level,” seems to be psychologically privi- categorize the same way also have the same concept? These leged. In our example, dog would be a basic level term, and are but a small sample of the fascinating issues associated Rosch et al. found that a number of measures of privilege all with research on categorization. converged on this level. The basic level is the level preferred See also ANALOGY; COLOR CLASSIFICATION; CONCEP- in naming by adults, is the first learned by children, and is TUAL CHANGE; FOLK BIOLOGY; NATIVISM; NATURAL KINDS the level at which adults can categorize most rapidly. It may —Douglas L. Medin and Cynthia Aguilar be that similarity plays a bigger role in categorization at the basic level than for more superordinate levels. Currently, References investigators are actively pursuing issues such as whether the basic level might change with expertise (e.g., Tanaka and Anderson, J. R. (1991). The adaptive nature of human categoriza- tion. Psychological Review 98: 409–429. Taylor 1991) or vary across cultures (Berlin 1992; Coley, Atran, S. (1990). Cognitive Foundations of Natural History: Medin, and Atran 1997). These questions bear on the respec- Towards an Anthropology of Science. Cambridge: Cambridge tive roles of mind and world in categorization (variability University Press. with culture or expertise would tend to support the former). Barsalou, L. W. (1983). Ad hoc categories. Memory and Cognition Other researchers have attempted to extend this hierar- 11: 211–227. chical structure to social categories (e.g., race, gender, Barsalou, L. W. (1985). Ideals, central tendency, and frequency of occupation, etc.). 
There has been some work applying instantiation as determinants of graded structure of categories. Rosch’s measures of basic levels to the domains of person Journal of Experimental Psychology: Learning, Memory, and concepts and routine social events with moderate success Cognition 11: 629–654. (Cantor and Mischel 1979; Morris and Murphy 1990). Berlin, B. (1992). Ethnobiological Classification: Principles of Categorization of Plants and Animals in Traditional Societies. Note, however, that many social categories are nonhierar- Princeton, NJ: Princeton University Press. chical and categories at the same level of abstractness may Berlin, B., D. Breedlove, and P. Raven. (1973). General principles be overlapping rather than mutually exclusive. For exam- of classification and nomenclature in folk biology. American ple, a person can be categorized as a woman, an ethnic Anthropologist 74: 214–242. minority member, a millionaire, and a celebrity all at the Billman, D., and J. F. Knutson. (1996). Unsupervised concept same time. None of these categories are subordinate or learning and value systematicity: a complex whole aids learn- superordinate to any other. This raises a new set of ques- ing the parts. Journal of Experimental Psychology: Learning, tions about which categories are activated in a given situa- Memory, and Cognition 22: 539–555. tion and how the corresponding concepts are updated with Cantor, N., and W. Mischel. (1979). Prototypes in person per- experience. There is even evidence that alternative social ception. In L. Berkowitz, Ed., Advances in Experimental Social Psychology, vol. 12. New York: Academic Press, pp. categories may compete and inhibit one another (Macrae, 3–52. Bodenhausen, and Milne 1995). In short, there may be Carey, S. (1985). Conceptual Change in Childhood. Cambridge, major differences between object and social categories (see MA: Bradford Books. Wattenmaker 1995 for a further example). Coley, J. D., S. Atran, and D. L. Medin. (1997). 
Does rank have its Goal-derived categories also differ from common taxo- privilege? Inductive inferences within folk biological taxono- nomic categories. Barsalou (1983, 1985) has shown that cat- mies. Cognition 64: 73–112. egories activated in the service of goals (e.g., things to take Goodman, N. (1972). Seven structures on similarity. In N. Good- on a camping trip, things to eat when on a diet) may follow man, Ed., Problems and Projects. New York: Bobbs-Merrill. different processing principles. For instance, goodness of Hirschfeld, L. A., and S. A. Gelman, Eds. (1994). Mapping the example for many object categories seems to be based on Mind: Domain Specificity in Cognition and Culture. New York: Cambridge University Press. having typical properties (a robin is judged to be a very typ- Keil, F. C. (1989). Concepts, Kinds and Cognitive Development. ical bird because it looks and acts like many other birds; Cambridge, MA: MIT Press. ostriches are not typical for the opposite reason; see Rosch Komatsu, L. K. (1992). Recent views of conceptual structure. Psy- and Mervis 1975), but for goal-derived categories, goodness chological Bulletin 112: 500–526. of example seems to be based on ideals or extremes. More Macrae, C. N., G.V. Bodenhausen, and A. B. Milne. (1996). The specifically, the best example of a diet food is one with zero dissection of selection in person perception: inhibitory pro- calories, even though zero may not be typical. cesses in social stereotyping. Journal of Personality and Social Still other researchers have suggested that categorization Psychology. principles show DOMAIN SPECIFICITY. For example, some Medin, D. L. (1989). Concepts and conceptual structure. American have suggested that biological categories constitute a dis- Psychologist 44: 1469–1481. Morris, M., and G. L. Murphy. (1990). Converging operations on a tinct (and innate) domain and that people universally assume basic level in event taxonomies. 
Memory and Cognition 18: that biological categories have an underlying essence that 107–418. makes things the way they are (Atran 1990). Domain-speci- Murphy, G. L., and D. L. Medin. (1985). The role of theories in ficity is a topic that is currently receiving much attention conceptual coherence. Psychological Review 92: 289–316. (see chapters in Hirschfeld and Gelman 1994). Nosofsky, R. M. (1988). Similarity, frequency and category repre- Categorization touches on many important applied and sentations. Journal of Experimental Psychology: Learning, theoretical questions. How does the perception of social Memory, and Cognition 14: 54–65. groups lead to stereotypes (see STEREOTYPING) and other Rips, L. J. (1989). Similarity, typicality and categorization. In S. forms of bias? What is the role of language in categorization Vosniadou and A. Ortony, Eds., Similarity and Analogical Rea- and conceptual development? To what extent do people who soning. Cambridge: Cambridge University Press. 106 Causal Reasoning pocket causes coins to be silver, because his pocket proba- Rips, L. J. (1990). Reasoning. Annual Review in Psychology 41: 321–353. bly did not come into existence before the coins did. If his Rosch, E., and C. B. Mervis. (1975). Family resemblances: studies pocket did predate all the coins in it, however, Hume’s solu- in the internal structure of categories. Cognitive Psychology 7: tion would fail: the coins were close to his pocket and they 573–605. were silver whenever they were in his pocket, including Rosch, E., C. B. Mervis, W. Gray, D. Johnson, and P. Boyes- soon after they were in his pocket. Braem. (1976). Basic objects in natural categories. Cognitive One approach to the psychology of causal inference Psychology 8: 573–605. inherited the problem posed by Hume and extended his reg- Smith, E., and D. L. Medin. (1981). Categories and Concepts. ularity solution. A branch of this approach was adopted by Cambridge, MA: Harvard University Press. 
Kelley’s (1973) ANOVA model and subsequent variants of Tanaka, J. W., and M. Taylor. (1991). Object categories and exper- it in social psychology (e.g., Hilton and Slugoski 1986). To tise: is the basic level in the eye of the beholder? Cognitive Psy- chology 23: 457–482. illustrate contemporary statistical variants of Hume’s solu- Tversky, A. (1977). Features of similarity. Psychological Review tion, consider the contrasting examples again. Cheng and 84: 327–352. Holyoak’s (1995) model would explain that a reasoner con- Wattenmaker, W. D. (1995). Knowledge structures and linear sepa- cludes that heating causes butter to melt because heating rability: integrating information in object and social categoriza- occurs before melting and melting occurs more often when tion. Cognitive Psychology 28: 274–328. the butter is heated to 150°F than when it is not, when other Wisniewski, E. J., and D. L. Medin. (1994). On the interaction of plausible influences on melting such as the purity of the but- theory and data in concept learning. Cognitive Science 18: 221– ter are controlled. In contrast, a reasoner does not conclude 281. that Goodman’s pocket causes coins to be silver because one knows of alternative causes of coins being silver that Causal Reasoning might be uncontrolled. For example, Goodman might have selectively kept only silver coins in his pocket, whereas Knowing that all pieces of butter have always melted when there is no such selection for coins outside his pocket. heated to 150°F, one would probably be willing to conclude As these examples illustrate, the regularity approach that if the next solid piece of butter is heated to 150°F, it will requires specific knowledge about alternative causes. But melt. In contrast, knowing that all coins in Nelson Goodman’s how does such knowledge come about in the first place? 
pockets, up to this point, were silver, one would be reluctant Unless one first knows all alternative causes, normative to conclude that if a copper coin were put in his pocket, it inference regarding a candidate cause seems impossible. Yet would become silver (examples adapted from Goodman such inferences occur everyday in science and daily life. For 1954/1983). Why is it that one is willing to believe that heat- example, without assuming that one knows all the alterna- ing causes butter to melt, but unwilling to believe that Good- tive causes of butter melting, it is nonetheless possible to man’s pocket causes coins to be silver? These contrasting feel convinced, after observing the heating and subsequent examples point to the kinds of questions that psychologists melting of butter, that heating causes butter to melt. who study causal reasoning have asked and to the approaches An independent branch of the regularity approach begun taken in their answers. The central question is: What makes in Pavlovian CONDITIONING, culminating in Rescorla and some sequences of events causal, thus licensing inference Wagner’s (1972) connectionist model and its variants, and involving similar events, and other sequences noncausal? has been adopted to apply to human causal reasoning (e.g., The problem of causal INDUCTION as posed by David Dickinson, Shanks, and Evenden 1984). This branch modi- HUME (1739/1987) began with the observation that causal fies Hume’s solution in a manner similar to the statistical relations are neither deducible nor explicit in the reasoner’s approach, but in addition provides algorithms for computing sensory input (where such input includes introspection as causal output from observational input. These connectionist well as external sensory input). 
Given that sensory input is variants of the regularity approach explain a wide range of the ultimate source of all information that a reasoner has, it empirical findings but have their shortcomings, such as a follows that all acquired causal relations must have been failure to explain the causal analog of the extinction of con- computed from (noncausal) sensory input in some way. A ditioned inhibition (for reviews, see Cheng 1997; Miller, fundamental question therefore arises for such relations: Barnet, and Grahame 1995). A common source of these How does a reasoner come to know that one thing, or type of shortcomings may be the inability of these variants to repre- thing, causes another? In other words, what is the mapping sent causal power as an explicit variable existing indepen- from observable events as input to causal relations as output? dently of its value (see BINDING PROBLEM). The solution Hume (1739/1987) proposed is that causal A second approach rejects all regularity solutions, and relations are inferred from the spatial and temporal contigu- claims to offer an alternative solution to causal inference: ity of the candidate cause c and the effect e, the temporal one infers a relation to be causal when one perceives or priority of c, and the constant conjunction between c and e. knows of a causal mechanism or causal power underlying For the butter example, Hume’s regularity approach might the relation (e.g., Koslowski 1996; Michotte 1963; Shultz explain that one concludes that heating causes butter to melt 1982; White 1995). Because power theorists do not explic- from the fact that the heat is close to the butter, melting fol- itly define causal “power” or causal “mechanism,” it is lows soon after heating, and whenever butter is heated to unclear whether heating, for example, qualifies as a causal 150°F, its melting follows. Similarly, this approach might mechanism for substances melting. 
explain that one is reluctant to believe that Goodman’s

Assuming that it does, then power theorists would predict that heating should be understood to cause butter to melt. In contrast, reasoners do not know of any mechanism involving Goodman’s pocket that would cause coins to be silver, and therefore would not believe that his pocket causes coins to be silver. Power theorists attempt to refute the regularity view by demonstrating that knowledge regarding specific causal powers influences causal judgments.

To regularity theorists, it is unclear what question the power approach seeks to answer; that question, however, is definitely not the one posed by Hume (Cheng 1993). If it is “What kind of causal inference do people, including infants, typically make in their everyday life?” then the answer is that they often make inferences based on prior causal knowledge (e.g., previously acquired knowledge that heating causes substances to melt). Regularity theorists, however, have no objection to the use of prior causal knowledge, as long as not all of that knowledge is innate; the kind of evidence offered by power theorists is therefore compatible with the regularity view (Cheng 1993; see Morris and Larrick 1995 for an example of an application of prior causal knowledge using a statistical approach). If the power solution were to be regarded as an answer to Hume’s problem, then it begs the question: How does acquired knowledge about the causal nature of mechanisms (e.g., heating as a cause of melting) come about? That is, how does a reasoner infer a causal mechanism from noncausal observations? The answer to this question (the same question that the regularity view attempts but fails to answer) is what ultimately explains why one believes that one relation is causal and another not.

In addition to their other problems, neither the regularity nor the power approach can explain the boundary conditions for causal inference (see Cheng 1997 for a review). For example, neither explains why controlling for alternative causes allows a regularity to imply causality.

A third approach to the psychology of causal inference inherited Hume’s problem, but modified his regularity solution radically by adding a Kantian framework that assumes an a priori notion of causal power. This notion differs critically from the causal knowledge presupposed by traditional power theorists in that it is general rather than specific (see INFANT COGNITION for assumptions regarding specific causal knowledge). According to this approach, the reasoner innately postulates that there exist such things as causes that have the power to produce an effect and causes that have the power to prevent an effect, and determines whether a regularity is causal by attempting to generate it with such general possible powers. By integrating the two previous approaches, this new power approach claims to explain a wide range of findings regarding causal inference, overcoming many problems that cripple earlier approaches (Cheng 1997). The same basic approach has been adopted by computer scientists and philosophers in the last decade to study how it is possible in principle to draw inferences about causal networks from patterns of probabilities (BAYESIAN NETWORKS; Pearl 1995; Spirtes, Glymour, and Scheines 1993). Although psychological work has begun on aspects of causal networks (Busemeyer, McDaniel, and Byun 1997; Spellman 1997), how humans and other animal species infer causal networks remains to be investigated.

See also CAUSATION; CONCEPTS; DEDUCTIVE REASONING; EXPLANATION-BASED LEARNING

—Patricia Cheng

References

Busemeyer, J., M. A. McDaniel, and E. Byun. (1997). Multiple input-output causal environments. Cognitive Psychology 32: 1–48.
Cheng, P. W. (1993). Separating causal laws from casual facts: pressing the limits of statistical relevance. In D. L. Medin, Ed., The Psychology of Learning and Motivation, 30. New York: Academic Press, pp. 215–264.
Cheng, P. W. (1997). From covariation to causation: a causal power theory. Psychological Review 104: 367–405.
Cheng, P. W., and K. J. Holyoak. (1995). Complex adaptive systems as intuitive statisticians: causality, contingency, and prediction. In H. L. Roitblat and J.-A. Meyer, Eds., Comparative Approaches to Cognitive Science. Cambridge, MA: MIT Press, pp. 271–302.
Dickinson, A., D. R. Shanks, and J. L. Evenden. (1984). Judgment of act-outcome contingency: the role of selective attribution. Quarterly Journal of Experimental Psychology 36A: 29–50.
Goodman, N. (1954/1983). Fact, Fiction, and Forecast. Fourth edition. Cambridge, MA: Harvard University Press.
Hilton, D. J., and B. R. Slugoski. (1986). Knowledge-based causal attribution: the abnormal conditions focus model. Psychological Review 93: 75–88.
Hume, D. (1739/1987). A Treatise of Human Nature. Second edition. Oxford: Clarendon Press.
Kant, I. (1781/1965). Critique of Pure Reason. London: Macmillan and Co.
Kelley, H. H. (1973). The processes of causal attribution. American Psychologist 28: 107–128.
Koslowski, B. (1996). Theory and Evidence: The Development of Scientific Reasoning. Cambridge, MA: MIT Press.
Michotte, A. E. (1946/1963). The Perception of Causality. New York: Basic Books.
Miller, R. R., R. C. Barnet, and N. J. Grahame. (1995). Assessment of the Rescorla-Wagner model. Psychological Bulletin 117: 363–386.
Morris, W. M., and R. Larrick. (1995). When one cause casts doubts on another: a normative analysis of discounting in causal attribution. Psychological Review 102: 331–355.
Pearl, J. (1995). Causal diagrams for experimental research. Biometrika 82(4): 669–710.
Rescorla, R. A., and A. R. Wagner. (1972). A theory of Pavlovian conditioning: variations in the effectiveness of reinforcement and nonreinforcement. In A. H. Black and W. F. Prokasy, Eds., Classical Conditioning II: Current Theory and Research. New York: Appleton-Century-Crofts, pp. 64–99.
Shultz, T. R. (1982). Rules of causal attribution. Monographs of the Society for Research in Child Development 47 (1).
Spellman, B. A. (1997). Crediting causality. Journal of Experimental Psychology: General 126: 1–26.
Spirtes, P., C. Glymour, and R. Scheines. (1993). Causation, Prediction and Search. New York: Springer-Verlag.
White, P. A. (1995). Use of prior beliefs in the assignment of causal roles: causal powers versus regularity-based accounts. Memory and Cognition 23: 243–254.

Further Readings

Hart, H. L., and A. M. Honoré. (1959/1985). Causation in the Law. 2nd ed. Oxford: Oxford University Press.
Mackie, J. L. (1974). The Cement of the Universe: A Study of Causation. Oxford: Clarendon Press.
Shanks, D. R., K. J. Holyoak, and D. L. Medin, Eds. (1996). The Psychology of Learning and Motivation, vol. 34: Causal Learning. New York: Academic Press.
Sperber, D., D. Premack, and A. J. Premack, Eds. (1995). Causal Cognition: A Multidisciplinary Debate. New York: Oxford University Press.

Causation

Basic questions in the philosophy of causation fall into two main areas. First, there are central metaphysical questions concerning the nature of causation, such as the following: What are causal laws? What is it for two states of affairs to be causally related? Which are primary—causal relations between states of affairs, or causal laws? How are causal facts related to noncausal facts? How can one explain the formal properties of causation—such as irreflexivity, asymmetry, and transitivity? What is the ground of the direction of causation?

Second, there are issues concerning the epistemology of causation. Can causal relations be directly observed? How can the existence of causal laws be established? What statistical methods can be used to confirm causal hypotheses, and how can those methods be justified?

Such metaphysical and epistemological issues first came sharply into focus as a result of David Hume’s penetrating scrutiny of causation, and the theses that he defended (Hume 1739–40, 1748). On the metaphysical side, HUME argued for the view that causal facts are reducible to noncausal facts, while, on the epistemological side, Hume argued that causal relations between events, rather than being directly observable, can only be known by establishing the existence of “constant conjunctions,” or general laws.

The major metaphysical choice is between realist and reductionist approaches to causation. According to the latter, all causal facts are logically supervenient upon the totality of noncausal states of affairs. It is logically impossible, then, for two possible worlds to disagree with respect to some causal fact while agreeing completely with respect to all noncausal facts.

Reductionist approaches to causation have dominated the philosophical landscape since the time of Hume, and many different accounts have been advanced. Three types of approaches are, however, especially important. First, there are approaches that start out from the general notion of a law of nature, then define the ideas of necessary and sufficient nomological conditions, and, finally, employ the latter concepts to explain what it is for one state of affairs to cause another (Mackie 1965). Second, there are approaches that employ subjunctive conditionals in an attempt to give a counterfactual analysis of causation (Lewis 1973, 1979, 1986). Third, there are probabilistic approaches, where the central idea is that a cause must, in some way, make its effect more likely (Reichenbach 1956; Good 1961–62; Suppes 1970; Eells 1991; Mellor 1995).

Each of these three types of approaches faces difficulties specific to it. The attempt to analyze causation in terms of nomological conditions, for example, is hard pressed to provide any account of the direction of causation—a problem that quickly becomes evident when one notices that one state of affairs may be a nomologically sufficient condition for another either because the former is causally sufficient for the latter, or because, on the contrary, the latter is causally necessary for the former.

In the case of counterfactual approaches to causation, a crucial problem is that traditional analyses of subjunctive conditionals employ causal notions. Alternative accounts have been proposed, involving similarity relations over possible worlds. But these alternative accounts are exposed to decisive objections.

Finally, there are also specific problems for probabilistic accounts, two of which are especially important. First, probabilistic accounts have struggled to find an interpretation of their central claim—that causes must, in some way, make their effects more likely—that is not open to counterexamples. Second, probabilistic approaches to causation typically involve the very counterintuitive consequence that a completely deterministic world could not contain any causally related events (Tooley 1987).

There are also other objections, however—of a very serious sort—that tell against all reductionist approaches. First, one can show that some worlds with probabilistic laws may agree with respect to all causal laws and all noncausal facts, yet differ with respect to causal relations between events. So some causal facts not only are not logically supervenient upon the totality of noncausal states of affairs, they are not even supervenient upon the combination of that totality together with all causal laws (Carroll 1994; Tooley 1987).

Second, there are arguments showing that no reductionist approach to causation can account for the direction of causation. One problem, for example, is that very simple worlds containing causally related events may be devoid of all of the noncausal features upon which reductionist accounts rely to define the direction of causation—such as increasing entropy and the presence of open causal forks. Another problem is that, given deterministic laws of an appropriate sort—such as, for example, the laws of Newtonian physics—one can show that, corresponding to some worlds where a reductionist account assigns the correct direction to causation, there will be inverted worlds where the direction of causation is opposite to that specified by any reductionist account (Tooley 1990b).

Given these difficulties, it is natural to explore realist alternatives, and the most plausible form of realism involves viewing causation as a theoretical relation between states of affairs. The development of this type of approach, however, presupposed solutions to two problems that confronted realist interpretations of theories in general. First, there was the semantical problem of how one could even make sense of a realist interpretation of theoretical terms. Second, there was the epistemological problem of how one could justify any statement containing theoretical terms when those terms were interpreted as referring to unobservable states of affairs.

It is not surprising, then, that until those obstacles were surmounted, reductionist approaches to causation held sway. Now, however, satisfactory answers to the above problems are available. Thus, in the case of the semantical problem, one promising approach involves the use of existential quantification ranging over properties and relations to assign a realist interpretation to theoretical terms (Lewis 1970), while, as regards the epistemological problem, there is now widespread acceptance of a type of inductive reasoning—variously referred to as the method of hypothesis, abduction, hypothetico-deductive reasoning, and inference to the best explanation—that will allow one to justify theoretical claims realistically construed.

These philosophical developments have made it possible, then, to take seriously the idea that causation is a theoretical relation. To construct such an account, however, one needs to set out an analytically true theory of causation, and, at present, only one such theory has been worked out in any detail (Tooley 1987, 1990a). It seems likely, however, both that more theories will be proposed in the near future, and that, given the difficulties that reductionism faces, an account of the nature of causation along realist lines will turn out to be correct.

Some philosophers have maintained that causation is a basic and unanalyzable relation that is directly observable (Anscombe 1971; Fales 1990). That view, however, is exposed to some very serious objections (Tooley 1990a). Of these, one of the most important concerns the fact that causal beliefs are often established on the basis of statistical information—using methods that, especially within the social sciences, are very sophisticated. But if causation is a basic and unanalyzable relation, how can noncausal, statistical information possibly serve to establish causal hypotheses?

Recently, this question of how causal hypotheses can be established using statistical information has been the subject of intense investigation. The basic approach has been, first, to identify fundamental principles relating causation to probability. Two such principles that have been suggested as very important, for example, derive from the work of Hans Reichenbach (1949, 1956):

The Screening Off Principle: If A causes C only via B, then, given B, A and C are statistically independent.

The Common Cause Principle: If A and B are statistically dependent, and neither causes the other, then there is a common cause of A and B.

Then, second, one attempts to show that those principles can be used to justify algorithms that will enable one to move from information about statistical relationships to conclusions about causal relations (Glymour et al. 1987; Glymour, Spirtes, and Scheines 1991; Spirtes, Glymour, and Scheines 1993).

This is an interesting and important research program. In its present form, however, it suffers from certain defects. First, some of the principles employed are unsound. Reichenbach’s common cause principle, for example, is shown to be false by the inverted worlds objection to causal reductionism mentioned above. Second, if reductionism is false, then it is a mistake to look for algorithms that will specify, for some set of statistical relationships, what the relevant causal relations must be, because, given a realist view of causation, different causal relations may underlie a given set of statistical relationships. A sound algorithm, accordingly, will generate only a probability distribution over possible causal relations.

The basic conclusion, in short, is that an investigation into the epistemology of causation cannot proceed in isolation from consideration of the metaphysics of causation, and if it turns out that a reductionist view of causation is untenable, then one needs to employ a realist account that connects causation to probability, and then isolate algorithms that can be justified on the basis of such an account.

See also CAUSAL REASONING; INDUCTION; MENTAL CAUSATION; REALISM AND ANTIREALISM; REDUCTIONISM; SUPERVENIENCE

—Michael Tooley

References

Anscombe, G. E. M. (1971). Causality and Determination. Cambridge: Cambridge University Press.
Carroll, J. W. (1994). Laws of Nature. Cambridge: Cambridge University Press.
Eells, E. (1991). Probabilistic Causality. Cambridge: Cambridge University Press.
Ehring, D. (1997). Causation and Persistence. New York: Oxford University Press.
Fales, E. (1990). Causation and Universals. London: Routledge.
Glymour, C., R. Scheines, P. Spirtes, and K. Kelly. (1987). Discovering Causal Structure. Orlando, FL: Academic Press.
Glymour, C., P. Spirtes, and R. Scheines. (1991). Causal inference. Erkenntnis 35: 151–189.
Good, I. J. (1961–62). A causal calculus I-II. British Journal for the Philosophy of Science 11: 305–318, 12: 43–51.
Hume, D. (1739–40). A Treatise of Human Nature. London: Millar.
Hume, D. (1748). An Enquiry Concerning Human Understanding. London: Millar.
Lewis, D. (1970). How to define theoretical terms. Journal of Philosophy 67: 427–446.
Lewis, D. (1973). Causation. Journal of Philosophy 70: 556–567. Reprinted, with postscripts, in Philosophical Papers, vol. 2. Oxford: Oxford University Press, 1986.
Lewis, D. (1979). Counterfactual dependence and time’s arrow. Noûs 13: 455–476. Reprinted, with postscripts, in Philosophical Papers, vol. 2. Oxford: Oxford University Press, 1986.
Lewis, D. (1986). Philosophical Papers, vol. 2. Oxford: Oxford University Press.
Mackie, J. L. (1965). Causes and conditions. American Philosophical Quarterly 2: 245–264.
Mackie, J. L. (1974). The Cement of the Universe. Oxford: Oxford University Press.
Mellor, D. H. (1995). The Facts of Causation. London: Routledge.
Menzies, P. (1989). Probabilistic causation and causal processes: a critique of Lewis. Philosophy of Science 56: 642–663.
Reichenbach, H. (1949). The Theory of Probability. Berkeley: University of California Press.
Reichenbach, H. (1956). The Direction of Time. Berkeley: University of California Press.
Salmon, W. C. (1980). Probabilistic causality. Pacific Philosophical Quarterly 61: 50–74.
Salmon, W. C. (1984). Scientific Explanation and the Causal Structure of the World. Princeton, NJ: Princeton University Press.
Sosa, E., and M. Tooley, Eds. (1993). Causation. Oxford: Oxford University Press.
Spirtes, P., C. Glymour, and R. Scheines. (1993). Causation, Prediction, and Search. New York: Springer-Verlag.
Strawson, G. (1989). The Secret Connexion: Causation, Realism, and David Hume. Oxford: Oxford University Press.
Suppes, P. (1970). A Probabilistic Theory of Causality. Amsterdam: North-Holland Publishing Company.
Tooley, M. (1987). Causation: A Realist Approach. Oxford: Oxford University Press.
Tooley, M. (1990a). The nature of causation: a singularist account. In David Copp, Ed., Canadian Philosophers. Canadian Journal of Philosophy Supplement 16: 271–322.
Tooley, M. (1990b). Causation: reductionism versus realism. Philosophy and Phenomenological Research 50: 215–236.
Tooley, M. (1996). Causation. In Donald M. Borchert, Ed., The Encyclopedia of Philosophy, Supplement. New York: Simon and Schuster Macmillan, pp. 72–75.
von Wright, G. H. (1971). Explanation and Understanding. Ithaca, NY: Cornell University Press.
Woodward, J. (1992). Realism about laws. Erkenntnis 36: 181–218.

Cellular Automata

See AUTOMATA; VON NEUMANN, JOHN

Cerebellum

The cerebellum constitutes 10 to 15 percent of the entire brain weight, about 140 grams in humans. Rolando (1809; see Dow and Moruzzi 1958) was the first who, by observing motor disturbances in an animal with a lesioned cerebellum, related the cerebellum to movement. Careful analyses of the motor disturbances so induced led Flourens (1824; see Dow and Moruzzi 1958) to conclude that the cerebellum is neither an initiator nor an actuator, but instead serves as a coordinator of movements. An animal with a damaged cerebellum still initiates and executes movement, but only in a clumsy manner. Flourens (1842) and Luciani (1891; see Dow and Moruzzi 1958) observed that motor disturbances caused in an animal by a partial lesion of the cerebellum were gradually compensated for due to the functional plasticity of cerebellar tissues. According to the current knowledge cited below, this plasticity is an expression of a learning capability of the cerebellum, which normally plays a role in MOTOR LEARNING, as in the cases of practicing sports and acquiring skilled movements. Early in the twentieth century, neurologists defined the unique symptoms, such as dysmetria and motor incoordination, of cerebellar diseases. Based on these classic observations, it has been thought that the major function of the cerebellum is to enable us to learn to perform movements accurately and smoothly. The extensive studies that have been performed over the past four decades have facilitated the formulation of comprehensive views on the structure of the cerebellum, what processes occur there, and what roles it plays not only in bodily but also in cognitive functions, even though some of the views are still hypothetical.

The cerebellar cortex contains an elaborate neuronal circuit composed of five types of cells: Purkinje, basket, stellate, Golgi, and granule cells (fig. 1; Eccles, Ito, and Szentagothai 1967). With respect to their synaptic action, these cells are inhibitory except for the granule cells, which are excitatory. Afferent signals from various precerebellar nuclei reach the cerebellar cortex and are relayed to the granule cells via mossy fibers, and to the dendrites of the Purkinje cells and other inhibitory cells via the axons of the granule cells, that is, parallel fibers. Afferent signals from the inferior olive in the medulla oblongata pass directly to the Purkinje cells via climbing fibers. L-glutamate is the neurotransmitter for the vast majority of mossy fibers and all granule cells, while GABA is the neurotransmitter for all inhibitory neurons.

Figure 1. Neuronal circuitry of the cerebellum. CC, cerebellar cortex; pc, Purkinje cell; bc, basket cell; st, stellate cell; pf, parallel fiber; gr, granule cell; go, Golgi cell; mf, mossy fiber; cf, climbing fiber; IO, inferior olive; PCN, precerebellar nuclei; CN, cerebellar nuclei; VN, vestibular nuclei.

The major signal flows in the cerebellum pass through the cells of origin for mossy fibers (PCN in fig. 1), granule cells (gr), and Purkinje cells (pc). Marr (1969), Albus (1971), and other theorists proposed that if climbing fiber signals modify granule cell-to-Purkinje cell synapses, the three-neuron structure would operate as a learning machine in the same way as the simple perceptron described by Rosenblatt (1962). It was a decade before long-term depression (LTD) was discovered to occur at parallel fiber-to-Purkinje cell synapses after conjunctive activation of these synapses together with climbing fiber-to-Purkinje cell synapses (Ito, Sakurai, and Tongroach 1982; Ito 1989). LTD occurs as a result of complex chemical processes involving a number of receptors and second messengers, eventually leading to the phosphorylation of glutamate receptors (Nakazawa et al. 1995).

The cerebellar cortex contains numerous small longitudinal microzones (Oscarsson 1976). Each microzone is paired with a small distinct group of neurons in a cerebellar or vestibular nucleus (CN, VN in fig. 1) to form a corticonuclear microcomplex that hereafter is referred to as a cerebellar chip or a chip (Ito 1984). In a cerebellar chip, input signals from various precerebellar nuclei activate the nuclear neurons that produce the output signals of the chip. This major signal path across a chip is accompanied by a sidepath through a microzone that receives input signals via mossy fibers and relays Purkinje cell signals to the nuclear neurons. Climbing fibers convey signals representing errors in the performance of the chip, as detected through various sensory pathways, which induce LTD in Purkinje cells. The LTD induction changes the signal flow through the microzone sidepath, thereby altering the signal flow across the chip. A cerebellar chip thus behaves as an adaptive unit in which input-output relationships are adaptively altered by error signals conveyed by climbing fibers.

The cerebellum is divided into the flocculonodular lobe and the corpus cerebelli, the latter being further divided into vermis, paravermis (intermediate part), and hemisphere. Cerebellar chips in the flocculonodular lobe, vermis, and paravermis are connected to the brain stem and spinal cord, and confer adaptiveness on reflexes (not only motor but also autonomic; for the vestibuloocular reflex, see Robinson 1976; Ito 1984; for eye-blink conditioned reflex, see Thompson 1987) and compound movements (for locomotion, see Yanagihara and Kondo 1996), which by themselves are stereotyped and nonadaptive. The role of the evolutionarily old part of the cerebellum is therefore to ensure the adaptiveness of the spinal cord and brain stem control functions in order to enable animals to survive in ever-changing environments.

With respect to cerebral control functions, cerebellar chips appear to play a different role, that is, the formation of an internal model of a controller or a control object. If, while a chip and the system to be modeled are supplied with common input signals, differences in their output signals are returned to the chip as error signals, the chip will gradually assume dynamic characteristics equivalent to those of the system to be modeled.

Cerebellar chips located in the paravermis are connected to the cerebral motor cortex in such a way that these chips constitute a model that mimics the dynamics of the skeletomuscular system (Ito 1984). The motor cortex thus becomes capable of performing a learned movement with precision by referring to the model in the cerebellum and not to the skeletomuscular system. Dysmetria, the failure to perform a precise reaching movement without visual feedback, could be due to the loss of such internal models. Another possibility is that a chip located in the cerebellar hemisphere forms a model that acts as a controller in place of the motor cortex (Kawato, Furukawa, and Suzuki 1987; Shidara et al. 1995). Learned movements could then be controlled unconsciously, yet accurately, by the cerebellum. These two model systems, one eliminating the need for sensory feedback and the other awareness from learned voluntary movement control, appear to represent different phases of motor learning conducted in different cerebellar areas.

Based on the parallel development of the cerebral association cortex and cerebellar hemispheres in primates, Leiner, Leiner, and Dow (1986) suggested that the lateralmost part of the cerebellar hemisphere is involved in cognitive rather than motor functions. Thought may occur as a result of the prefrontal association cortex acting as a controller upon images, ideas, or concepts encoded in the parietolateral association cortex as a control object. During thought repetition, a cerebellar chip may form a model of the parietolateral cortex or the prefrontal cortex. A repeatedly learned thought may thus be performed quickly yet accurately even without reference to the consequences of the thought or without conscious attention. Evidence suggesting such roles of the cerebellum as this is accumulating from studies on the human cerebellum using noninvasive techniques (see Schmahmann 1997).

See also MOTOR CONTROL; MOTOR LEARNING

—Masao Ito

References

Albus, J. S. (1971). A theory of cerebellar function. Mathematical Bioscience 10: 25–61.
Dow, R. E., and G. Moruzzi. (1958). The Physiology and Pathology of the Cerebellum. Minneapolis: University of Minnesota Press.
Eccles, J. C., M. Ito, and J. Szentagothai. (1967). The Cerebellum as a Neuronal Machine. New York: Springer-Verlag.
Ito, M. (1984). The Cerebellum and Neural Control. New York: Raven Press.
Ito, M. (1989). Long-term depression. Annual Review of Neuroscience 12: 85–102.
Ito, M. (1993). Movement and thought: identical control mechanisms by the cerebellum. Trends in Neuroscience 16: 448–450.
Ito, M., M. Sakurai, and P. Tongroach. (1982). Climbing fibre induced depression of both mossy fibre responsiveness and glutamate sensitivity of cerebellar Purkinje cells. Journal of Physiology, London 324: 113–134.
Kawato, M., K. Furukawa, and R. Suzuki. (1987). A hierarchical neuronal network model for control and learning of voluntary movement. Biological Cybernetics 57: 169–185.
Leiner, H. C., A. L. Leiner, and R. S. Dow. (1986). Does the cerebellum contribute to mental skill? Behavioral Neuroscience 100: 443–453.
Marr, D. (1969). A theory of cerebellar cortex. Journal of Physiology, London 202: 437–470.
Nakazawa, K., S. Mikawa, T. Hashikawa, and M. Ito. (1995). Transient and persistent phosphorylations of AMPA-type glutamate receptor subunits in cerebellar Purkinje cells. Neuron 1: 697–709.
Oscarsson, O. (1976). Spatial distribution of climbing and mossy fibre inputs into the cerebellar cortex. In O. Creutzfeldt, Ed., Afferent and Intrinsic Organization of Laminated Structures in the Brain. Berlin: Springer-Verlag, pp. 34–42.
Robinson, D. A. (1976). Adaptive gain control of vestibulo-ocular reflex by the cerebellum. Journal of Neurophysiology 39: 954–969.
Rosenblatt, F. (1962). Principles of Neurodynamics: Perceptron and the Theory of Brain Mechanisms. Washington, DC: Spartan Books.
Schmahmann, J. D. (1997). The Cerebellum and Cognition. San Diego: Academic Press.
Shidara, M., M. Kawano, H. Gomi, and M. Kawato. (1995). Inverse-dynamics encoding of eye movements by Purkinje cells in the cerebellum. Nature 365: 50–52.
Thompson, R. F. (1987). The neurobiology of learning and memory. Science 233: 941–947.
Yanagihara, D., and I. Kondo. (1996). Nitric oxide plays a key role in adaptive control of locomotion in cats. Proceedings of the National Academy of Sciences, USA 93: 13292–13297.

Further Readings

Palay, S. L., and V. Chan-Palay. (1974). The Cerebellar Cortex. New York: Springer-Verlag.

Cerebral Cortex

The cerebral cortex is a paired structure in the forebrain that is found only in mammals and that is largest (relative to body size) in humans (Herrick 1926; Jerison 1973). Its most distinctive anatomical features are (i) the very extensive internal connections between one part and another part, and (ii) its arrangement as a six-layered sheet of cells, many of which are typical pyramidal cells. Although the crumpled, folded surface of this sheet is responsible for the very characteristic appearance of the brains of large mammals, the cortex of small mammals tends to be smooth and unfolded. Where folds occur, each fold, or gyrus, is about 1/2 cm in width (Sarnat and Netsky 1981).

Imagine the crumpled sheet expanded to form a pair of balloons with walls 2.5 mm thick, each balloon with a diameter of 18 cm and a surface area close to 1000 cm². The pair weighs about 500 grams, contains about 2 × 10¹⁰ cells connecting with each other through some 10¹⁴ synapses, and through a total length of about 2 × 10⁶ km of nerve fiber—more than five times the distance to the moon (Braitenberg and Schuz 1991). The two balloons connect to each other through the corpus callosum, a massive tract of nerve fibers.

Almost all the synapses and nerve fibers connect cortical neurons to each other, but it is of course the connections with the rest of the animal that allow the cortex to control the animal’s highest behavior. The main connections are as follows.

The olfactory input from the nose probably represents the input to the primordial structure from which the cortex evolved. It goes directly to layer 1, the outermost layer, of a special region at the edge of the cortical sheet. Touch, hearing, and vision relay through thalamic nuclei that are situated near what would be the necks of the balloons, and these nuclei receive a large number of feedback connections from the cortical sheet as well as inputs from the sense organs. An important output pathway comes from the motor area, which is the region believed to be responsible for voluntary movement. The CEREBELLUM should be mentioned here because it is also concerned with the regulation and control of muscular movement; it has profuse connections to and from the cortex, and enlarged with it during evolution. It has recently become clear that it is concerned with some forms of CONDITIONING (Yeo, Hardiman, and Glickstein 1985; Thompson 1990).

The second set of pathways in and out of the cortex passes through two adjacent regions of modified cortical sheet called the archicortex and paleocortex that flank the six-layered neocortex (Sarnat and Netsky 1981). Paleocortex lies below the necks of the balloons and contains the rhinencephalon (nose brain), where smell information enters; it connects with other regions thought to be concerned with mood and EMOTION. Archicortex developed into the HIPPOCAMPUS, and is thought to be concerned with the laying down of memories. Like the paleocortex, it has connections with regions involved in mood, emotion, and endocrine control.

The words used to describe the higher mental capacities of animals with a large neocortex include CONSCIOUSNESS, free will, INTELLIGENCE, adaptability, and insight, but animals with much simpler brains learn well, so LEARNING should not be among these capacities (Macphail 1982). The comparative anatomists Herrick and Jerison emphasize that neocortically dominated animals show evidence of having acquired extensive general knowledge about the world, but laboratory learning experiments are usually designed to prevent previously acquired knowledge from influencing results, and are therefore not good tests for stored general knowledge. The evidence for such knowledge is that cortically dominant animals take advantage of the enormously complicated associative structure of their environments, and this could come about in two ways. A species could have genetically determined mechanisms, acquired through evolutionary selection, for taking advantage of the regular features of the environment, or they could have learned through direct experience. It seems most likely that animals with a dominant neocortex have a combination of the two means—they have the best of both worlds by combining genetic and individual acquisition of knowledge about their environments (Barlow 1994).

Neuropsychologists who have studied the defects that result from damage and disease to the cerebral cortex emphasize localization of function (Phillips, Zeki, and Barlow 1984): cognitive functions are disrupted in a host of different ways that can, to some extent, be correlated with the locus of the damage and the known connections of the damaged part. But there can be considerable recovery of a function that has been lost in this way, particularly following damage in infancy or childhood: in the majority of adults the left hemisphere is absolutely necessary for speech and language, but these capacities can develop in the right hemisphere following total loss of the left hemisphere in childhood.

Neurophysiologists have recorded the activity of individual nerve cells in the cortical sheet. This work leads to the view that the cortex represents the sensory input the animal is receiving, processes it for OBJECT RECOGNITION, and selects an appropriate motor output. Neurons in the primary visual cortex (also called striate cortex or V1) are selectively responsive to edge orientation, direction of MOTION, TEXTURE, COLOR, and disparity. These are the local properties of the image that, according to the laws of GESTALT PERCEPTION, lead to segregation of figure from ground. The neurons of V1 send their axons to adjacent extra-striate visual cortex, where the precise topographic mapping of the visual field found in V1 becomes less precise and information is collected together according to parameters (such as direction and velocity of motion) of the segregating features (Barlow 1981). These steps can account for the first stages of object recognition, but what happens at later stages is less clear.

Although there are considerable anatomical differences between the different parts of the cortical sheet, there are also great similarities, and its evolution as a whole prompts one to seek a similar function for it throughout. Here the trouble starts, for it is evident that comparative anatomists say it does one thing, neuropsychologists another, and neurophysiologists yet something else (Barlow 1994). These divergences result partly from the different spatial and time scales of the observations and experiments in different fields, for neurophysiology follows the activity of individual neurons from second to second, neuropsychologists are concerned with chronic, almost permanent defects resulting from damage to cortical areas containing several million cells, and behavioral observation is concerned with functions of the whole animal over intermediate periods of seconds and minutes.
But the cells everywhere have an unusual and similar form, which suggests they have a common function; an attractive hypothesis is that this is prediction.

Sense organs are slow, but an animal’s competitive survival often depends upon speedy response; therefore, a representation that is up to the moment, or even ahead of the moment, would be of very great advantage. Prediction depends upon identifying a commonly occurring sequential pattern of events at an early stage in the sequence and assuming that the pattern will be completed. This requires knowledge of the spatio-temporal sequences that commonly occur, and the critical or sensitive periods, which neurophysiologists have studied in the visual cortex (Hubel and Wiesel 1970; Movshon and Van Sluyters 1981) but which are also known to occur in the development of other cognitive systems, may be periods when spatio-temporal sequences that occur commonly encourage the development of neurons with a selectivity of response to these patterns. If such phase sequences, as HEBB (1949) called them, were recognized by individual cells, one would have a computational unit that, with appropriate connections, would be of selective advantage in an enormous range of circumstances. The survival value of neurons with the power of prediction could have led to the explosive enlargement of the neocortex that culminated in the human brain.

See also ADAPTATION AND ADAPTATIONISM; CORTICAL LOCALIZATION, HISTORY OF; EVOLUTION; HEMISPHERIC SPECIALIZATION; NEURON

—Horace Barlow

References

Barlow, H. B. (1981). Critical limiting factors in the design of the eye and visual cortex. The Ferrier Lecture, 1980. Proceedings of the Royal Society, London Series B 212: 1–34.
Barlow, H. B. (1994). What is the computational goal of the neocortex? In C. Koch and J. Davis, Eds., Large Scale Neuronal Theories of the Brain. Cambridge, MA: MIT Press.
Braitenberg, V., and A. Schuz. (1991). Anatomy of the Cortex: Statistics and Geometry. Berlin: Springer-Verlag.
Hebb, D. O. (1949). The Organisation of Behaviour. New York: Wiley.
Herrick, C. J. (1926). Brains of Rats and Men. Chicago: University of Chicago Press.
Hubel, D. H., and T. N. Wiesel. (1970). The period of susceptibility to the physiological effects of unilateral eye closure in kittens. Journal of Physiology 206: 419–436.
Jerison, H. J. (1973). Evolution of the Brain and Intelligence. New York: Academic Press.
Macphail, E. (1982). Brain and Intelligence in Vertebrates. New York: Oxford University Press.
Movshon, J. A., and R. C. Van Sluyters. (1981). Visual neural development. Annual Review of Psychology 32: 477–522.
Phillips, C. G., S. Zeki, and H. B. Barlow. (1984). Localisation of function in the cerebral cortex. Brain 107: 327–361.
Sarnat, H. B., and M. G. Netsky. (1981). Evolution of the Nervous System. New York: Oxford University Press.
Thompson, R. F. (1990). Neural mechanisms of classical conditioning in mammals. Philosophical Transactions of the Royal Society of London Series B 329: 171–178.
Yeo, C. H., M. J. Hardiman, and M. Glickstein. (1985). Classical conditioning of the nictitating membrane response of the rabbit (3 papers). Experimental Brain Research 60: 87–98; 99–113; 114–125.

Further Readings

Abeles, M. (1991). Corticonics: Neural Circuits of the Cerebral Cortex. Cambridge: Cambridge University Press.
Creuzfeldt, O. D. (1983). Cortex cerebri: leistung, strukturelle und functionelle Organisation der Hirnrinde. Berlin: Springer-Verlag. Translated by Mary Creuzfeldt et al. as “Cortex cerebri: performance, structural and functional organisation of the cortex,” Gottingen 1993.
Jones, E. G., and A. Peters. (1985). Cerebral Cortex. Five volumes. New York: Plenum Press.
Martin, R. D. (1990). Primate Origins and Evolution. London: Chapman and Hall.

Cerebral Specialization

See HEMISPHERIC SPECIALIZATION

Chess, Psychology of

Historically, chess has been one of the leading fields in the study of EXPERTISE (see De Groot and Gobet 1996 and Holding 1985 for reviews). This popularity as a research domain is explained by the advantages that chess offers for studying cognitive processes: (i) a well-defined task; (ii) the presence of a quantitative scale to rank chess players (Elo 1978); and (iii) cross-fertilization with research on game-playing in computer science and artificial intelligence.

Many of the key chess concepts and mechanisms later developed in cognitive psychology were anticipated by Adriaan De Groot’s book Thought and Choice in Chess (1946/1978). De Groot stressed the role of selective search, perception, and knowledge in expert chess playing. He also perfected two techniques that were often used in later research: recall of briefly presented material from the domain of expertise, and use of thinking-aloud protocols to study problem-solving behavior. His key empirical findings were that (i) world-class chess grandmasters do not search more, in number of positions considered and in depth of search, than weaker (but still expert) players; and (ii) grandmasters and masters can recall and replace positions (about two dozen pieces) presented for a few seconds almost perfectly, while weaker players can replace only a half dozen pieces.

De Groot’s theoretical ideas, based on Otto Selz’s psychology, were not as influential as his empirical techniques and results. It was only about twenty-five years later that chess research would produce a theory with a strong impact on the study of expertise and of cognitive psychology in general. In their chunking theory, Simon and Chase (1973) stressed the role of perception in skilled behavior, as did De Groot, but they added a set of elegant mechanisms. Their key idea was that expertise in chess requires acquiring a large collection of relatively small chunks (each at most six pieces) denoting typical patterns of pieces on the chess board. These chunks are accessed through a discrimination net and act as the conditions of a PRODUCTION SYSTEM: they evoke possible moves in this situation. In other respects, chess experts do not differ from less expert players: they have the same limits in memory (a short-term memory of about seven chunks) and learning rate (about eight seconds are required to learn a chunk). In chess, as well as in other domains, the chunking theory explains experts’ remarkable memory by their ability to find more and larger chunks, and explains their selective search by the fact that chunks evoke potentially good actions. Some aspects of the theory were implemented in a computer program by Simon and Gilmartin (1973). Simulations with this program gave a good fit to the behavior of a strong amateur and led to the estimate that expertise requires the presence of a large number of chunks, approximately between 10,000 and 100,000.

A wealth of empirical data was gathered to test the chunking theory in various domains of expertise. In chess, five directions of research may be singled out as critical: importance of perception and pattern recognition, relative role of short-term and long-term memories, evidence for chunks, role of higher-level knowledge, and size of search.

Converging evidence indicates that perceptual, pattern-based cognition is critical in chess expertise. The most compelling data are that EYE MOVEMENTS during the first few seconds when examining a new chess position differ between experts and nonmasters (De Groot and Gobet 1996), and that masters still play at a high level in speed chess games where they have only five seconds per move on average, or in simultaneous games where their thinking time is reduced by the presence of several opponents (Gobet and Simon 1996).

Research on MEMORY has led to apparently contradictory conclusions. On the one hand, several experiments on the effect of interfering tasks (e.g., Charness 1976) have shown that two of Simon and Chase’s (1973) assumptions—that storage into long-term memory is slow and that chunks are held in short-term memory—run into problems. This encouraged researchers such as Cooke et al. (1993) to emphasize the role of higher-level knowledge, already anticipated by De Groot (1946/1978). On the other hand, empirical evidence for chunks has also been mounting (e.g., Chi 1978; Gobet and Simon 1996; Saariluoma 1994).

Attempts to reconcile low-level and high-level types of encoding have recently been provided by the long-term WORKING MEMORY (LTWM) theory (Ericsson and Kintsch 1995) and by the template theory (Gobet and Simon 1996). LTWM proposes that experts build up both schema-based knowledge and domain-specific retrieval structures that rapidly encode the important elements of a problem. The template theory, based on the chunking theory and implemented as a computer program, proposes that chunks evolve into more complex data structures (templates), allowing some values to be encoded rapidly. Both theories also account for aspects of skilled perception and problem solving in chess.

Recent results indicate that stronger players search somewhat more broadly and deeply than weaker players (Charness 1981; Holding 1985), with an asymptote at high skill levels. In addition, the space searched remains small (thinking-aloud protocols indicate that grandmasters typically search no more than one hundred nodes in fifteen minutes). These results are compatible with a theory based on pattern recognition: chunks, which evoke moves or sequences of moves, make search more selective and allow better players to search more deeply.

While productive in its own terms, computer science research on chess (see GAME-PLAYING SYSTEMS) has had relatively little impact on the psychology of chess. The main advances have been the development of search techniques, which have culminated in the construction of DEEP BLUE, the first computer to have beaten a world champion in a match. More recently, chess has been a popular domain for testing MACHINE LEARNING techniques. Finally, attempts to use a production-system architecture (e.g., Wilkins 1980) have met with limited success in terms of the strength of the programs.

The key findings in chess research—selective search, pattern recognition, and memory for the domain material—have been shown to generalize to other domains of expertise. This augurs well for current interests in the field: integration of low- and high-level aspects of knowledge and unification of chess perception, memory, and problem-solving theories into a single theoretical framework.

See also COGNITIVE ARCHITECTURE; DOMAIN SPECIFICITY; HEURISTIC SEARCH; PROBLEM SOLVING

—Fernand Gobet

References

Charness, N. (1976). Memory for chess positions: resistance to interference. Journal of Experimental Psychology: Human Learning and Memory 2: 641–653.
Charness, N. (1981). Search in chess: age and skill differences. Journal of Experimental Psychology: Human Perception and Performance 2: 467–476.
Chi, M. T. H. (1978). Knowledge structures and memory development. In R. S. Siegler, Ed., Children’s Thinking: What Develops? Hillsdale, NJ: Erlbaum, pp. 73–96.
Cooke, N. J., R. S. Atlas, D. M. Lane, and R. C. Berger. (1993). Role of high-level knowledge in memory for chess positions. American Journal of Psychology 106: 321–351.
de Groot, A. D. (1978). Thought and Choice in Chess. The Hague: Mouton Publishers.
de Groot, A. D., and F. Gobet. (1996). Perception and memory in chess. Heuristics of the professional eye. Assen: Van Gorcum.
Elo, A. (1978). The Rating of Chess Players, Past and Present. New York: Arco.
Ericsson, K. A., and W. Kintsch. (1995). Long-term working memory. Psychological Review 102: 211–245.
Gobet, F., and H. A. Simon. (1996). Templates in chess memory: a mechanism for recalling several boards. Cognitive Psychology 31: 1–40.
Holding, D. H. (1985). The Psychology of Chess Skill. Hillsdale, NJ: Erlbaum.
Saariluoma, P. (1994). Location coding in chess. The Quarterly Journal of Experimental Psychology 47A: 607–630.
Simon, H. A., and W. G. Chase. (1973). Skill in chess. American Scientist 61: 393–403.
Simon, H. A., and K. J. Gilmartin. (1973). A simulation of memory for chess positions. Cognitive Psychology 5: 29–46.
Wilkins, D. (1980). Using patterns and plans in chess. Artificial Intelligence 14: 165–203.

Further Readings

Binet, A. (1966). Mnemonic virtuosity: a study of chess players. Genetic Psychology Monographs 74: 127–162. (Translated from the Revue des Deux Mondes (1893) 117: 826–859.)
Calderwood, B., G. A. Klein, and B. W. Crandall. (1988). Time pressure, skill, and move quality in chess. American Journal of Psychology 101: 481–493.
Charness, N. (1989). Expertise in chess and bridge. In D. Klahr and K. Kotovsky, Eds., Complex Information Processing: The Impact of Herbert A. Simon. Hillsdale, NJ: Erlbaum, pp. 183–208.
Chase, W. G., and H. A. Simon. (1973). Perception in chess. Cognitive Psychology 4: 55–81.
Frey, P. W., and P. Adesman. (1976). Recall memory for visually presented chess positions. Memory and Cognition 4: 541–547.
Freyhoff, H., H. Gruber, and A. Ziegler. (1992). Expertise and hierarchical knowledge representation in chess. Psychological Research 54: 32–37.
Fürnkranz, J. (1996). Machine learning in computer chess: the next generation. International Computer Chess Association Journal 19: 147–161.
Gobet, F., and H. A. Simon. (1996a). The roles of recognition processes and look-ahead search in time-constrained expert problem solving: evidence from grandmaster level chess. Psychological Science 7: 52–55.
Gobet, F., and H. A. Simon. (1996b). Recall of rapidly presented random chess positions is a function of skill. Psychonomic Bulletin and Review 3: 159–163.
Goldin, S. E. (1978). Effects of orienting tasks on recognition of chess positions. American Journal of Psychology 91: 659–671.
Hartston, W. R., and P. C. Wason. (1983). The Psychology of Chess. London: Batsford.
Newell, A., and H. A. Simon. (1972). Human Problem Solving. Englewood Cliffs, NJ: Prentice-Hall.
Pitrat, J. (1977). A chess combinations program which uses plans. Artificial Intelligence 8: 275–321.
Robbins, T. W., E. Anderson, D. R. Barker, A. C. Bradley, C. Fearnyhough, R. Henson, S. R. Hudson, and A. D. Baddeley. (1995). Working memory in chess. Memory and Cognition 24: 83–93.
Simon, H. A. (1979). Models of Thought, vol. 1. New Haven: Yale University Press.
Simon, H. A., and M. Barenfeld. (1969). Information processing analysis of perceptual processes in problem solving. Psychological Review 76: 473–483.

Chinese Room Argument

The Chinese room argument is a refutation of strong artificial intelligence. “Strong AI” is defined as the view that an appropriately programmed digital computer with the right inputs and outputs, one that satisfies the Turing test, would necessarily have a mind. The idea of Strong AI is that the implemented program by itself is constitutive of having a mind. “Weak AI” is defined as the view that the computer plays the same role in studying cognition as it does in any other discipline. It is a useful device for simulating and therefore studying mental processes, but the programmed computer does not automatically guarantee the presence of mental states in the computer. Weak AI is not criticized by the Chinese room argument.

The argument proceeds by the following thought experiment. Imagine a native English speaker, let’s say a man, who knows no Chinese locked in a room full of boxes of Chinese symbols (a data base) together with a book of instructions for manipulating the symbols (the program). Imagine that people outside the room send in other Chinese symbols which, unknown to the person in the room, are questions in Chinese (the input). And imagine that by following the instructions in the program the man in the room is able to pass out Chinese symbols that are correct answers to the questions (the output). The program enables the person in the room to pass the Turing test for understanding Chinese, but he does not understand a word of Chinese.

The point of the argument is this: if the man in the room does not understand Chinese on the basis of implementing the appropriate program for understanding Chinese, then neither does any other digital computer solely on that basis because no computer, qua computer, has anything the man does not have.

The larger structure of the argument can be stated as a derivation from three premises.

1. Implemented programs are by definition purely formal or syntactical. (An implemented program, as carried out by the man in the Chinese room, for example, is defined purely in terms of formal or syntactical symbol manipulations. The notion “same implemented program” specifies an equivalence class defined purely in terms of syntactical manipulations, independent of the physics of their implementation.)
2. Minds have mental or semantic contents. (For example, in order to think or understand a language you have to have more than just the syntax, you have to associate some meaning, some thought content, with the words or signs.)
3. Syntax is not by itself sufficient for, nor constitutive of, semantics. (The purely formal, syntactically defined symbol manipulations don’t by themselves guarantee the presence of any thought content going along with them.)

Conclusion: Implemented programs are not constitutive of minds. Strong AI is false.

Why does the man in the Chinese room not understand Chinese even though he can pass the Turing test for understanding Chinese? The answer is that he has only the formal syntax of the program and not the actual mental content or semantic content that is associated with the words of a language when a speaker understands that language. You can see this by contrasting the man in the Chinese room with the same man answering questions put to him in his native English. In both cases he passes the Turing test, but from his point of view there is a big difference. He understands the English and not the Chinese. In the Chinese case he is acting as a digital computer. In the English case he is acting as a normal competent speaker of English. This shows that the Turing test fails to distinguish real mental capacities from simulations of those capacities. Simulation is not duplication, but the Turing test cannot detect the difference.

There have been a number of attempts to answer this argument, all of them, in the view of this author, unsuccessful. Perhaps the most common is the systems reply: “While the man in the Chinese room does not understand Chinese, he is not the whole system. He is but the central processing unit, a simple cog in the large mechanism that includes room, books, etc. It is the whole room, the whole system, that understands Chinese, not the man.”

The answer to the systems reply is that the man has no way to get from the SYNTAX to the SEMANTICS, but neither does the whole room. The whole room also has no way of attaching any thought content or mental content to the formal symbols. You can see this by imagining that the man internalizes the whole room. He memorizes the rulebook and the data base, he does all the calculations in his head, and he works outdoors. All the same, neither the man nor any subsystem in him has any way of attaching any meaning to the formal symbols.

The Chinese room has been widely misunderstood as attempting to show a lot of things it does not show.

1. The Chinese room does not show that “machines can’t think.” On the contrary, the brain is a machine and brains can think.
2. The Chinese room does not show that “computers can’t think.” On the contrary, something can be a computer and can think. If a computer is any machine capable of carrying out a computation, then all normal human beings are computers and they think. The Chinese room shows that COMPUTATION, as defined by Alan TURING and others as formal symbol manipulation, is not by itself constitutive of thinking.
3. The Chinese room does not show that only brains can think. We know that thinking is caused by neurobiological processes in the brain, but there is no logical obstacle to building a machine that could duplicate the causal powers of the brain to produce thought processes. The point, however, is that any such machine would have to be able to duplicate the specific causal powers of the brain to produce the biological process of thinking. The mere shuffling of formal symbols is not sufficient to guarantee these causal powers, as the Chinese room shows.

See also COMPUTATIONAL THEORY OF MIND; FUNCTIONALISM; INTENTIONALITY; MENTAL REPRESENTATION

—John R. Searle

Further Readings

Searle, J. R. (1980). Minds, brains and programs. Behavioral and Brain Sciences, vol. 3 (together with 27 peer commentaries and author’s reply).

Chunking

See CHESS, PSYCHOLOGY OF; EXPLANATION-BASED LEARNING; FRAME-BASED SYSTEMS; METAREASONING

Church-Turing Thesis

Alonzo Church proposed at a meeting of the American Mathematical Society in April 1935, “that the notion of an effectively calculable function of positive integers should be identified with that of a recursive function.” This proposal of identifying an informal notion, effectively calculable function, with a mathematically precise one, recursive function, has been called Church’s thesis since Stephen Cole Kleene used that name in 1952. Alan TURING independently made a related proposal in 1936, Turing’s thesis, suggesting the identification of effectively calculable functions with functions whose values can be computed by a particular idealized computing device, a Turing machine. As the two mathematical notions are provably equivalent, the theses are “equivalent,” and are jointly referred to as the Church-Turing thesis.

The reflective, partly philosophical and partly mathematical, work around and in support of the thesis concerns one of the fundamental notions of mathematical logic. Its proper understanding is crucial for making informed and reasoned judgments on the significance of limitative results—like GÖDEL’S THEOREMS or Church’s theorem. The work is equally crucial for computer science, artificial intelligence, and cognitive psychology as it provides also for these subjects a basic theoretical notion. For example, the thesis is the cornerstone for Allen NEWELL’s delimitation of the class of physical symbol systems, that is, universal machines with a particular architecture. Newell (1980) views this delimitation “as the most fundamental contribution of artificial intelligence and computer science to the joint enterprise of cognitive science.” In a turn that had almost been taken by Turing (1948, 1950), Newell points to the basic role physical symbol systems have in the study of the human mind: “the hypothesis is that humans are instances of physical symbol systems, and, by virtue of this, mind enters into the physical universe . . . this hypothesis sets the terms on which we search for a scientific theory of mind.” The restrictive “almost” in Turing’s case is easily motivated: he viewed the precise mathematical notion as a crucial ingredient for the investigation of the mind (using computing machines to simulate aspects of the mind), but did not subscribe to a sweeping “mechanist” theory. It is precisely for an understanding of such—sometimes controversial—claims that the background for Church’s and Turing’s work has to be presented carefully. Detailed connections to investigations in cognitive science, programmatically indicated above, are at the heart of many contributions (cf. for example, COGNITIVE MODELING, COMPUTATIONAL LEARNING THEORY, and COMPUTATIONAL THEORY OF MIND).

The informal notion of an effectively calculable function, effective procedure, or algorithm had been used in nineteenth century mathematics and logic, when indicating that a class of problems is solvable in a “mechanical fashion” by following fixed elementary rules. Hilbert in 1904 already suggested taking formally presented theories as objects of mathematical study, and metamathematics has been pursued vigorously and systematically since the 1920s. In its pursuit concrete issues arose that required for their resolution a precise characterization of the class of effective procedures. Hilbert’s Entscheidungsproblem (see Hilbert and Bernays 1939), the decision problem for first order logic, was one such issue. It was solved negatively—relative to the precise notion of recursiveness, respectively to Turing machine computability; though obtained independently by Church and Turing, this result is usually called Church’s theorem. A second significant issue was the formulation of Gödel’s Incompleteness theorems as applying to all formal theories (satisfying certain representability and derivability conditions). Gödel had established the theorems in his groundbreaking 1931 paper for specific formal systems like type theory of Principia Mathematica or Zermelo-Fraenkel set theory. The general formulation required a convincing characterization of “formality” (see FORMAL SYSTEMS).

According to Kleene (1981) and Rosser (1984), Church proposed in late 1933 the identification of effective calculability with λ-definability. That proposal was not published at the time, but in 1934 Church mentioned it in conversation to Gödel, who judged it to be “thoroughly unsatisfactory.” In his subsequent Princeton Lectures Gödel defined the concept of a (general) recursive function using an equational calculus, but he was not convinced that all effectively calculable functions would fall under it. The proof of the equivalence between λ-definability and recursiveness (found by Church and Kleene in early 1935) led to Church’s first published formulation of the thesis as quoted above; it was reiterated in Church’s 1936 paper. Turing also introduced in 1936 his notion of computability by machines. Post’s 1936 paper contains a model of computation that is strikingly similar to Turing’s, but he did not provide any analysis in support of the generality of his model. On the contrary, he suggested considering the identification of effective calculability with his concept as a working hypothesis that should be verified by investigating ever wider formulations and reducing them to his basic formulation. The classical papers of Gödel, Church, Turing, Post, and Kleene are all reprinted in Davis 1965, and good historical accounts can be found in Davis 1982, Gandy 1988, and Sieg 1994.

Church (1936) presented one central reason for the proposed identification, namely that other plausible explications of the informal notion lead to mathematical concepts weaker than or equivalent to recursiveness. Two paradigmatic explications, calculability of a function via algorithms and in a logic, were considered by Church. In either case, the steps taken in determining function values have to be effective; if the effectiveness of steps is taken to mean recursiveness, then the function can be proved to be recursive. This requirement on steps in Church’s argument corresponds to one of the “recursiveness conditions” formulated by Hilbert and Bernays (1939). That condition is used in their characterization of functions that are evaluated according to rules in a deductive formalism: it requires the proof predicate for a deductive formalism to be primitive recursive. Hilbert and Bernays show that all such “reckonable” functions are recursive and actually can be evaluated in a very restricted number theoretic formalism. Thus, in any formalism that satisfies the recursiveness conditions and contains this minimal number theoretic system, one can compute exactly the recursive functions. Recursiveness or computability consequently has, as Gödel emphasized, an absoluteness property not shared by other metamathematical notions like provability or definability; the latter notions depend on the formalism considered.

All such indirect and ultimately unsatisfactory considerations were bypassed by Turing. He focused directly on the fact that human mechanical calculability on symbolic configurations was the intended notion. Analyzing the processes that underlie such calculations (by a computor), Turing was led to certain boundedness and locality conditions. To start with, he demanded the immediate recognizability of symbolic configurations so that basic computation steps need not be further subdivided. This demand and the evident limitation of the computor’s sensory apparatus motivate the conditions. Turing also required that the computor proceed deterministically. The above conditions, somewhat hidden in Turing’s 1936 paper, are here formulated following Sieg (1994); first the boundedness conditions:

(B.1) There is a fixed bound for the number of symbolic configurations a computor can immediately recognize.

(B.2) There is a fixed bound for the number of a computor’s internal states that need to be taken into account.

Because the behavior of the computor is uniquely determined by the finitely many combinations of symbolic configurations and internal states, he can carry out only finitely many different operations. These operations are restricted by the locality conditions:

(L.1) Only elements of observed configurations can be changed.

(L.2) The computor can shift his attention from one symbolic configuration to another only if the second is within a bounded distance from the first.

Thus, on closer inspection, Turing’s thesis is seen as the result of a two-part analysis. The first part yields the above conditions and Turing’s central thesis, that any mechanical procedure can be carried out by a computor satisfying these conditions. The second part argues that any number theoretic function calculable by such a computor is computable by a Turing machine. Both Church and Gödel found Turing’s analysis convincing; indeed, Church wrote (1937) that Turing’s notion makes “the identification with effectiveness in the ordinary (not explicitly defined) sense evident immediately.” From a strictly mathematical point of view, the analysis leaves out important steps, and the claim that is actually established is the more modest one that Turing machines operating on strings can be simulated by machines operating on single letters; a way of generalizing Turing’s argument is presented in Sieg and Byrnes (1996).

Two final remarks are in order. First, all the arguments for the thesis take for granted that the effective procedures are being carried out by human beings. Gandy, by contrast, analyzed in his 1980 paper machine computability; that notion crucially involves parallelism. Gandy’s mathematical model nevertheless computes only recursive functions. Second, the effective procedures are taken to be mechanical, not general cognitive ones—as claimed by Webb and many others. Also, Gödel was wrong when asserting in a brief note from 1972 that Turing intended to show in his 1936 paper that “mental procedures cannot go beyond mechanical procedures.” Turing, quite explicitly, had no such intentions; even after having been engaged in the issues surrounding machine intelligence, he emphasized in his 1953 paper that the precise concepts (recursiveness, Turing computability) are to capture the mechanical processes that can be carried out by human beings.

See also ALGORITHM; COMPUTATION; COMPUTATION AND THE BRAIN; LOGIC

—Wilfried Sieg

References

Church, A. (1935). An unsolvable problem of elementary number theory (abstract). Bulletin of the Amer. Math. Soc. 41: 332–333.
Church, A. (1936). An unsolvable problem of elementary number theory. Amer. J. Math. 58: 345–363.
Church, A. (1937). Review of Turing (1936). J. Symbolic Logic 2: 42–43.
Davis, M., Ed. (1965). The Undecidable: Basic Papers on Undecidable Propositions, Unsolvable Problems and Computable Functions. Raven Press.
Davis, M. (1982). Why Gödel didn’t have Church’s Thesis. Information and Control 54: 3–24.
Gandy, R. (1980). Church’s Thesis and principles for mechanisms. In Barwise, Keisler, and Kunen, Eds., The Kleene Symposium. North-Holland, pp. 123–148.
Gandy, R. (1988). The confluence of ideas in 1936. In R. Herken, Ed., The Universal Turing Machine. Oxford University Press, pp. 55–111.
Gödel, K. (1931). Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme I. In Collected Works I. Oxford: Oxford University Press, 1986, pp. 144–195.
Gödel, K. (1934). On undecidable propositions of formal mathematical systems (Princeton Lectures). In Collected Works I. Oxford: Oxford University Press, 1986, pp. 346–369.
Gödel, K. (1972). A philosophical error in Turing’s work. In Collected Works II. Oxford: Oxford University Press, 1986, pp. 306.
Gödel, K. (1986). Collected Works. Three volumes. S. Feferman et al., Eds. Oxford: Oxford University Press.
Hilbert, D., and P. Bernays. (1939). Grundlagen der Mathematik, vol. 2. Springer Verlag.
Kleene, S. (1952). Introduction to Metamathematics. Wolters-Noordhoff.
Kleene, S. (1981). Origins of recursive function theory. Annals Hist. Computing 3: 52–66.
Newell, A. (1980). Physical symbol systems. Cognitive Science 2: 135–184.
Post, E. (1936). Finite combinatory processes. Formulation 1. J. Symbolic Logic 1: 103–105.
Rosser, B. (1984). Highlights of the history of the lambda-calculus. Annals Hist. Computing 6: 337–349.
Sieg, W. (1994). Mechanical procedures and mathematical experience. In A. George, Ed., Mathematics and Mind.

Further Readings

Hilbert, D., and W. Ackermann. (1928). Grundzüge der theoretischen Logik. Springer Verlag.
Mundici, D., and W. Sieg. (1995). Paper machines. Philosophia Mathematica 3: 5–30.
Post, E. (1947). Recursive unsolvability of a problem of Thue. J. Symbolic Logic 12: 1–11.
Sieg, W. (1997). Step by recursive step: Church’s analysis of effective calculability. Bulletin of Symbolic Logic 3: 154–180.
Soare, R. (1996). Computability and recursion. Bulletin of Symbolic Logic 2: 284–321.

Civilization

See ARTIFACTS AND CIVILIZATION; TECHNOLOGY AND HUMAN EVOLUTION; WRITING SYSTEMS

Classification

See CATEGORIZATION; DECISION TREES; MACHINE LEARNING; NEURAL NETWORKS

Clustering

See UNSUPERVISED LEARNING

Codeswitching

Codeswitching (CS) is commonly defined as the alternating use of two or more codes in the same conversational event. The term was first employed to refer to the coexistence of more than one structural system in the speech of one individual by JAKOBSON, Fant, and Halle (1952), who use “code” in the abstract information theoretical sense. In later writings, “code” has come to be synonymous with “language” or “speech variety.” Recent research on CS falls
Oxford: within two distinct traditions: the syntactic, providing Oxford University Press, pp. 71–117. Sieg, W., and J. Byrnes. (1996). K-graph machines: generalizing insights into the linguistic principles that underlie the form Turing’s machines and arguments. In P. Hájek, Ed., Lecture that CS takes; and the pragmatic that relates linguistic form Notes in Logic 6. Springer Verlag, pp. 98–119. to function in everyday discourse. Turing, A. (1936). On computable numbers, with an application to Contrary to common assumptions, CS is most frequent the Entscheidungsproblem. Reprinted in M. Davis, Ed., (1965), among proficient multilinguals. CS may be intersentential The Undecidable: Basic Papers on Undecidable Propositions, or intrasentential, the latter exemplified in the English- Unsolvable Problems and Computable Functions. Raven Press. Spanish utterance, “Codeswitching among fluent bilinguals Turing, A. (1948). Intelligent Machinery. In D. C. Ince, Ed., Col- ha sido la fuente de numerosas investigaciones” (“has been lected Works of A. M. Turing: Mechanical Intelligence. North- the source of numerous studies”), and the English-Japanese, Holland, pp. 107–127. “That’s how you say it nihongo de” (“in Japanese”). The Turing, A. (1950). Computing machinery and intelligence. Mind 59: 433–460. status of such intrasentential CS had been much in dispute: Turing, A. (1953). Solvable and unsolvable problems. Science some linguists view it as indicative of imperfect language News 31: 7–23. acquisition or interference. However, later studies reveal van Heijenoort, J. (1967). From Frege to Gödel. Cambridge, MA: that intrasentential CS requires advanced competence in the Harvard University Press. syntactic systems involved. Particularly significant is the Webb, J. C. (1980). Mechanism, Mentalism, and Metamathemat- fact that intrasentential CS demonstrates grammatical regu- ics. Reidel. larities, reflecting underlying, unconscious principles that Webb, J. C. (1990). 
Introductory note to Gödel (1972). In Col- speakers rely on in distinguishing between permissible and lected Works II. Oxford: Oxford University Press, 1986, pp. unacceptable switches. 292–304. Codeswitching 119 The notion of grammatical equivalence has played an ture and the other with function. In one widely accepted important role in the syntactic analysis of CS. One early for- view, for any one speech event, specific codes count as malization is Poplack’s (1980) “Equivalence Constraint,” appropriate while others are marked (Myers-Scotton 1993). according to which codes will be switched at points where CS is said to convey information by virtue of the fact that the surface structures of the languages map onto each other. the markedness directly reflects societal values and ideolo- This premise has been challenged by studies on CS in typo- gies. For example, “standard” speech varieties are said to logically dissimilar languages (Romaine 1989). More convey authority because they are associated with official recently, researchers have introduced CS data into the dis- situations. But linguistic anthropologists criticize this cussion of universal grammar, as advanced in Chomsky’s approach on the grounds that it rests on a dichotomized principles and parameters framework (Chomsky 1981, view of structure and function that cannot account for situ- 1986), maintaining that the relevant constraints on CS ated understanding (Bourdieu 1991; Hanks 1995; Auer should exploit syntactic distinctions and relations already 1998). It is assumed that the signaling processes underlying extant in the grammar. This line of inquiry was initiated by interpretation involve both symbolic (i.e., denotational) and Woolford (1983), who developed a generative model of CS. 
indexical signs, which communicate via conventionalized Since that time, investigations into the properly syntactic associations between sign and context (Lucy 1993; Silver- principles underlying CS patterns have grown significantly stein 1993). In discourse, indexical signs function metaprag- in number and scope: Di Sciullo, Muysken, and Singh matically to evoke the mostly unverbalized contextual (1986) propose the Government Constraint, invoking this presuppositions on which assessments of communicative syntactic-theoretical hierarchical relation in disallowing CS intent rest. CS functions as one of a class of indexical signs between certain elements in the sentence; Belazi, Rubin, or contextualization cues (Gumperz 1992, 1996). Along and Toribio (1994) propose the Functional Head Constraint, with others of the same class (e.g., PROSODY AND INTONA- a specific application of the general X-bar theoretical pro- TION and rhythm), such signs are not lexically meaningful; cess of feature checking that holds between a functional they work by constructing the ground for situated interpreta- head and its complement (see X-BAR THEORY); and tion. By way of example, consider the following exchange. McSwan (1997) demonstrates the proper role of CS con- A third grader, D, is having difficulty with the adjectival use straints within Chomsky’s (1993) Minimalist Program. The of “surprising” in a workbook question about a story dis- validity of the aforementioned works relating CS to gram- cussed in class: What surprising discovery does Molly matical competence is further corroborated by investiga- make? His partner G makes several unsuccessful attempts to tions focusing on the development of CS ability in children explain, but D remains unconvinced until G finally comes acquiring multiple languages simultaneously. 
Especially up with the paraphrase: “A discovery que era [“that was”] noteworthy are writings by Jürgen Meisel (1990, 1994), surprising discovery.” Whereupon D finally produces the whose findings on the syntactic regularities underlying early expression “surprising discovery” on his own. The switch CS provide theoretical insights obscured in the investigation here counts as an indexical cue that reframes the issue, so as of monolingual acquisition. These developments make clear to relate the new expression to what D already knows that the study of CS has reached a dimension of inquiry that (Gumperz, Cook-Gumperz, and Szymanski 1998). can be informed by, and at once contribute to, the continued When seen in the perspective of practice, then, CS does advancement of syntactic theory. not convey propositional meaning nor does it directly con- Pragmatic approaches to CS deal with the relation vey societal attitudes: CS affects the situated interpretive between structure and function in everyday speech process by enabling participants to project particular inter- exchanges, and cover a wider, often more loosely defined pretations that are then confirmed or disconfirmed by what range of switching phenomena. Talk is treated as discourse happens in subsequent speech. What distinguishes CS from level intentional action, where actors communicate in the other metapragmatic signs is that it is always highly ideolo- context of social groupings, be they speech communities, gized, so that the sequential analysis of the interactive pro- social, ethnic, professional, or other interest groups (Hymes cess by which interpretations are agreed upon can be a 1967; Clark 1996), acting in pursuit of context-specific highly sensitive index, not solely of grammatical knowl- communicative ends (Grice 1989). The verbal resources of edge, but also of shared, culturally specific knowledge. 
such human populations are described in terms of inherently See also BILINGUALISM AND THE BRAIN; MINIMALISM; variable linguistic repertoires (Labov 1972) that, depending PRAGMATICS; PRESUPPOSITION; RELEVANCE AND RELE- on local circumstances, consist of either grammatically dis- VANCE THEORY; SYNTAX tinct languages or dialects, or styles of the same language —John Gumperz and Almeida Jacqueline Toribio (Gumperz 1964; Hymes 1967). The use of one or another of the available coexisting codes (languages, dialects, styles, or References speaking genres) serves a variety of rhetorical functions, for example to engage the listener, to shift footing, to mitigate Auer, P., Ed. (1998). Code-Switching in Conversation: Language, or strengthen a speech act, to mark reported speech, and to Interaction and Identity. London: Routledge. repair or clarify. In this way, language switching can be said Belazi, H. M., E. J. Rubin, and A. J. Toribio. (1994). Code- to be functionally equivalent to style shifting in monolin- switching and X-bar theory. Linguistic Inquiry 25 (2): 221– gual speech (Zentella 1997). 237. A common claim is that syntactic and pragmatic Bourdieu, P. (1991). Language and Symbolic Power. Cambridge: approaches complement each other, one dealing with struc- Polity Press. 120 Cognition and Aging Chomsky, N. (1981). Lectures on Government and Binding. Dor- Toribio, A. J., and E. J. Rubin. (1996). Code-switching in genera- drecht: Foris. tive grammar. In A. Roca and J. Jensen, Eds., Spanish in Con- Chomsky, N. (1986). Barriers. Cambridge, MA: MIT Press. tact: Issues in Bilingualism. Sommerville, MA: Cascadilla Chomsky, N. (1993). A minimalist program for linguistic theory. Press, pp. 203–226. In K. Hale and S. J. Keyser, Eds., The View From Building 20: Woolford, E. (1983). Bilingual code-switching and syntactic the- Essays in Linguistics in Honor of Sylvain Bromberger. Cam- ory. Linguistic Inquiry 14: 520–536. bridge, MA: MIT Press, pp. 1–52. Zentella, A. C. (1997). 
Growing Up Bilingual. Malden, MA: Blackwell Publishers.
Clark, H. (1996). Using Language. Cambridge: Cambridge University Press.
Di Sciullo, A. M., P. Muysken, and R. Singh. (1986). Government and code-mixing. Journal of Linguistics 22: 1–24.
Duran, R., Ed. (1988). Latino Language and Communicative Behavior. Norwood, NJ: ABLEX Publishing.
Goffman, E. (1982). The interaction order. American Sociological Review 48: 1–17.
Grice, P. (1989). Ways with Words. Cambridge, MA: Harvard University Press.
Gumperz, J. (1964). Linguistic and social interaction in two communities. American Anthropologist 6: 137–153.
Gumperz, J. (1981). Discourse Strategies. Cambridge: Cambridge University Press.
Gumperz, J. (1992). Contextualization and understanding. In A. Duranti and C. Goodwin, Eds., Rethinking Context. Cambridge: Cambridge University Press, pp. 229–252.
Gumperz, J. (1996). The linguistic and cultural relativity of inference. In J. Gumperz and S. Levinson, Eds., Rethinking Linguistic Relativity. Cambridge: Cambridge University Press, pp. 374–406.
Gumperz, J. J., J. Cook-Gumperz, and M. Szymanski. (1998). Collaborative Practice in Bilingual Learning. Santa Barbara: CREDE Research Report, University of California.
Hanks, W. (1995). Language and Communicative Practice. Boulder, CO: Westview Press.
Heller, M., Ed. (1988). Code-Switching: Anthropological and Sociolinguistic Perspectives. The Hague: Mouton de Gruyter.
Hymes, D. (1967). Models of the interaction of language and social setting. Journal of Social Issues 23: 8–28.
Jakobson, R., G. M. Fant, and M. Halle. (1952). Preliminaries to Speech Analysis: The Distinctive Features and Their Correlates. Cambridge, MA: MIT Press.
Labov, W. (1972). Sociolinguistic Patterns. Philadelphia: University of Pennsylvania.
Lucy, J. (1993). Introduction. In J. Lucy, Ed., Reflexive Language: Reported Speech and Metapragmatics. Cambridge: Cambridge University Press, pp. 9–32.
McSwan, J. (1997). A Minimalist Approach to Intrasentential Code Switching: Spanish-Nahuatl Bilingualism in Central Mexico. Ph.D. diss., University of California, Los Angeles. Published as McSwan, J. (1999). A Minimalist Approach to Intrasentential Code Switching. New York: Garland Press.
Meisel, J. (1990). Two First Languages: Early Grammatical Development in Bilingual Children. Dordrecht: Foris.
Meisel, J. (1994). Bilingual First Language Acquisition. Amsterdam: John Benjamins Publishing Company.
Milroy, L., and P. Muysken, Eds. (1995). One Speaker, Two Languages: Cross-Disciplinary Perspectives on Code-Switching. Cambridge: Cambridge University Press.
Myers-Scotton, C. (1993). Social Motivations of Code-Switching. Oxford: Clarendon Press.
Poplack, S. (1980). Sometimes I’ll start a sentence in English y termino en español: toward a typology of code-switching. Linguistics 18: 581–618.
Romaine, S. (1989). Bilingualism. Oxford: Blackwell.
Silverstein, M. (1993). Metapragmatic discourse and metapragmatic function. In J. Lucy, Ed., Reflexive Language: Reported Speech and Metapragmatics. Cambridge: Cambridge University Press, pp. 33–58.

Cognition and Aging

See AGING AND COGNITION; AGING, MEMORY, AND THE BRAIN

Cognitive Anthropology

Cognitive anthropology is a unified subfield of cultural anthropology whose principal aim is to understand and describe how people in societies conceive and experience their world (Casson 1994).

The definition of culture that guides research in cognitive anthropology holds that culture is an idealized cognitive system (a system of knowledge, beliefs, and values) that exists in the minds of members of society. Culture is the mental equipment that society members use in orienting, transacting, discussing, defining, categorizing, and interpreting actual social behavior in their society.

Among the many research topics in cognitive anthropology, three are central: cultural models, cultural universals, and CULTURAL CONSENSUS. The first of these will be the focus here.

Cultural models, often termed schemata, are abstractions that represent conceptual knowledge. They are cognitive structures in memory that represent stereotypical concepts. Schemata structure our knowledge of objects and situations, events and actions, and sequences of events and actions. General aspects of concepts are represented at higher levels in schematic structures, and variables associated with specific elements are represented at lower levels.

Items in the LEXICON (words) and grammatical categories and rules are associated in memory with cultural models. Linguistic forms and cognitive schemata “activate” each other: linguistic forms bring schemata to mind, and schemata are expressed in linguistic forms. Virtually all research strategies exploit this relationship between LANGUAGE AND THOUGHT in studying conceptual knowledge and cognitive systems.

The cognitive model underlying commercial events in our culture, a much discussed schema (e.g., Casson 1994), can serve as an example. The [Commercial Event] schema has the variables [buyer], [seller], [money], [goods], and [exchange] (brackets here distinguish conceptual units from words). In this schema, [buyer] is a person who possesses [money], the medium of exchange, and [seller] is a person who possesses [goods], the merchandise for sale; [exchange] is an interaction in which [buyer] gives [money] and gets [goods], while [seller] gives [goods] and gets [money]. An event is understood as a commercial transaction when persons, objects, and events in the environment are associated with appropriate schema variables.

A number of words (buy, sell, pay, cost, worth, value, spend, and charge) activate the [Commercial Event] schema. Each of these words selects particular aspects of the schema for highlighting or foregrounding, while leaving others in the background unexpressed. Buy focuses on the exchange from the buyer’s perspective, and sell from the seller’s perspective. Cost focuses on the money part of the money-goods relationship, and value and worth focus on the goods part of the relationship. Pay and spend focus on the buyer and the money part of the money-goods relationship, and charge focuses on the seller and the goods part of the money-goods relationship (Fillmore 1977).

Classification systems are complex cultural models structured by hierarchical embedding. Entities (objects, acts, and events) that are in fact different are grouped together in conceptual categories and regarded as equivalent. Semantic relationships among the categories define cognitive systems. Taxonomic hierarchies, or taxonomies, are classifications structured on the basis of the inclusion, or “kind of,” relationship. Some categories included in the tree category, for example, are oak, pine, elm, spruce, poplar, walnut, and fir. Oak in turn includes white oak, post oak, pin oak, and many other kinds of oak.

Nontaxonomic classifications of various types are also hierarchically structured. Partonomic classifications are organized in terms of part-whole relationships. The family category, for example, has among its members (or parts) mother, son, and sister. Functional classifications are constructed on the basis of the instrumental, or “used for,” relationship: a vehicle is any object that can be used for transportation, for example, car, bus, moped, or unicycle (Wierzbicka 1985).

Event scenarios are complex cultural models structured by horizontal linkages. Scenes in event schemata are linked in ordered sequences by way of causal relationships. The Yakan, a Philippine agricultural society living on Basilan Island in houses elevated on piles, have an event schema specifying “how to enter a Yakan house.” Social encounters are defined by the degree to which outsiders are able to negotiate penetration into households. An outsider progresses from “in the vicinity” of the house to “at” the house, from “below” the house to “on” the porch, from “outside” on the porch to “inside” the main room, and from the “foot zone” at the entrance door to the “head zone” opposite the door, which is the most private setting in the house (Frake 1975: 26–33).

Metaphorical cultural models are structured by conceptual METAPHORS. Abstract concepts that are not clearly delineated in experience, such as time, love, and ideas, are metaphorically structured, understood, and discussed in terms of other concepts that are more concrete in experience, such as money, travel, and foods (Lakoff and Johnson 1980). The metaphorical concept “embarrassment is exposure” is an example. The embarrassment schema is structured in terms of the exposure schema. The systematicity of the metaphor is reflected in everyday speech formulas, which are sources of insight into and evidence for the nature of the metaphor. Fixed-form expressions for “embarrassment is exposure” are evident in these sentences: “You really exposed yourself,” “He felt the weight of everyone’s eyes,” “I felt naked,” “I was caught with my pants down,” and “I wanted to crawl under a rock” (Holland and Kipnis 1994: 320–322).

Cultural universals are systems of conceptual knowledge that occur in all societies. In studying cognitive commonalities, anthropologists assume a “limited relativist,” or universalist, position, adopting a relativist view in recognizing differences in cognitive and cultural systems and a universalist position in emphasizing fundamental concepts and uniformities in these systems (Lounsbury 1969: 10).

Comparative color category research, for instance, has shown that basic color categories are organized around best examples, and that these focal colors are the same across individuals and languages (Berlin and Kay 1969). It has also established that there are exactly eleven of these universal color categories ([black], [white], [red], [green], [yellow], [blue], [brown], [purple], [orange], [pink], and [gray]), that they are encoded in a strict evolutionary sequence, and that these universals are determined largely by neurophysiological processes in human color perception (Kay, Berlin, and Merrifield 1991; see COLOR CATEGORIZATION).

Cultural consensus is concerned with individual variability in cultural knowledge and with how the diversity of individual conceptual systems is organized in cultural systems. Consensus theory examines the patterns of agreement among group members about particular domains of cultural knowledge in order to determine the organization of cognitive diversity. It establishes both a “correct” version of cultural knowledge and patterns of cognitive diversity (Romney, Weller, and Batchelder 1986: 316).

The Aguaruna Jivaro, a forest tribe in northern Peru, for example, derive the majority of their sustenance from manioc plants. A study of Aguaruna manioc gardens discovered that, although individual Aguaruna vary widely in their naming of manioc plants, they nonetheless maintain a consensus model of manioc classification. Patterns of agreement reveal that individuals learn a single set of manioc categories with varying degrees of success: some individuals have greater cultural competence in manioc identification than others (Boster 1985: 185).

See also CATEGORIZATION; CULTURAL PSYCHOLOGY; CULTURAL RELATIVISM; CULTURAL VARIATION; HUMAN UNIVERSALS; LANGUAGE AND CULTURE; METAPHOR AND CULTURE; NATURAL KINDS

—Ronald W. Casson

References

Berlin, B., and P. Kay. (1969). Basic Color Terms: Their Universality and Evolution. Berkeley: University of California Press.
Boster, J. S. (1985). “Requiem for the Omniscient Informant”: There’s life in the old girl yet. In J. W. D. Dougherty, Ed., Directions in Cognitive Anthropology. Urbana: University of Illinois Press.
Casson, R. W. (1994). Cognitive anthropology. In P. K. Bock, Ed., Handbook of Psychological Anthropology. Westport, CT: Greenwood Press.
Fillmore, C. J. (1977). Topics in lexical semantics. In R. W. Cole, Ed., Current Issues in Linguistic Theory. Bloomington: Indiana University Press.
Frake, C. O. (1975). How to enter a Yakan house. In M. Sanches and B. Blount, Eds., Sociocultural Dimensions of Language Use. New York: Academic Press.
Holland, D., and A. Kipnis. (1994). Metaphors for embarrassment and stories of exposure: the not-so-egocentric self in American culture. Ethos 22: 316–342.
Kay, P., B. Berlin, and W. Merrifield. (1991). Biocultural implications of systems of color naming. Journal of Linguistic Anthropology 1: 12–25.
Lakoff, G., and M. Johnson. (1980). Metaphors We Live By. Chicago: University of Chicago Press.
Lounsbury, F. G. (1969). Language and culture. In S. Hook, Ed., Language and Philosophy. New York: New York University Press.
Romney, A. K., S. C. Weller, and W. H. Batchelder. (1986). Culture as consensus: a theory of culture and informant accuracy. American Anthropologist 88: 313–338.
Wierzbicka, A. (1985). Lexicography and Conceptual Analysis. Ann Arbor, MI: Karoma Publishers, Inc.

Further Readings

Alverson, H. (1994). Semantics and Experience: Universal Metaphors of Time in English, Mandarin, Hindi, and Sesotho. Baltimore, MD: Johns Hopkins University Press.
Berlin, B. (1992). Ethnobiological Classification: Principles of Categorization of Plants and Animals in Traditional Societies. Princeton, NJ: Princeton University Press.
Brown, C. H. (1984). Language and Living Things: Uniformities in Folk Classification and Naming. New Brunswick, NJ: Rutgers University Press.
Casson, R. W., Ed. (1981). Language, Culture, and Cognition: Anthropological Perspectives. New York: Macmillan.
D’Andrade, R. G. (1995). The Development of Cognitive Anthropology. New York: Cambridge University Press.
Dougherty, J. W. D., Ed. (1985). Directions in Cognitive Anthropology. Urbana: University of Illinois Press.
Frake, C. O. (1980). Language and Cultural Description. Stanford, CA: Stanford University Press.
Goodenough, W. H. (1981). Culture, Language, and Society. Second edition. Menlo Park, CA: Benjamin/Cummings.
Hardin, C. L., and L. Maffi, Eds. (1997). Color Categories in Thought and Language. Cambridge: Cambridge University Press.
Holland, D., and N. Quinn, Eds. (1987). Cultural Models in Language and Thought. New York: Cambridge University Press.
Hunn, E. S. (1977). Tzeltal Folk Zoology: The Classification of Discontinuities in Nature. New York: Academic Press.
Kronenfeld, D. B. (1996). Plastic Glasses and Church Fathers: Semantic Extension from the Ethnoscience Tradition. New York: Oxford University Press.
Lakoff, G. (1987). Women, Fire, and Dangerous Things. Chicago: University of Chicago Press.
MacLaury, R. E. (1997). Color and Cognition in Mesoamerica: Constructing Categories as Vantages. Austin: University of Texas.
Scheffler, H. W., and F. G. Lounsbury. (1971). A Study in Structural Semantics: The Siriono Kinship System. Englewood Cliffs, NJ: Prentice-Hall.
Spradley, J. P., Ed. (1972). Culture and Cognition: Rules, Maps, and Plans. San Francisco: Freeman.
Tyler, S. A., Ed. (1969). Cognitive Anthropology. New York: Holt, Rinehart, and Winston.
Wallace, A. F. C. (1970). Culture and Personality. Second edition. New York: Random House.

Cognitive Archaeology

The term cognitive archaeology was introduced during the early 1980s to refer to studies of past societies in which explicit attention is paid to processes of human thought and symbolic behavior. As the archaeological record consists only of the material remains of past activities (artifacts, bones, pits, hearths, walls, buildings), there is no direct information about the types of belief systems or thought processes that existed within past minds. These must be inferred from those material remains. Cognitive archaeology attempts to do this, believing that appropriate interpretations of past material culture, the behavioral processes that created it, and long-term patterns of culture change evident from the archaeological record, such as the origin of agriculture and the development of state society, require that those belief systems and processes of thought be reconstructed.

There is a diversity of approaches and studies that fall under the poorly defined umbrella of cognitive archaeology (see Renfrew et al. 1993). These can be grouped into three broad categories that we can term postprocessual archaeology, cognitive-processual archaeology, and evolutionary-cognitive archaeology. While these three categories differ in significant ways with regard to both form and content, they also share some overriding features. The first is that an understanding of human behavior and society, whether in the distant past or the present, requires explicit reference to human cognition, although there is limited agreement on quite what form that reference should take. The second is that the study of past or present cognition cannot be divorced from the study of society in general: individuals are intimately woven together in shared frames of thought (Hodder, in Renfrew et al. 1993). Indeed, the study of past or present minds is hopelessly flawed unless it is integrated into a study of society, economy, technology, and environment. The third is that material culture is critical not only as an expression of human cognition, but also as a means to attain it.

Postprocessual studies, which began in the late 1970s, not only laid emphasis on the symbolic aspects of human behavior but also adopted a postmodernist agenda in which processes of hypothesis testing as a means of securing knowledge were replaced by hermeneutic interpretation (e.g., Hodder 1982, 1986). As such, these studies began as a reaction against what was perceived, largely correctly, as a crude functionalism that had come to dominate archaeological theory and attempted to provide a new academic agenda for the discipline, epitomized in a volume by Mike Shanks and Chris Tilley (1987) entitled Re-constructing Archaeology. While the critique of functionalism was warmly received and has had a long-lasting effect, it was soon recognized that the epistemology of relativism, the lack of explicit methodology, and the refusal to provide criteria to judge between competing interpretations constituted an appalling agenda for the discipline. Consequently, while such work was critical for the emergence of cognitive archaeology, it now plays only a marginal role within the discipline.

A contrasting type of cognitive archaeology has attempted to provide an equal emphasis on symbolic thought and ideology, but sought to do this within a scientific frame of reference in which claims about past beliefs and ways of thought can be objectively evaluated. As such, this archaeology has been characterized as a “cognitive-processual” archaeology by Colin Renfrew (Renfrew and Bahn 1991). This covers an extremely broad range of studies in which attention has been paid to ideology, religious thought, and cosmology (e.g., Flannery and Marcus 1983; Renfrew 1985; Renfrew and Zubrow 1993). Such studies argue that these aspects of human behavior and thought are as amenable to study as are the traditional subjects of archaeology, such as technology and subsistence, which leave more direct archaeological traces. Of course, when written records are available to supplement the archaeological evidence, reconstruction of past beliefs can be substantially developed (Flannery and Marcus, in Renfrew et al. 1993). One branch of this cognitive-processual archaeology has attempted to focus on processes of human DECISION-MAKING, and argued that explicit reference to individuals is required for adequate explanations of long-term cultural change. Perles (1992), for instance, has attempted to infer the cognitive processes of prehistoric flint knappers, while Mithen (1990) used computer simulations of individual decision making to examine the hunting behavior of prehistoric foragers. Another important feature has been an explicit concern with the process of cultural transmission. In such studies attempts have been made to understand how the processes of social learning are influenced by different forms of social organization (e.g., Mithen 1994; Shennan 1996). More generally, it is argued that the long-term patterns of culture change in the archaeological record, such as the introduction, spread, and then demise of particular artifact types (e.g., forms of axe head), can only be explained by understanding both the conscious and unconscious processes of social learning (Shennan 1989, 1991).

A third category of studies in cognitive archaeology,

tion and using the developmental stages proposed by PIAGET as models for stages of cognitive evolution. While there were other important attempts at inferring the mental characteristics of our extinct ancestors and relatives from their material culture, such as by Glynn Isaac (1986) and John Gowlett (1984), it was in fact a psychologist, Merlin Donald (1991), who was the first to propose a theory for cognitive evolution that made significant use of archaeological data in his book Origins of the Modern Mind.

His scenario, however, has been challenged by Mithen (1996a), who attempted to integrate current thought in EVOLUTIONARY PSYCHOLOGY with that in cognitive archaeology. As such, he argues that premodern humans (e.g., H. erectus, Neanderthals) had a domain-specific mentality and that this accounts for the particular character of their archaeological record. In his model, the origin of art, religious thought, and scientific thinking, all of which emerged rather dramatically about 30,000 years ago (70,000 years after anatomically modern humans appear in the fossil record), arose from a new-found ability to integrate ways of thinking and types of knowledge that had been “trapped” in specific cognitive domains. It is evident that the remarkable development of culture in the past 30,000 years, and especially the cumulative character of its knowledge (something that had been absent from all previous human cultures), is partly attributable to the disembodiment of mind into material culture. For example, the first art objects included those that extended memory from its biological basis in the brain to a material basis in terms of symbolic codes engraved on pieces of bone or in paintings on cave walls (e.g., Mithen 1988; Marshack 1991; D’Errico 1995). Depictions of imaginary beings are not simply reflections of mental representations, but are critical in allowing those representations to persist and to be transmitted to other individuals, perhaps across several generations (Mithen 1996b). In this regard, material culture plays an active role in formulating thought and transmitting ideas, and is not simply a passive reflection of these. Whether or not this particular scenario from evolutionary-cognitive archaeology has any merit remains to be seen.
But it is one example of the major although one that could be subsumed within cognitive- development of cognitive archaeology—in all of its processual archaeology, consists of those that are concerned guises—that has occurred during the last two decades. One with the EVOLUTION of the human mind and that can be must anticipate substantial future developments, especially referred to as an evolutionary-cognitive archaeology. As the if greater interdisciplinary research between archaeologists, archaeological record begins 2.5 million years ago with the biological anthropologists, and cognitive scientists can be first stone tools, it covers the period of brain enlargement achieved. and the evolution of modern forms of language and intelli- See also CULTURAL EVOLUTION; CULTURAL RELATIVISM; gence. While the fossil record can provide data about brain CULTURAL SYMBOLISM; DOMAIN SPECIFICITY; LANGUAGE size, anatomical adaptations for speech, and brain morphol- AND CULTURE; TECHNOLOGY AND HUMAN EVOLUTION ogy (through the study of endocasts), the archaeological —Steven J. Mithen record is an essential means to reconstruct the past thought and behavior of our ancestors, and the selective pressures for References cognitive evolution. Consequently, studies of human fossils and artifacts need to be pursued in a very integrated fashion Donald, M. (1991). Origins of the Modern Mind. Cambridge, MA: if we are to reconstruct the evolution of the human mind. Harvard University Press. The last decade has seen very substantial developments D’Errico, F. (1995). A new model and its implications for the ori- in this area, although significant contributions had already gin of writing: the la Marche antler revisited. Cambridge been made by Wynn (1979, 1981). He attempted to infer the Archaeological Journal 5: 163–206. levels of intelligence of human ancestors from the form of Flannery, K. V., and J. Marcus. (1983). The Cloud People. 
New early prehistoric stone tools by adopting a recapitualist posi- York: Academic Press. 124 Cognitive Architecture tion of the functions and capacities of each, and a blueprint Gowlett, J. (1984). Mental abilities of early man: a look at some hard evidence. In R. Foley, Ed., Hominid Evolution and Com- to integrate the systems. Such theories are designed around munity Ecology. London: Academic Press, pp. 167–192. a small set of principles of operation. Theories of cognitive Hodder, I. (1986). Reading the Past. Cambridge: Cambridge Uni- architecture can be contrasted with other kinds of cognitive versity Press. theories in providing a set of principles for constructing Hodder, I., Ed. (1982). Symbolic and Structural Archaeology. cognitive models, rather than a set of hypotheses to be Cambridge: Cambridge University Press. empirically tested. Isaac, G. (1986). Foundation stones: early artefacts as indicators of Theories of cognitive architecture can be roughly divided activities and abilities. In G. N. Bailey and P. Callow, Eds., according to two legacies: those motivated by the digital Stone Age Prehistory. Cambridge: Cambridge University Press, computer and those based on an associative architecture. pp. 221–241. The currency of the first kind of architecture is information Marshack, A. (1991). The Tai plaque and calendrical notation in the Upper Palaeolithic. Cambridge Archaeological Journal 1: 25–61. in the form of symbols; the currency of the second kind is Mithen, S. (1988). Looking and learning: Upper Palaeolithic art activation that flows through a network of associative links. and information gathering. World Archaeology 19: 297–327. The most common digital computer architecture is called Mithen, S. (1990). Thoughtful Foragers: A Study of Prehistoric the VON NEUMANN architecture in recognition of the contri- Decision Making. Cambridge: Cambridge University Press. butions of the mathematician John von Neumann to its Mithen, S. (1994). 
Technology and society during the Middle development. The key idea, the stored-program technique, Pleistocene. Cambridge Archaeological Journal 4: 3–33. allows program and data to be stored together. The von Neu- Mithen, S. (1996a). The Prehistory of the Mind: A Search for the mann architecture consists of a central processing unit, a Origins of Art, Science and Religion. London: Thames and memory unit, and input and output units. Information is Hudson. input, stored, and transformed algorithmically to derive an Mithen, S. (1996b). The supernatural beings of prehistory: the cul- tural storage and transmission of religious ideas. In C. Scarre output. The critical role played by this framework in the and C. Renfrew, Eds., External Symbolic Storage. Cambridge: development of modern technology helped make the COM- McDonald Institute for Archaeological Research (forthcoming). PUTATIONAL THEORY OF MIND seem viable. The framework Perles, C. (1992). In search of lithic strategies: a cognitive has spawned three classes of theories of cognitive architec- approach to prehistoric chipped stone assemblages. In J-C. Gar- ture, each encompassing several generations. The three din and C. S. Peebles, Eds., Representations in Archaeology. classes are not mutually exclusive; they should be under- Bloomington: Indiana University Press, pp. 357–384. stood as taking different perspectives on cognitive organiza- Renfrew, C., (1985). The Archaeology of Cult, the Sanctuary at tion that result in different performance models. Phylakopi. London: Thames and Hudson. The original architecture of this type was a PRODUCTION Renfrew, C., and P. Bahn. (1991). Archaeology: Theories, Methods SYSTEM. In this view, the mind consists of a working mem- and Practice. London: Thames and Hudson. Renfrew, C., C. S. Peebles, I. Hodder, B. Bender, K. V. Flannery, ory, a large set of production rules, and a set of precedence and J. Marcus. (1993). What is cognitive archaeology? 
Cam- rules determining the order of firing of production rules. A bridge Archaeological Journal 3: 247–270. production rule is a condition-action pair specifying actions Renfrew, C., and E. Zubrow, Eds. (1993). The Ancient Mind. Cam- to perform if certain conditions are met. The first general bridge: Cambridge Univerity Press. theory of this type was proposed by NEWELL, Simon, and Shanks, M., and C. Tilley. (1987). Re-Constructing Archaeology. Shaw (1958) and was called the General Problem Solver Cambridge: Cambridge University Press. (GPS). The idea was that a production system incorporating Shennan, S. J. (1989). Cultural transmission and cultural change. a few simple heuristics could solve difficult problems in the In S. E. van der Leeuw and R. Torrence, Eds., What’s New? A same way that humans did. A descendant of this approach, Closer Look at the Process of Innovation. London: Unwin SOAR (Newell 1990), elaborates the production system Hyman, pp. 330–346. architecture by adding mechanisms for making decisions, Shennan, S. J. (1991). Tradition, rationality and cultural transmis- sion. In R. Preucel, Ed., Processual and Postprocessual for recursive application of operators to a hierarchy of goals Archaeologies: Multiple Ways of Knowing the Past. Carbon- and subgoals, and for learning of productions. The architec- dale: Center for Archaeological Investigations, Southern Illi- ture has been applied to help understand a range of human nois University at Carbondale, pp. 197–208. performance from simple stimulus-response tasks, to typ- Shennan, S. J. (1996). Social inequality and the transmission of ing, syllogistic reasoning, and more. cultural traditions in forager societies. In S. Shennan and J. A second class of von Neumann-inspired cognitive archi- Steele, Eds., The Archaeology of Human Ancestry: Power, Sex tecture is the information processing theory. Unlike produc- and Tradition. London: Routledge, pp. 365–379. 
tion systems, which posit a particular language of symbolic Wynn, T. (1979). The intelligence of later Acheulian hominids. transformation, information processing theories posit a Man 14: 371–391. sequence of processing stages from input through encoding, Wynn, T. (1981). The intelligence of Oldowan hominids. Journal of Human Evolution 10: 529–541. memory storage and retrieval, to output. All such theories assume the critical components of a von Neumann architec- ture: a central executive to control the flow of information, Cognitive Architecture one or more memories to retain information, sensory devices to input information, and an output device. The crit- Cognitive architecture refers to the design and organization ical issues for such theories concern the nature and time of the mind. Theories of cognitive architecture strive to pro- course of processing at each stage. An early example of vide an exhaustive survey of cognitive systems, a descrip- such a theory is Broadbent’s (1958) model of ATTENTION, Cognitive Architecture 125 the imprint of which can be found on the “modal” informa- (1949) book, The Organization of Behavior. HEBB attempted tion processing theory, whose central distinction is between to account for psychological phenomena using a theory of short-term and long-term memory (e.g., Atkinson and Shif- neural connections (cell assemblies) that could be neuro- frin 1968), and on later models of WORKING MEMORY. physiologically motivated, in part by appeal to large-scale The digital computer also inspired a class of cognitive cortical organization. Thus, brain architecture became a architecture that emphasizes veridical representation of the source of inspiration for cognitive architecture. Hebb’s con- structure of human knowledge. The computer model distin- ception was especially successful as an account of percep- guishes program from data, and so the computer modeler tual learning. 
The two remaining antecedents involve has the option of putting most of the structure to be repre- technical achievements that led to a renewed focus on asso- sented in the computer program or putting it in the data that ciative models in the 1980s. Earlier efforts to build associa- the program operates on. Representational models do the tive devices resulted in machines that were severely limited latter; they use fairly sophisticated data structures to model in the kinds of distinctions they were able to make (they organized knowledge. Theories of this type posit two mem- could only distinguish linearly separable patterns). This lim- ory stores: a working memory and a memory for structured itation was overcome by the introduction of a learning algo- data. Various kinds of structured data formats have been rithm called “backpropagation of error” (Rumelhart, Hinton, proposed, including frames (Minsky 1975), SCHEMATA and Williams 1986). The second critical technical achieve- (Rumelhart and Ortony 1977), and scripts (Schank and ment was a set of proofs, due in large part to Hopfield (e.g., Abelson 1977), each specializing in the representation of 1982), that provided new ways of interpreting associative different aspects of the world (objects, events, and action computation and brought new tools to bear on the study of sequences, respectively). What the formats have in common associative networks. These proofs demonstrated that certain is that they (i) represent “default” relations that normally kinds of associative networks could be interpreted as opti- hold, though not always; (ii) have variables, so that they can mizing mathematical functions. This insight gave theorists a represent relations between abstract classes and not merely tool to translate a problem that a person might face into an individuals; (iii) can embed one another (hierarchical orga- associative network. 
This greatly simplified the process of nization); and (iv) are able to represent the world at multiple constructing associative models of cognitive tasks. levels of abstraction. These achievements inspired renewed interest in associa- The second type of cognitive architecture is associative. tive architectures. In 1986, Rumelhart and McClelland pub- In contrast with models of the von Neumann type, which lished a pair of books on parallel, distributed processing that assume that processing involves serial, rule-governed opera- described a set of models of different cognitive systems (e.g., tions on symbolic representations, associative models memory, perception, and language) based on common asso- assume that processing is done by a large number of parallel ciative principles. The work lent credence to the claim that operators and conforms to principles of similarity and conti- an integrated associative architecture could be developed. guity. For example, an associative model of memory Although a broad division between von Neumann and explains how remembering part of an event can cue retrieval associative architectures helps to organize the various con- of the rest of the event by claiming that an association ceptions of mental organization that have been offered, it between the two parts was constructed when the event was does an injustice to hybrid architectural proposals; that is, first encoded. Activation from the representation of the first proposals that include von Neumann-style as well as asso- part of the event flows to the representation of the second ciative components. Such alliances of processing systems part through an associative connection. More generally, the seem necessary on both theoretical and empirical grounds first part of the event cues associative retrieval of the entire (Sloman 1996). 
Only von Neumann components seem event (and thus the second part) by virtue of being similar to capable of manipulating variables in a way that matches the entire event. human competence (see BINDING PROBLEM), yet associa- Associative models have a long history stretching back to tive components seem better able to capture the context- Aristotle, who construed MEMORY and some reasoning pro- specificity of human judgment and performance as well as cesses in terms of associations between elementary sense people’s ability to deal with and integrate many pieces of images. More recent associative models are more promiscu- information simultaneously. One important hybrid theory ous: different models assume associations between different is ACT* (Anderson 1983). ACT* posits three memories: a entities, concepts themselves, or some more primitive set of production, a declarative, and a working memory, as well elements out of which concepts are assumed to be con- as processes that interrelate them. The architecture structed. includes both a production system and an associative net- Such modern conceptions of associative cognitive archi- work. In this sense, ACT* is an early attempt to build an tecture have two antecedents located in the history of cogni- architecture that takes advantage of both von Neumann tive science and two more immediate precursors. The first and associative principles. But integrating these very dif- historical source is the foundational work on associative ferent attitudes in a principled and productive way is an computation begun by MCCULLOCH and PITTS (1943) dem- ongoing challenge. onstrating the enormous computational power of popula- See also AUTOMATA; BROADBENT; COGNITIVE MODEL- tions of neurons and the ability of such systems to learn ING, CONNECTIONIST; COGNITIVE MODELING, SYMBOLIC; using simple algorithms. The second source is the applica- DISTRIBUTED VS. 
LOCAL REPRESENTATION tion of associative models based on neurophysiology to psy- —Steven Sloman chology. An influential synthesis of these efforts was Hebb’s 126 Cognitive Artifacts classification and comparison. Goody (1977) argues that the References advent of WRITING SYSTEMS fundamentally transformed Anderson, J. R. (1983). The Architecture of Cognition. Cambridge, human cognition. Nonlinguistic inscriptions such as maps, MA: Harvard University Press. charts, graphs, and tables enable the superimposition of rep- Atkinson, R. C., and R. M. Shiffrin. (1968). Human memory: a resentations of otherwise incommensurable items (Latour proposed system and its control processes. In K. W. Spence and 1986). Tabular formats for data are at least three thousand J. T. Spence, Eds., The Psychology of Learning and Motivation: years old (Ifrah 1987), and support reasoning about the Advances in Research and Theory, vol. 2. New York: Academic coordination of differing category structures, types, and Press, pp. 89–195. quantities of goods, for example. Broadbent, D. E. (1958). Perception and Communication. London: People often engage in activities characterized by the Pergamon Press. Hebb, D. O. (1949). The Organization of Behavior. New York: incremental creation and use of cognitive artifacts. Doing Wiley. place-value arithmetic amounts to successively producing Hopfield, J. J. (1982). Neural networks and physical systems with artifact structure, examining it, and then producing more emergent collective computational abilities. Proceedings of the structure (Rumelhart et al. 1986). Everyday tasks such as National Academy of Sciences, USA 79: 2554–2558. cooking involve a continuous process of creating and using McCulloch, W. S., and W. Pitts. (1943). A logical calculus of the cognitive artifacts. Kirsh (1995) refers to the systematic cre- ideas immanent in nervous activity. Bulletin of Mathematical ation and use of spatial structure in the placement of cook- Biophysics 5: 115–133. 
ing implements and ingredients as the intelligent use of Minsky, M. (1975). A framework for representing knowledge. In P. space. Here, the arrangement of artifacts is itself a cognitive Winston, Ed., The Psychology of Computer Vision. New York: artifact. McGraw-Hill. Newell, A. (1990). Unified Theories of Cognition. Cambridge: Norman (1993) relaxes the definition of cognitive arti- Harvard University Press. facts to include mental as well as material elements. Rules Newell, A., H. A. Simon, and J. C. Shaw. (1958). Elements of a of thumb, proverbs, mnemonics, and memorized procedures theory of human problem solving. Psychological Review 65: are clearly artifactual and play a similar role to objects in 151–166. some cognitive processes (Shore 1996). Of course, material Rumelhart, D. E., and A. Ortony. (1977). The representation of cognitive artifacts are only useful when they are brought knowledge in memory. In R. C. Anderson, R. J. Spiro, and W. into coordination with a corresponding mental element— E. Montague, Eds., Schooling and the Acquisition of Knowl- the knowledge of how to use them. edge. The behaviors of other actors in a social setting can serve Rumelhart, D. E., G. E. Hinton, and R. J. Williams. (1986). Learn- as cognitive artifacts. The work of VYGOTSKY (Vygotsky ing internal representations by error propagation. In D. E. Rumelhart, J. L. McClelland, and the PDP Research Group, 1978, 1986; Wertsch 1985) on activity theory emphasizes Eds., Parallel Distributed Processing, 1. Cambridge, MA: MIT the role of others in creating a “zone of proximal develop- Press. ment” in which the learning child is capable of cognitive Schank, R. C., and R. Abelson. (1977). Scripts, Plans, Goals, and activities that it could not do alone. Activity theory takes Understanding. Hillsdale, NJ: Erlbaum. words and concepts to be powerful psychological tools that Sloman, S. A. (1996). 
The empirical case for two systems of rea- organize thought and make higher level cognitive processes soning. Psychological Bulletin 119: 3–22. possible. In this view, language becomes the ultimate cogni- tive artifact system, and cognitive artifacts are absolutely Further Readings fundamental to human consciousness and what it means to Newell, A., and H. A. Simon. (1972). Human Problem Solving. be human. Englewood Cliffs, NJ: Prentice-Hall. One of the principal findings of studies of SITUATED Hinton, G. E., and J. A. Anderson. (1989). Parallel Models of COGNITION AND LEARNING is that people make opportunis- Associative Memory. Hillsdale, NJ: Erlbaum. tic use of structure. The method of loci in which an orator Pinker, S., and J. Mehler, Eds. (1988). Connections and Symbols. who must remember a speech associates elements of the Cambridge, MA: MIT Press. speech with architectural features of the place where the Rumelhart, D. E., J. L. McClelland, and the PDP Research Group, speech is delivered is a well-known example. Lave, Mur- Eds. (1986). Parallel Distributed Processing. Cambridge, MA: taugh, and de la Rocha (1984) examined the way that shop- MIT Press. pers made use of the structure of supermarkets. The layout Smolensky, P. (1988). On the proper treatment of connectionism. of the supermarket itself with the orderly arrangement of Behavioral and Brain Sciences 11: 1–23. items on the shelf is the ultimate icon of the shopping list. Regular shoppers develop routine trajectories through this Cognitive Artifacts space, thus creating a sequence of reminders of items to buy. Scribner (1984) documented the ways that dairy workers Cognitive artifacts are physical objects made by humans for take advantage of the layouts of standard diary product the purpose of aiding, enhancing, or improving cognition. cases in filling orders. 
Beach (1988) went to bartender’s Examples of cognitive artifacts include a string tied around school and learned how to use the shapes of drink glasses the finger as a reminder, a calendar, a shopping list, and a and their placement on the bar to encode the drinks in a computer. In the modern world, many cognitive artifacts multiple drink order. Hutchins (1995b) showed how airline rely on LITERACY and numeracy skills. Lists of various pilots take advantage of an incidental feature of the airspeed kinds support not only MEMORY, but also reasoning about indicator to identify +/–5 knot deviations from target speeds Cognitive Artifacts 127 by looking at the display in a particular way rather than by members of Western society know how to read, use a tele- calculating. Frake (1985) showed how medieval sailors in phone, drive a car, and so on. Conversely, the distribution of northern Europe used the structure of the compass card to knowledge in a community constrains technology. If every- “see” the times of high and low tides at major ports. In each one already knows how to do something with a particular of these cases people use designed objects in ways that were technology, an attempt to change or replace that technology not intended by the artifact’s designers. may meet resistance because learning is expensive. Sometimes even structures that are not made by humans There is no widespread consensus on how to bound the play the same role as cognitive artifacts. Micronesian navi- category “cognitive artifacts.” The prototypical cases seem gators can see the night sky as a 32-point compass that is clear, but the category is surrounded by gray areas consist- used to express courses between islands (Gladwin 1970; ing of mental and social artifacts, physical patterns that are Lewis 1972), and forms the foundation for a complex lay- not objects, and opportunistic practices. 
The cognitive arti- ered mental image that represents distance/rate/time prob- fact concept points not so much to a category of objects, as lems in analog form (Hutchins and Hinton 1984; Hutchins to a category of processes that produce cognitive effects by 1995a). The Micronesian navigator uses the night sky in the bringing functional skills into coordination with various same way that many manufactured navigational artifacts are kinds of structure. used. See also ARTIFACTS AND CIVILIZATION; HUMAN NAVIGA- There is a continuum from the case in which a cognitive TION; SITUATEDNESS/EMBEDDEDNESS artifact is used as designed, to cases of cognitive uses of —Edwin Hutchins artifacts that were made for other purposes, to completely opportunistic uses of natural structure. If one focuses on the products of cognitive activity, cog- References nitive artifacts seem to amplify human abilities. A calculator Beach, K. (1988). The role of external mnemonic symbols in seems to amplify my ability to do arithmetic, writing down acquiring an occupation. In M. M. Gruneberg, P. E. Morris, and something I want to remember seems to amplify my mem- R. N. Sykes, Eds., Practical Aspects of Memory: Current ory. Cole and Griffin (1980) point out that this is not quite Research and Issues, vol. 1. New York: Wiley. correct. When I remember something by writing it down Cole, M., and P. Griffin. (1980). Cultural amplifiers reconsidered. and reading it later, my memory has not been amplified. In D. R. Olson, Ed., The Social Foundations of Language and Rather, I am using a different set of functional skills to do Thought. New York: Norton, pp. 343–364. the memory task. Cognitive artifacts are involved in a pro- Frake, C. (1985). Cognitive maps of time and tide among medieval cess of organizing functional skills into functional systems. seafarers. Man 20: 254–270. Gladwin, T. (1970). East is a Big Bird. 
Cambridge, MA: Harvard University Press.

Computers are an especially interesting class of cognitive artifact. Their effects on cognition are in part produced via the reorganization of human cognitive functions, as is true of all other cognitive artifacts (Pea 1985). What sets computers apart is that they may also mimic certain aspects of human cognitive function. The complexity and power of the combination of these effects makes the study of HUMAN-COMPUTER INTERACTION both challenging and important.

While cognitive artifacts do not directly amplify or change cognitive abilities, there are side effects of artifact use. Functional skills that are frequently invoked in interaction with artifacts will tend to become highly developed, and those that are displaced by artifact use may atrophy.

Any particular cognitive artifact typically supports some tasks better than others. Some artifacts are tuned to very narrow contexts of use while others are quite general. The ones that are easy are easy because one can use very simple cognitive and perceptual routines in interaction with the technology in order to do the job (Norman 1987, 1993; Hutchins 1995a; Zhang 1992).

Cognitive artifacts are always embedded in larger sociocultural systems that organize the practices in which they are used. The utility of a cognitive artifact depends on other processes that create the conditions and exploit the consequences of its use. In culturally elaborated activities, partial solutions to frequently encountered problems are often crystallized in practices, in knowledge, in material artifacts, and in social arrangements.

Since artifacts require knowledge for use, the widespread presence of a technology affects what people know. Most

Goody, J. (1977). The Domestication of the Savage Mind. Cambridge: Cambridge University Press.
Hutchins, E. (1995a). Cognition in the Wild. Cambridge, MA: MIT Press.
Hutchins, E. (1995b). How a cockpit remembers its speeds. Cognitive Science 19: 265–288.
Hutchins, E., and G. E. Hinton. (1984). Why the islands move. Perception 13: 629–632.
Ifrah, G. (1987). From One to Zero. A Universal History of Numbers. Trans. L. Bair. New York: Penguin Books.
Kirsh, D. (1995). The intelligent use of space. Artificial Intelligence 72: 1–52.
Latour, B. (1986). Visualization and cognition: thinking with eyes and hands. Knowledge and Society: Studies in the Sociology of Culture Past and Present 6: 1–40.
Lave, J., M. Murtaugh, and O. de la Rocha. (1984). The dialectic of arithmetic in grocery shopping. In B. Rogoff and J. Lave, Eds., Everyday Cognition: Its Development in Social Context. Cambridge, MA: Harvard University Press, pp. 67–94.
Lewis, D. (1972). We the Navigators. Honolulu: University of Hawaii Press.
Norman, D. A. (1987). The Psychology of Everyday Things. New York: Basic Books.
Norman, D. A. (1993). Things That Make Us Smart. Reading, MA: Addison Wesley.
Pea, R. (1985). Beyond amplification: using computers to reorganize human mental functioning. Educational Psychologist 20: 167–182.
Rumelhart, D. E., P. Smolensky, J. L. McClelland, and G. E. Hinton. (1986). Schemata and sequential thought processes in PDP models. In J. L. McClelland, D. E. Rumelhart, and the PDP Group, Eds., Parallel Distributed Processing: Explorations in the Microstructure of Cognition. vol. 2: Psychological and Biological Models. Cambridge, MA: MIT Press, pp. 7–57.
Scribner, S. (1984). Studying working intelligence. In B. Rogoff and J. Lave, Eds., Everyday Cognition: Its Development in Social Context. Cambridge, MA: Harvard University Press, pp. 9–40.
Shore, B. (1996). Culture in Mind. New York: Oxford University Press.
Vygotsky, L. S. (1978). Mind and Society. Cambridge, MA: Harvard University Press.
Vygotsky, L. S. (1986). Thought and Language. Cambridge, MA: MIT Press.
Wertsch, J. V. (1985). Vygotsky and the Social Formation of Mind. Cambridge, MA: Harvard University Press.
Zhang, J. (1992). Distributed Representation: The Interaction Between Internal and External Information. Tech. Rep. no. 9201. La Jolla: University of California, San Diego, Department of Cognitive Science.

Cognitive Development

Students of cognitive development ask how our young are able to acquire and use knowledge about the physical, social, cultural, and mental worlds. Questions of special interest include: what is known when; whether what is known facilitates or serves as a barrier to the accurate interpretation and learning of new knowledge; how knowledge development serves and interacts with problem-solving strategies; and the relationship between initial knowledge levels and ones achieved for everyday as opposed to expert use.

Several classes of theories share the premise that infants lack abstract representational abilities and therefore CONCEPTS. The infant mind to an associationist is a passive “blank slate,” upon which a wash of sensations emanating from the world is recorded as a result of the associative capacity. Stage theorists need not share the no innate knowledge assumption. Interestingly, however, PIAGET, Bruner, and VYGOTSKY do, although to them infants are able to participate actively in the construction of their own cognitive development. For Piaget, neonates spontaneously practice their reflexes, the effect being the differentiation of inborn reflexes into different sensory-motor schemes. Active use of these yields integrated action schemes, and thus novel ways to act on the environment (Piaget 1970).

The information processing approach emphasizes the development of general processes like ATTENTION, short- and long-term memory, organization, and problem solving. The focus is on how learning and/or maturation overcome limits on information processing demands (Anderson 1995) and the development of successful PROBLEM SOLVING (Siegler 1997). Much attention is paid to how knowledge systems are acquired and circumvent these real-time limits on various processes. When it comes to the matter of what the newborn knows, the answer almost always is “nothing” (but see Mandler 1997). Thus, many cognitive development models are firmly grounded on associationist assumptions.

Reports of early conceptual competencies have encouraged the development of models that grant infants some knowledge to begin with. These include symbolic connectionist accounts that share much with associationist theories and modern instantiations of rationalism. In the latter case, humans are endowed with some innate ideas, modules, and/or domain-specific structures of mind. Possible candidates for innate endowments include implicit concepts about natural number, objects, and kinds of energy sources of animate and inanimate objects (Gelman and Williams 1997; Keil 1995; Pinker 1994). These kinds of models are learning models, ones built on the assumption that there is more than one learning mechanism. The idea is that there is a small set of domain-specific, computational learning devices, each with a unique structure and information processing system, that support and facilitate learning about the concepts of their domains. Gelman and Williams (1998) refer to these as skeletal-like, ready to assimilate and accommodate domain-relevant inputs, but very sketchy at first. Knowledge is not sitting in the head, ready to spring forth the moment the environment offers one bit of relevant data. But structures, no matter how nascent, function to support movement along a domain-relevant learning path by encouraging attention to and fostering storage of domain-relevant data.

For most associationist, stage, and information processing theorists, it takes a long time for newcomers to the world to develop concepts, because infants must first build up large memories of bits of sensory and response experiences; associate, connect, or integrate these in ways that represent things or events; associate, connect, or integrate the latter, and so on. The young mind’s progress toward conceptual understandings is slow, from reliance on the sensory, on to use of perceptions, and eventually to developing the wherewithal to form abstract concepts and engage in intelligent problem solving. This common commitment to the traditional view that concepts develop from the concrete or perceptual to the abstract plays out differently depending on whether one is a stage theorist or not. For a non-stage theorist, the march to the abstract level of knowing is linear and cumulative. For a stage theorist, the progress usually involves movement through qualitatively different mental structures that can assimilate and use inputs in new ways.

Many of the results of classification tasks used by Bruner, Piaget, and Vygotsky encourage the concrete-to-abstract characterization of cognitive development. Repeatedly, it is found that two- to six-year-olds do not use classification criteria in a consistent fashion. For example, when asked to put together objects that go together, one preschool child might make a train, another a long line of alternating colors, while another might focus on appearance as opposed to reality, and so on.

It is important to recognize that all theories of cognitive development grant infants some innate abilities. The abilities to receive punctate sensations of light, sound, or pressure, and so on, and form associations according to the laws of association (frequency and proximity) are foundational for associationists. Association between sensations and responses is the groundwork for knowledge of the world at a sensory and motor level. These in turn support knowledge acquisition at the perceptual level. Experiences at the perceptual level provide the opportunity for cross-modal associative learning and the eventual induction of abstract concepts that are not grounded on particular perceptual information. Although there are important differences in the foundational assumptions of the association and traditional stage accounts, their characterizations of an infant’s initial world are more similar than not. For example, associations are not Piaget’s fundamental units of cognition; sensori-motor schemes are. But to him, an infant’s initial knowledge is limited to innate reflexes and is combined with an inclination to actively use and adapt these as a result of repeated interactions with objects. This eventually leads to the development of intercoordinated schemes and movement to action-based representations that take the infant from an out-of-sight, out-of-mind stage to internalized representations, the mental building blocks of a world of three-dimensional objects in a three-dimensional space.

Piaget’s basic assumptions about the nature of the data that feed early development apply to other stage theorists. In general, what initially count as relevant inputs are simple motoric, sensory, or perceptual features. His emphasis is more on children’s active participation in their own cognitive development. Bruner and Vygotsky concentrate more on how others help the young child develop coherent knowledge about their social, cultural, and historical environments. Still, all concur that initial “concepts” are sensori-motor or perceptual in form and content; these are variously labeled as graphic collections, preconcepts, complexes, pseudoconcepts, and chain concepts. Thus, whether the account of the origins of knowledge is rooted in an associationist, information processing, or stage theory, the assumption is that first-order sense data, for example sensations of colored light, sound, pressure, etc., serve as the foundation upon which knowledge is developed. Principles or structures that organize the representations of concepts are a considerably advanced accomplishment, taking hold somewhere between five and seven years of age.

Those who embrace more rationalist accounts assume that the mind starts out with much more than the ability to sense and form associations or schemas about sensations and reflexes. Beginning learners have some skeletal structures with which to actively engage the environment; domain-relevant inputs are those that can be brought together and mapped to the existing mental structure. Put differently, skeletal mental structures are attuned to information in the environment at the level of structural universals, not the level of surface characteristics. Thus the nature of relevant data, even for beginning learners, can be rather abstract. It need not be bits of sensation or concrete. Young learners can have abstract concepts.

We now know that preschool-age children appeal to invisible entities to explain contamination, invoke internal or invisible causal forces to explain why objects move and stop, reason about the contrast between the insides and outsides of unfamiliar animals, choose strategies that are remarkably well suited to arithmetic problems, pretend that the same empty cup is first a full cup and then an empty cup, etc. Five-month-old infants respond in ways consistent with the beliefs that one solid object cannot pass through another solid object; an inanimate object cannot propel itself; and that mechanical and biomechanical motion are very different (see Wellman and S. Gelman 1997 for details and more examples).

With development, core knowledge systems can become extremely rich, whether or not formal schooling is available, so much so that the existing knowledge structure behaves like a barrier for learning new structures in the domain (Gelman 1993). For example, the intuitive belief that an inanimate object continues to move in a circle because it currently has such a trajectory is inconsistent with the theory of Newtonian mechanics. Yet the belief is held by many college students who have had physics courses. Similarly, our well-developed NAIVE MATHEMATICS, sometimes called “street mathematics” (Nunes, Schliemann, and Carraher 1993), makes it hard to learn school mathematics. In these cases, school lessons do not suffice to foster new understandings and kinds of expertise. Be they nativist or nonnativist in spirit, efforts to account for the course of cognitive development will have to incorporate this fact about the effect, or lack of effect, of experience on learning and concept acquisition.

See also DOMAIN SPECIFICITY; INFANT COGNITION; MODULARITY OF MIND; NATIVISM; RATIONALISM VS. EMPIRICISM

Rochel Gelman

References

Anderson, J. R. (1995). Learning and Memory: An Integrated Approach. New York: Wiley.
Gallistel, C. R., A. L. Brown, S. Carey, R. Gelman, and F. C. Keil. (1991). Lessons from animal learning for the study of cognitive development. In R. Gelman and S. Carey, Eds., The Epigenesis of Mind: Essays on Biology and Cognition. The Jean Piaget Symposium Series. Hillsdale, NJ: Erlbaum, pp. 3–36.
Gelman, R. (1993). A rational-constructivist account of early learning about numbers and objects. In D. Medin, Ed., Learning and Motivation. New York: Academic Press.
Gelman, R., and E. Williams. (1997). Enabling constraints on cognitive development. In D. Kuhn and R. Siegler, Eds., Cognition, Perception and Language. Vol. 2, Handbook of Child Psychology, 5th ed., W. Damon, Ed. New York: Wiley.
Keil, F. C. (1995). The growth of causal understandings of natural kinds. In D. Sperber, D. Premack, and A. J. Premack, Eds., Causal Cognition: A Multidisciplinary Debate. Symposia of the Fyssen Foundation. New York: Clarendon Press/Oxford University Press, pp. 234–267.
Mandler, J. M. (1997). Representation. In D. Kuhn and R. Siegler, Eds., Cognition, Perception and Language. Vol. 2, Handbook of Child Psychology, 5th ed., W. Damon, Ed. New York: Wiley, pp. 255–308.
Nunes, T., A. D. Schliemann, and D. W. Carraher. (1993). Street Mathematics and School Mathematics. New York: Cambridge University Press.
Piaget, J. (1970). Piaget’s theory. In P. H. Mussen, Ed., Carmichael’s Manual of Child Psychology. New York: Wiley.
Pinker, S. (1994). The Language Instinct. New York: HarperCollins.
Siegler, R. S. (1997). Emerging Minds: The Process of Change in Children’s Thinking. New York: Oxford University Press.
Wellman, H. M., and S. A. Gelman. (1997). Knowledge acquisition and foundational domains. In D. Kuhn and R. Siegler, Eds., Cognition, Perception and Language. Vol. 2, Handbook of Child Psychology, 5th ed., W. Damon, Ed. New York: Wiley, pp. 523–573.

Further Readings

Bruner, J. S., R. R. Olver, and P. M. Greenfield. (1966). Studies in Cognitive Growth. New York: Wiley.
Carey, S., and R. Gelman, Eds. (1991). The Epigenesis of Mind: Essays on Biology and Cognition. Hillsdale, NJ: Erlbaum.
Dehaene, S. (1997). The Number Sense: How the Mind Creates Mathematics. New York: Oxford University Press.
Gelman, R., and T. Au, Eds. (1996). Cognitive and Perceptual Development. vol. 12. Handbook of Perception and Cognition. E. Carterette and M. Friedman, Eds. Academic Press.
Karmiloff-Smith, A. (1992). Beyond Modularity: A Developmental Perspective on Cognitive Science. Cambridge, MA: MIT Press.
Simon, T. J., and G. S. Halford, Eds., Developing Cognitive Competence: New Approaches to Process Modeling. Hillsdale, NJ: Erlbaum.
Vygotsky, L. S. (1962). Thought and Language. Cambridge, MA: MIT Press.

Cognitive Dissonance

See DISSONANCE

Cognitive Ergonomics

Cognitive ergonomics is the study of cognition in the workplace with a view to design technologies, organizations, and learning environments. Cognitive ergonomics analyzes work in terms of cognitive representations and processes, and contributes to designing workplaces that elicit and support reliable, effective, and satisfactory cognitive processing. Cognitive ergonomics overlaps with related disciplines such as human factors, applied psychology, organizational studies, and HUMAN-COMPUTER INTERACTION.

The emergence of cognitive ergonomics as a unitary field of research and practice coincides with the rapid transformation of the workplace following the growth of complex technological environments and the widespread introduction of information technologies to automate work processes. The computerized workplace has generalized flexible and self-directed forms of work organization (Zuboff 1988; Winograd and Flores 1986). These transformations have raised a set of psychological, social, and technological issues: the development of competencies to master work processes through the new technology, the cognitive shift involved in the transition from controlling a process to monitoring automated systems (Bainbridge 1987; Sarter and Woods 1992), the acquisition of skills to use interactive tools, the transfer of knowledge and skills from the old to the new workplace. Technological innovation has also opened the possibility of improving human performance by aiding, expanding, and reorganizing human cognitive activities through the design of advanced tools, a challenge addressed by cognitive engineering and usability (Hollnagel and Woods 1983; Norman and Draper 1986; Nielsen 1993).

Cognitive ergonomics approaches these issues, on the one hand, by developing models of the knowledge structures and information-processing mechanisms that explain how individuals carry out their work tasks (doing PLANNING, PROBLEM SOLVING, DECISION-MAKING, using tools, and coordinating with other people); on the other, by developing methods for redefining the engineering process of workplace design. The two activities are tightly coupled because the integration of user needs and requirements in the design of systems and organizations is seen as the only possible answer for successful transformation of the workplace. User-centered design grounds the design process on general and domain-specific models of cognitive activity and is characterized by extensive investigation of user’s goals, tasks, job characteristics, and by continual iterations of design and user testing of solutions. User-centered design encompasses a variety of methods to collect and analyze data about tasks and context of use, and of techniques to test and measure interactions between users and computer systems (Helander 1988).

General models of cognitive work-oriented activity have been developed to account for the complexity of human behaviors produced in work situations (Rasmussen 1986), to explain erroneous action (Norman 1981; Reason 1990), and to conceptualize human-computer interaction (Card, Moran, and Newell 1983; Norman and Draper 1986). Rasmussen’s model of work activity distinguishes between automatic, automated, and deliberate behaviors, controlled respectively by skills, rules, and knowledge. Whereas skills and rules are activated in familiar situations, the elaboration of explicit understanding of the current situation and of a deliberate plan for action is necessary to deal with unfamiliar or unexpected situations. This framework has spurred several specific models of process control activities in, among others, fighter aircraft (Amalberti 1996), nuclear power plants (Woods, O’Brien, and Hanes 1987), steel plants (Hoc 1996), and surgical units (Cook and Woods 1994; De Keyser and Nyssen 1993).

The distinction between levels of cognitive control has also been the key to apprehend the reliability of the human component of the workplace. Reason’s model (1990) of human error identifies slips and lapses caused respectively by ATTENTION and WORKING MEMORY failures at the skill-level of control; rule-based and knowledge-based mistakes caused by the selection/application of the wrong rule or by inadequate knowledge; and violations that account for intentional breaches of operational procedures.

Norman (1986) formulates a general model of human-computer interaction in terms of a cycle of execution and evaluation. The user translates goals into intentions and into action sequences compatible with physical variables, and evaluates the system state with respect to initial goals after perceiving and interpreting the system state. Bridging what Hutchins, Hollan, and Norman (1986) have named the Gulf of Execution and the Gulf of Evaluation, interaction can be facilitated by providing input and output functions for the user interface that match more closely the psychological needs of the user, building in AFFORDANCES that constrain the interpretation of the system state, and by providing coherent and clear design models that support users in building mental models of the system.

The bulk of the research in cognitive ergonomics has been carried out on domain-specific work processes. Viewed from an organizational perspective, work processes can be decomposed into sets of tasks, of which explicit descriptions exist in the form of procedures. Task analysis methods combine operational procedure analysis and interviews with experts to describe the cognitive requirements, i.e., demands on MEMORY, attention, understanding, and coordination for realizing each step of the task (Diaper 1989). Leplat (1981) points out the gap that exists between normative accounts of work and actual practice, and argues for activity analysis of work to be carried out in the field, using techniques derived from ethnography. Viewed from an activity perspective, work processes involve problem setting, problem solving, troubleshooting, and collaborating. Roth and Woods (1988) see the work process as problem solving and combine a conceptual analysis of what is required to solve the problem, in terms of EXPERTISE, CAUSAL REASONING, DEDUCTIVE REASONING, decision making, and resource management, with empirical observation of how agents solve the problem in practice. Their study provides options for better meeting the cognitive demands of the task and gives rise to proposals for the design and development of new information displays that enhance agents’ ability to anticipate process behavior.

A new perspective on work is emerging from Hutchins’s research on distributed cognition. Hutchins (1995) takes the work process as problem solving that is dealt with by the workplace as a whole: a culturally organized setting, comprising individuals, organizational roles, procedures, tools, and practices. The cognitive processes necessary to carry out tasks are distributed between cognitive agents and COGNITIVE ARTIFACTS. Hutchins (1991) shows, for instance, that aircraft are flown by a cockpit system that includes pilots, procedures, manuals, and instruments. This view, while keeping within the information processing paradigm of cognition, recognizes fully the social and cultural dimensions of the workplace, counteracting a tendency to overestimate the cognitive processes at the expense of environmental, organizational, and contextual factors.

See also AI AND EDUCATION; COGNITIVE ANTHROPOLOGY; HUMAN NAVIGATION

Francesco Cara

References

Amalberti, R. (1996). La Conduite des Systèmes à Risque. Paris: Presses Universitaires de France.
Bainbridge, L. (1987). Ironies of automation. In J. Rasmussen, K. D. Duncan, and J. Leplat, Eds., New Technology and Human Error. Chichester: Wiley, pp. 271–284.
Card, S. K., T. P. Moran, and A. Newell. (1983). The Psychology of Human-Computer Interaction. Hillsdale, NJ: Erlbaum.
Carroll, J. M., Ed. (1991). Designing Interaction: Psychology at the Human-Computer Interface. New York: Cambridge University Press.
Cook, R. I., and D. D. Woods. (1994). Operating at the sharp end: the complexity of human error. In M. S. Bogner, Ed., Human Error in Medicine. Hillsdale, NJ: Erlbaum.
De Keyser, V., and A. S. Nyssen. (1993). Les erreurs humaines en anesthésie. Le Travail Humain 56: 243–266.
Diaper, D., Ed. (1989). Task Analysis for Human-Computer Interaction. New York: Wiley.
Helander, M., Ed. (1988). Handbook of Human-Computer Interaction. New York: Elsevier.
Hollnagel, E., and D. D. Woods. (1983). Cognitive systems engineering: new wine in new bottles. International Journal of Man-Machine Studies 18: 583–600.
Hoc, J. M. (1996). Supervision et Contrôle de Processus: la Cognition en Situation Dynamique. Grenoble: Presses Universitaires de Grenoble.
Hutchins, E. (1991). Distributed cognition in an airline cockpit. In Y. Engstrom and D. Middleton, Eds., Communication and Cognition at Work. New York: Cambridge University Press.
Hutchins, E. (1995). Cognition in the Wild. Cambridge, MA: MIT Press.
Hutchins, E., J. Hollan, and D. A. Norman. (1986). Direct Manipulation Interfaces. In D. A. Norman and S. Draper, Eds., User Centered System Design: New Perspectives in Human-Computer Interaction. Hillsdale, NJ: Erlbaum.
Leplat, J. (1981). Task analysis and activity analysis in field diagnosis. In J. Rasmussen and W. B. Rouse, Eds., Human Detection and Diagnosis of Systems Failure. New York: Plenum Press.
Nielsen, J. (1993). Usability Engineering. Boston, MA: Academic Press.
Norman, D. A. (1981). Categorization of action slips. Psychological Review 88: 1–15.
Norman, D. A. (1986). Cognitive engineering. In D. A. Norman and S. Draper, Eds., User Centered System Design: New Perspectives in Human-Computer Interaction. Hillsdale, NJ: Erlbaum.
Norman, D. A., and S. Draper, Eds. (1986). User Centered System Design: New Perspectives in Human-Computer Interaction. Hillsdale, NJ: Erlbaum.
Rasmussen, J. (1986). Information Processing and Human-Machine Interaction. Amsterdam: Elsevier.
Reason, J. (1990). Human Error. Cambridge: Cambridge University Press.
Roth, M., and D. D. Woods. (1988). Aiding human performance I: cognitive analysis. Le Travail Humain 51: 39–64.
Sarter, N., and D. D. Woods. (1992). Pilot interaction with cockpit automation: operational experiences with the Flight Management System. International Journal of Aviation Psychology 2: 303–321.
Winograd, T., and F. Flores. (1986). Understanding Computers and Cognition: A New Foundation for Design. Norwood, NJ: Ablex Corp.
Woods, D. D., J. O’Brien, and L. F. Hanes. (1987). Human factors’ challenges in process-control: the case of nuclear power plants. In G. Salvendy, Ed., Handbook of Human Factors/Ergonomics. New York: Wiley.
Zuboff, S. (1988). In the Age of the Smart Machine: The Future of Work and Power. Basic Books.

Further Readings

Button, G., Ed. (1993). Technology in Working Order. London: Routledge.
Carroll, J. M. (1997). Human-computer interaction: psychology as a science of design. Annual Review of Psychology 48: 61–83.
Carroll, J. M., Ed. (1987). Interfacing Thought: Cognitive Aspects of Human-Computer Interaction. Cambridge: Cambridge University Press.
Engestrom, Y., and D. Middleton, Eds. (1991). Communication and Cognition at Work. New York: Cambridge University Press.
Gallagher, J., R. Krant, and C. Egido, Eds. (1990). Intellectual Teamwork. Hillsdale, NJ: Erlbaum.
Greenbaum, J., and M. Kying, Eds. (1991). Design at Work: Cooperative Design of Computer Systems. Hillsdale, NJ: Erlbaum.
Hollnagel, E., G. Mancini, and D. D. Woods, Eds. (1986). Intelligent Decision Support in Process Environments. New York: Springer-Verlag.
Norman, D. A. (1987). The Psychology of Everyday Things. Basic Books.
Norman, D. A. (1993). Things that Make Us Smart. Reading, MA: Addison-Wesley.
Shneiderman, B. (1982). Designing the User Interface: Strategies for Effective Human-Computer Interaction. Reading, MA: Addison-Wesley.
Suchman, L. A., Ed. (1995). Special section on representations of work. Communications of ACM 38: 33–68.
Woods, D. D., and M. Roth. (1988). Aiding human performance II: from cognitive analysis to support systems. Le Travail Humain 51: 139–159.

Cognitive Ethology

Cognitive ethology has been defined as the study of the mental experiences of animals, particularly in their natural environment, in the course of their daily lives. Data are derived from the observation of naturally occurring behavior as well as from experimental investigations conducted in the laboratory and in the field. By emphasizing naturally occurring behaviors, cognitive ethologists recognize that the problems faced in finding food and mates, rearing young, avoiding predators, creating shelters, and communicating and engaging in social interactions may require considerable cognitive skills, possibly more complex than and different from those usually examined in traditional psychological laboratory studies. The term “mental experiences” acknowledges that the mental capabilities of animals, in addition to unconscious mental processes, may also include conscious states. This affords the animals sensory experiences, pleasure and pain, the use of mental imagery, and involves at least simple intentional states such as wanting and believing.

Thus, broadly described, the subject of research in cognitive ethology includes any mental experiences and processes, including studies of habituation and sensitization, learning and memory, problem solving, perception, decision making, natural communication, and the artificial languages taught to apes, dolphins, and parrots (reviewed by Ristau and Robbins 1982; Ristau 1997). More narrowly defined, cognitive ethology emphasizes interest in possibly conscious animal mental experiences, particularly as occurring in communication and other social interactions and studies of intention. The field can also be construed as an ethological approach to cognition.

Although the field of cognitive ethology has roots in COMPARATIVE PSYCHOLOGY, ETHOLOGY, and studies of animal learning and memory, it traces its birth to the publication in 1976 of The Question of Animal Awareness by the biologist Donald Griffin (see also Griffin 1992).

What is the rationale for the field of cognitive ethology? As conceived by Griffin (1976), there are several reasons for considering that animals may think and be aware: (i) Argument from evolutionary continuity: animals share so many similarities in structure, process, and behavior with humans, why should we not expect them to share mental experience as well? (ii) Neurophysiological evidence: as far as can be determined, nonhuman animals share many neuroanatomical structures, including kinds of neurotransmitters, neurons, synaptic connections, and nerve impulses. We agree that humans are conscious, but, as yet, know of no special structure or process type that is responsible for consciousness or awareness, nor of any neurophysiological process that is uniquely human. (iii) Behavioral complexity and flexibility, particularly in the face of obstacles to the achievement of a goal, at least suggest that the organism may be thinking about alternative ways to behave. (iv) Communication as a window on animal minds: an animal’s attention and response to the species’ communicative signals may suggest something of the mental experience and likely behaviors of the communicating organism.

Probably the most significant methodological characteristics of research in cognitive ethology is approximating important aspects of the natural conditions of the species under investigation and making very fine-grained analyses of behavior, usually necessitating videotaped data collection. Many experiments are conducted in the natural environment: (i) Ristau’s (1991) studies of piping plovers’ use of “injury feigning” in staged encounters with intruders. (ii) Observations of the social behavior of freely interacting organisms, for example Bekoff’s (1995a) study of metacommunicative signals in play between two pet dogs. The dogs tended to use signals indicating “this is play” to initiate play bouts and after rough encounters possibly misinterpretable as aggressive (see SOCIAL PLAY BEHAVIOR). (iii) Structurally altered environmental space, for example Bekoff’s (1995b) study of vigilance behavior as influenced by the geometry of birds’ spatial arrangement at a feeder. Birds spent less time being vigilant for predators when perched in a circle and easily able to determine whether others were vigilant than when they were standing along a line so that such visibility was impossible. (iv) Acoustic playback of recorded species-typic communication signals, for example Seyfarth, Cheney, and Marler’s (1980) playbacks to vervet monkeys. The vervets apparently can signal semantic information, not simply emotional states. The monkeys have distinct acoustic signals for a ground predator such as a leopard, a specific aerial predator, the martial eagle, and for snakes (see SOCIAL COGNITION IN ANIMALS).

Cognitive ethology studies differ in interpretation from more traditional approaches. Many ethologists and comparative psychologists have studied foraging behavior, but the cognitive ethologist is interested in different aspects of the data and draws different conclusions. The traditional ethologist might produce an ethogram of an individual animal’s or species’ behavior, namely a time budget of the various activities in which the animal(s) engage. A comparative psychologist might study the animals under laboratory conditions with strict stimulus control, striving to determine the stimuli apparently controlling the organism’s behavior. In these various situations, the cognitive ethologists would be particularly interested in the cognitive capacities revealed, particularly as they might relate to problems encountered by the organism in its natural environment.

Cognitive ethological studies sometimes differ from other research in their scientific paradigm. A typical learning paradigm as used by experimental psychologists might test an animal’s discriminative abilities and specificity, duration and stimulus control of memories, but the procedures rely on standard protocols with many trials. Learning over many trials is not the same capacity as intelligence, though it is necessary for intelligence. Intelligent behavior, of particular interest to cognitive ethologists, is most likely to be revealed when an organism encounters a novel problem, particularly one for which the organism is not specifically prepared by evolution. But one or few occurrences, at least in the past, have been pejoratively termed anecdotal and excluded from the scientific literature. Science can progress by including such observations, particularly from well-trained scientists who, with years of study, often have unique opportunities to observe animals in unusual circumstances. If possible, one attempts to design experiments that replicate the rare occurrence.

Since an organism’s most interesting actions may not be predictable by the researcher, many investigations in cognitive ethology cannot proceed with predetermined checklists of behavior. Likewise the significance of given behaviors may not be apparent at the time of observation, but may later be understood in terms of a broader context, often extending over days, weeks, or years. These possibilities again require extensive use of video and careful notation of details. For example, a parent piping plover who hears an acoustic playback of a chick’s screech and then searches unsuccessfully for the chick, sometimes next searches an area where a playback had been conducted days earlier (Ristau, pers. obs.).

A cognitive ethological approach is illustrated in experimental field studies of piping plovers’ injury feigning or broken wing display (BWD), a behavior performed when an intruder/predator approaches the nest or young (Ristau 1991). The hypothesis to be evaluated was: The piping plover wants to lead the predator/intruder away from the nest/young. Alternatively, the use of BWDs might be viewed as the result of confusion or conflicting motivations or as a fixed action pattern with no voluntary control. Theories of how one might experimentally assess the plover’s purpose drew inspiration from work such as that of the psychologist Tolman (1932), the philosopher Dennett (1983), and artificial intelligence researcher Boden (1983). Videotaped experiments were conducted on beaches where plovers nested.

2. Does a displaying bird monitor the intruder’s behavior? Though difficult to determine exactly what the plover was monitoring, often in the midst of a display, the bird turned its head sharply back over its shoulder, eye toward the intruder.
3. Does the bird modify its behavior if its goal is not being achieved, specifically if the intruder does not follow the displaying bird? Again, yes. The plover reapproached the intruder or increased the intensity of its display if the intruder was not following, but did not do so if the intruder kept following the display.

Other experiments determined that the plover was sensitive to the direction of eye gaze of an intruder either toward or away from its nest. In others, the plover learned very rapidly to discriminate which intruders were dangerous, i.e., had approached closely to its nest, and which were not, those that had simply walked past the nest, rather far away.

The field of cognitive ethology has engendered considerable controversy (for example, see Mason 1976). Some experimental and comparative psychologists dissociate themselves from the endeavor and avoid use of mentalistic phrases such as “the organism wants to do X.” However, in designing and interpreting the research, experimental psychologists have at least implicit assumptions about the needs and intentions of their animal subjects. The lab rat must want the food pellet and thus performs the experimenter’s task to get it. In a test of CER (conditioned emotional responding), the rat inhibits bar pressing in the presence of a stimulus associated with shock. We are interested in the results only because we assume the rat is experiencing an emotional state related to human fear. Even the description of an organism’s action contains implied attributions of intentions. Insofar as a rat is described as reaching for a pellet or walking to the water, a goal or motive is implied. To avoid the for or to makes the description a mere delineation of changes of position in space and loses the significance of the event. In similar vein, the experiments that studiously avoid mentalistic phrases, upon closer examination, entail assumptions about animal states and intentions.

One can only hope that the common concerns of the related disciplines will be recognized by researchers. It is probable that philosophers would find useful, directive constraints on their thinking from the data on animal cognition, and that the psychologists and biologists would better appreciate the contributions of philosophers that can help
Human intruders approached the nest or young in clarify concepts and reveal conceptual complexities in an directions unpredictable to the birds, so as to approximate experiment. the behavior of an actual predator. Questions asked were the See also ANIMAL COMMUNICATION; ANIMAL NAVIGA- following: TION; CONDITIONING; EMOTION AND THE ANIMAL BRAIN; INTENTIONAL STANCE; INTENTIONALITY 1. Is the bird’s behavior appropriate to achieve the pre- sumed goal? Specifically, does the plover move in a —Carolyn A. Ristau direction that would cause an intruder who was follow- ing to be further away from the nest/young? Answer: References Yes. Furthermore, the plover displayed a behavior that might be expected if the bird were trying to attract the Bekoff, M. (1995a). Play signals as punctuation: the structure of intruder’s attention and thus was selective about where it social play in canids. Behaviour 132: 419–429. displayed. The bird positioned itself before displaying, Bekoff, M. (1995b). Vigilance, flock size, and flock geometry: often even flying to a new location. Such new locations information gathering by western evening grosbeaks (Aves, were closer to the intruder and usually closer to the cen- fringillidae). Ethology 99: 150–161. ter of the intruder’s visual field/direction of movement as Bekoff, M., and Jamieson, D., Ed. (1996). Readings in Animal well. Cognition. Cambridge, MA: MIT Press. 134 Cognitive Linguistics plates in which verbs are embedded contribute to determin- Boden, M. (1983). Artificial intelligence and animal psychology. New Ideas in Psychology 1: 11–33. ing argument structure and MEANING (Goldberg 1995). Dennett, D. C. (1983). Intentional systems in cognitive ethology: Mental spaces theory (Fauconnier 1985) focuses on the sub- the panglossian paradigm defended. Behavioral and Brain Sci- tle relationships among elements in the various mental mod- ences 6: 343–390. els that speakers construct, relationships that underlie a vast Dennett, D. C. (1996). 
Kinds of Minds: Towards an Understanding array of semantic phenomena such as scope ambiguities, of Consciousness. New York: Basic Books. negation, counterfactuals, and opacity effects. Griffin, D. R. (1976). The Question of Animal Awareness: Evolu- A number of cognitive linguists are exploring the central tionary Continuity of Mental Experience. New York: Rock- role of METAPHOR in SEMANTICS and cognition (Lakoff and efeller University Press. Johnson 1980), while others focus on the specific relation- Griffin, D. R. (1992). Animal Minds. Chicago: University of Chi- ships between general cognitive faculties and language cago Press. Mason, W. (1976). Windows on other minds. Book review of The (Talmy 1988). Though not self-identified as a cognitive the- Question of Animal Awareness by Donald R. Griffin. Science ory, Chafe’s (1994) information flow theory is certainly 194: 930–931. compatible with the cognitive linguistic paradigm, as it Ristau, C. A. (1991). Aspects of the cognitive ethology of an explores the relationship between DISCOURSE units and injury-feigning bird, the piping plover. In Ristau, C. A., Ed., information units in mental processing. Cognitive Ethology: The Minds of Other Animals. Hillside, NJ: Arguably the most comprehensive theoretical frame- Erlbaum, pp. 91–126. work within this area is cognitive grammar (Langacker Seyfarth, R. M., D. L. Cheney, and P. Marler. (1980). Vervet mon- 1987, 1991). The goal of cognitive grammar is to develop a key alarm calls: evidence for predator classification and seman- theory of grammar and meaning that is cognitively plausi- tic communication. Animal Behaviour 28: 1070–1094. ble and that adheres to a kind of theoretical austerity Tolman, E. C. (1932). Purposive Behavior in Animals and Men. New York: Appleton-Century. 
summed up in the content requirement (Langacker 1987): the only structures permitted in the grammar of a language Further Readings are (1) phonological, semantic, or symbolic structures that actually occur in linguistic expressions (where a symbolic Allen, C., and M. Bekoff. (1997). Species of Mind: The Philosophy structure is a phonological structure paired with a semantic and Biology of Cognitive Ethology. Cambridge, MA: MIT structure, essentially a Saussurean “sign”); (2) schemas for Press. such structures, where a schema is a kind of pattern or tem- Beer, C. (1992). Conceptual Issues in Cognitive Ethology. plate that a speaker acquires through exposure to multiple Advances in the Study of Behavior 21: 69–109. exemplars of the pattern; and (3) categorizing relationships Jamieson, D., and M. Bekoff. (1993). On aims and methods of cognitive ethology. Philosophy of Science Association 2: 110– among the elements in (1) and (2) (for example, categoriz- 124. ing the syllable /pap/ as an instantiation of the schematic Lloyd, D. (1989). Small Minds. Cambridge, MA: MIT Press. pattern /CVC/). Ristau, C. A. (1997). Animal language and cognition research. In While not all cognitive linguists adopt the content A. Lock and C. R. Peters, Eds., The Handbook of Human Sym- requirement specifically, the principle of eschewing highly bolic Evolution. London: Oxford University Press, pp. 644–685. abstract theoretical constructs is endorsed by the cognitive Ristau, C. A., and D. Robbins. (1982). Language in the great apes: linguistic movement as a whole. a critical review. In J. S. Rosenblatt, R. A. Hinde, C. Beer, and The published work in cognitive grammar has focused on M.-C. Busnel, Eds., Advances in the Study of Behavior, vol. 12. developing a fundamental vocabulary for linguistic analysis New York: Academic Press, pp. 142–255. 
in which grammatical constructs are defined in terms of Cognitive Linguistics semantic notions, which are themselves characterized in terms of cognitive abilities that are well attested even out- side of the linguistic domain. Basic grammatical categories Cognitive linguistics is not a single theory but is rather best (noun, verb, subject, etc.) and constructions are given cogni- characterized as a paradigm within linguistics, subsuming a tive semantic characterizations, obviating the need for number of distinct theories and research programs. It is abstract syntactic features and representations. characterized by an emphasis on explicating the intimate The possibility of defining grammatical notions in seman- interrelationship between language and other cognitive fac- tic terms depends upon taking into account construal. Con- ulties. Cognitive linguistics began in the 1970s, and since strual refers to the way in which a speaker mentally shapes the mid-1980s has expanded to include research across the and structures the semantic content of an expression: the rel- full range of subject areas within linguistics: syntax, seman- ative prominence of elements, their schematicity or specific- tics, phonology, discourse, etc. The International Cognitive ity, and the point of view adopted. Many grammatical Linguistics Association holds bi-annual meetings and pub- distinctions that had been traditionally considered to be lishes the journal Cognitive Linguistics. “meaningless” (i.e., lacking in stable semantic content) are The various theoretical frameworks within the cognitive found instead to be markers of subtle distinctions in con- paradigm are to a large degree mutually compatible, differ- strual. The subject/object distinction, for example, is defined ing most obviously in the kinds of phenomena they are as a linguistic correlate of the more general ability to per- designed to address. Construction grammar advances the ceive figure versus ground. 
Active and passive versions of hypothesis that grammatical constructions are linguistic the “same sentence” therefore do not mean the same thing, units in their own right, and that the constructional tem- even if they have identical truth conditions; they differ in fig- Cognitive Maps 135 ure/ground organization. The semantic definitions of gram- Goldberg, A. (1995). Constructions. Chicago: University of Chi- cago Press. matical categories similarly rely upon construal, although the Haiman, J. (1980). Dictionaries and encyclopedias. Lingua 50: details would require a great deal more discussion. 329–357. There are several key principles uniting the various theo- Lakoff, G. (1987). Women, Fire and Dangerous Things. Chicago: ries and approaches within the cognitive linguistic para- University of Chicago Press. digm. Lakoff, G., and M. Johnson. (1980). Metaphors We Live By. Chi- cago: University of Chicago Press. 1. Conceptual (subjectivist) semantics. Meaning is charac- Langacker, R. W. (1987). Foundations of Cognitive Grammar, vol. terized as conceptualization: The meaning of an expres- 1: Theoretical Prerequisites. Stanford: Stanford University sion is the concepts that are activated in the speaker or Press. hearer's mind. In this view, meaning is characterized as Langacker, R. W. (1991). Foundations of Cognitive Grammar, vol. involving a relationship between words and the mind, 2: Descriptive Application. Stanford: Stanford University Press. not directly between words and the world (cf. INDIVIDU- Talmy, L. (1988). Force dynamics in language and cognition. Cog- ALISM). nitive Science 12 (1): 49–100. 2. Encyclopedic as opposed to dictionary semantics Taylor, J. (1989). Linguistic Categorization: Prototypes in Linguis- (Haiman 1980). Words and larger expressions are tic Theory. Oxford: Oxford University Press. viewed as entry points for accessing open-ended knowl- edge networks. 
Fully explicating the meaning of an Cognitive Maps expression frequently requires taking into account imag- ery (both visual and nonvisual), metaphorical associa- tions, mental models, and folk understandings of the Edward Tolman is credited with the introduction of the term world. Thus, the meaning of a word is generally not cap- cognitive map, using it in the title of his classic paper turable by means of a discrete dictionary-like definition. 3. Structured categories. Categories are not defined in (1948). He described experiments in which rats were trained terms of criterial-attribute models or membership deter- to follow a complex path involving numbers of turns and mined by necessary and sufficient features (Lakoff 1987; changes of direction to get to a food box. Subsequently in a Taylor 1989). Rather categories are organized around test situation the trained path was blocked off and a variety prototypes, family resemblances, and subjective relation- of alternative paths provided. A large majority of the rats ships between items in the category. chose a path that headed very close to the true direct direc- 4. Gradient grammaticality judgments. Grammaticality tion of the food box, and not one that was close to the origi- judgments involve a kind of categorization, inasmuch as nal direction of the path on which they had been trained. On a speaker judges an utterance to be a more or less accept- the basis of such data Tolman argued that the rat had able exemplar of an established linguistic pattern. Gram- “acquired not merely a strip-map . . . but, rather, a wider maticality judgments are therefore gradient rather than binary, and depend upon subtleties of context and mean- comprehensive map to the effect that the food was located in ing as well as conformity to grammatical conventions. such and such a direction in the room” (p. 204). 
Cognitive linguists in general do not endorse the goal of This concept of cognitive map has elicited considerable writing a grammar that will generate all and only the interest over the years since. Tolman’s results seem to imply grammatical sentences of a language (see GENERATIVE that animals or humans go beyond the information given GRAMMAR), as the gradience, variability, and context- when they go directly to a goal after having learned an indi- dependence of grammaticality judgments render such a rect path. That conclusion is strongest when the spatial cues goal unrealizable. marking the goal location are not visible from the starting 5. Intimate interrelationship of language and other cognitive position. Rieser, Guth, and Hill (1986) reported a compel- faculties. Cognitive linguists search for analogues for lin- ling experimental example of such behavior. Blindfolded guistic phenomena in general cognition (hence the name “cognitive” linguistics). Findings from psychology about observers learned the layout of objects in a room. They were the nature of human CATEGORIZATION, ATTENTION, MEM- trained specifically, and only, to walk without vision from a ORY, etc., are taken to inform linguistic theory directly. home base to each of three locations. Naturally they were 6. The nonautonomy of syntax. SYNTAX is understood to be also able to point from home base to the three locations the conventional patterns by which sounds (or signs) con- quickly and accurately without vision. Furthermore, when vey meanings. Syntax therefore does not require its own they walked to any one of the locations, still without vision, special primitives and theoretical constructs. Grammati- they could point as rapidly and almost as accurately to the cal knowledge is captured by positing conventionalized other two locations as they had from home base. 
As with or “entrenched” symbolic patterns that speakers acquire Tolman’s rats these observers seemed to have acquired through exposure to actually occurring expressions. knowledge of many of the spatial relations in a complex See alsoAMBIGUITY; FIGURATIVE LANGUAGE; LINGUIS- spatial layout from direct experience with only a small sub- TIC UNIVERSALS AND UNIVERSAL GRAMMAR set of those relations. In that sense they had acquired a cog- nitive map. More generally, two different criteria are often —Karen van Hoek used to attribute to people maplike organization of spatial knowledge. One, as in the previous example, is when spatial References inferences about the direction and distances among loca- tions can be made without direct experience. The other is Chafe, W. (1994). Discourse, Consciousness and Time. Chicago: when it is possible to take mentally a different perspective University of Chicago Press. on an entire spatial layout. This can be done by imagining Fauconnier, G. (1985). Mental Spaces. Cambridge, MA: MIT Press. 136 Cognitive Maps oneself in a different position with respect to a layout but ignored features on the left side. They were then asked (Hintzman, Odell, and Arndt 1981). to imagine viewing it from the other end. They now reported From where do such cognitive maps arise? One answer it including originally missing features that were now on the involves the specific kinds of experience one has with the right side, omitting originally reported features now on the particular spatial layout in question. For example, neglected left side. Thorndyke and Hayes-Roth (1982) compared observers’ A number of areas in the brain have been implicated in organization of spatial knowledge after studying a map of a spatial information processing by SINGLE-NEURON large building and after experience in actually navigating RECORDING studies in animals. Besides the posterior pari- around the building. 
They found that map experience led to etal cortex, the HIPPOCAMPUS has been found to play a par- more accurate estimation of the straight line or Euclidean ticularly important role. Indeed, O’keefe and Nadel (1978) distances between locations and understandably to better authored a book entitled The Hippocampus as Cognitive map drawing, whereas actual locomotion around the build- Map. Early spatially relevant single cell recording research ing led to more accurate route distance estimation and to resulted in the exciting discovery of “place” cells. These more accurate judgments of the actual direction of locations cells fire selectively for particular locations in a spatial from station points within the building. A second answer environment. However, place cells by themselves seem to involves the nature of one’s experience with spatial layouts reflect place recognition but not necessarily information in general. Thus congenitally blind observers who have about how to get from one place to another (especially if it never had visual experience show less maplike organization is out of sight). An analysis by McNaughton, Knierim, and of their spatial knowledge than do sighted ones (or at least Wilson (1994) suggests a vector subtraction model that take longer to develop such organization). Why might this could solve the wayfinding problem. In their prototypic be? One hypothesis (Rieser et al. 1995) is that sighted per- situation, a kind of detour behavior, an animal knows the sons have prolonged experience with optical flow patterns distance and direction from its home to landmark A. On as they move about during their lives. The information from some occasion it finds itself at an unknown landmark C, this optical stimulation specifies how the distance and direc- from which A is visible but not its home. Their model sug- tion of locations with respect to the observer change as they gests how hippocampus place cells in conjunction with move. 
Sighted persons use that knowledge to keep track of distance and heading information are sufficient to generate even out-of-sight locations as they move. Blind persons a straight line path to its home. Heading information is without the optical flow experience do not do this as well, potentially available from integration of vestibular stimu- and, of course, all their locomotion involves moving with lation, and there are a variety of visual sources for distance respect to out-of-sight locations. information. It seems clear that the absence of visual experience puts Spatial analogies have frequently been attractive ways of one at a disadvantage in developing cognitive maps of spa- describing nonspatial domains. Examples include kinship tial layout. When is that visual experience important? The relations, bureaucratic organizations, statistical analyses, literature comparing early and late blinded observers in spa- color perception, etc. An intriguing possibility is that our tial perception and cognition tasks often reports better per- spatial thinking and the idea of cognitive maps can apply to formance in the late blind (Warren 1984; Warren, other domains that are easily described in spatial terms. It is Anooshian, and Bollinger 1973). However, the age bound- possible to think of cognitive maps of data bases—and aries are very fuzzy because the ages used vary from study indeed the term cognitive map is often being used more and to study and because the availability of blind participants of more metaphorically. It is an empirical question with impor- specific ages is rather limited. With sighted participants, tant practical and theoretical implications to know how well maplike organization is evident at very early ages ranging the underlying spatial cognition transfers to such nonspatial from two years (Pick 1993) to six or seven years (Hazen, domains. Lockman, and Pick 1978), depending on the situation. 
See also ANIMAL NAVIGATION; ANIMAL NAVIGATION, Another kind of answer to the origin of maplike organi- NEURAL NETWORKS; HUMAN NAVIGATION; SPATIAL PERCEP- zation of spatial knowledge comes from the considerable TION research on underlying brain mechanisms. This research has —Herbert Pick, Jr. been particularly concerned with where and how spatial information is represented in the brain and how, neurologi- References cally, orientation is maintained with respect to spatial lay- outs. Lesion studies and studies of single cell recording Bisiach, E., and C. Luzzatti. (1978). Unilateral neglect of represen- have been among the most informative. Many human brain tational space. Cortex 14: 129–133. Hazen, N. L., J. J. Lockman, and H. L. Pick, Jr. (1978). The devel- damage studies have been concerned with VISUAL NEGLECT, opment of children’s representation of large–scale environ- a deficit in which part of a visual stimulus is ignored. Such ments. Child Development 49: 623–636. deficits have often been associated with parietal lobe dam- Hintzman, D. L., C. S. O’Dell, and D. R. Arndt. (1981). Orienta- age, and the neglect is of the visual space contralateral to the tion and cognitive maps. Cognitive Psychology 13: 149–206. damage. This neglect apparently operates in memory as McNaughton, B. L., J. J. Knierman, and M. A. Wilson. (1994). well as during perception. One example particularly rele- Vector encoding and the vestibular foundations of spatial cog- vant to cognitive mapping has been reported by Bisiach and nition: neurophysiological and computational mechanisms. In Luzzatti (1978). Patients were asked to describe a familiar M. S. Gazzaniga, Ed., The Cognitive Neurosciences. Cam- urban scene, when viewed from one end. They described it, bridge, MA: MIT Press. Cognitive Modeling, Connectionist 137 O’keefe, J., and L. Nadel. (1978). The Hippocampus as Cognitive Map. Oxford: Oxford University Press. Pick, H. L. Jr. (1993). Organization of spatial knowledge in chil- dren. 
In N. Eilan, R. McCarthy, and B. Brewer, Eds., Spatial Representation. Oxford: Blackwell Publishers, pp. 31–42. Rieser, J. J., D. A. Guth, and E. W. Hill. (1988). Sensitivity to perspective structure while walking without vision. Perception 15: 173–188. Rieser, J. J., H. L. Pick, Jr., D. H. Ashmead, and A. E. Garing. (1995). Calibration of human locomotion and models of per- ceptual–motor organization. Journal of Experimental Psychol- ogy: Human Perception and Performance 21: 480–497. Samsonovitch, A., and B. L. McNaughton. (1997). Path integration and cognitive mapping in a continuous attractor neural network model. Journal of Neuroscience 17: 5900–5920. Thorndyke, P. W., and B. Hayes-Roth. (1982). Differences in spa- tial knowledge acquired from maps and navigation. Cognitive Psychology 14: 560–589. Tolman, E. C. (1948). Cognitive maps in rats and man. Psycholog- ical Review 55: 189–208. Warren, D. H. (1984). Blindness and Early Childhood Develop- Figure 1. A picture illustrating how a large number of very partial ment. New York: American Foundation for the Blind. and ambiguous clues may lead to a perception of an object in a Warren, D. H., L. J. Anooshian, and J. G. Bollinger. (1973). Early scene. Reprinted with permission from Figure 3-1 of Marr (1982). vs. late blindness: the role of early vision in spatial behavior. American Foundation for the Blind Research Bulletin 26: 151– for further discussion, and and COGNI- PROBLEM SOLVING 170. TIVE MODELING, SYMBOLIC for other perspectives. Connectionist models—often called parallel distributed Further Readings processing models or NEURAL NETWORKS—begin with the assumption that natural cognition takes place through the Kitchin, R. M. (1994). Cognitive maps: what are they and why interactions of large numbers of simple processing units. study them? Journal of Environmental Psychology 14: 1–19. Inspiration for this approach comes from the fact that the Siegel, A. W., and S. H. White. (1975). 
The development of spatial representation of large-scale environments. In H. W. Reese, brain appears to consist of vast numbers of such units—neu- Ed., Advances in Child Development and Behavior, vol 10. rons. While connectionists often seek to capture putative prin- New York: Academic Press. ciples of neural COMPUTATION in their models, the units in an actual connectionist simulation model should not generally Cognitive Modeling, Connectionist be thought of as corresponding to individual neurons, because there are far fewer units in most simulations than neurons in the relevant brain regions, and because some of the properties Connectionist cognitive modeling is an approach to under- of the units used may not be exactly neuron-like. standing the mechanisms of human cognition through the In connectionist systems, an active mental representa- use of simulated networks of simple, neuronlike processing tion, such as a precept, is a pattern of activation over the set units. Connectionist models are most often applied to what of processing units in the model. Processing takes place via might be called natural cognitive tasks. These tasks include the propagation of activation among the units, via weighted perceiving the world of objects and events and interpreting it connections. The “knowledge” that governs processing con- for the purpose of organized behavior; retrieving contextu- sists of the values of the connection weights, and learning ally appropriate information from memory; perceiving and occurs through the gradual adaptation of the connection understanding language; and what might be called intuitive weights, which occur as a result of activity in the network, or implicit reasoning, in which an inference is derived or a sometimes taken together with “error” signals, either in the solution to a problem is discovered without the explicit form of a success or failure signal (cf. REINFORCEMENT application of a predefined ALGORITHM. 
Because connec- LEARNING) or an explicit computation of the mismatch tionist models capture cognition at a microstructural level, a between obtained results and some “teaching” signal (cf., more succinct characterization of a cognitive process—espe- error correction learning, back propagation). cially one that is temporally extended or involves explicit, verbal reasoning—can sometimes be given through the use Perception of a more symbolic modeling framework. However, many connectionists hold that a connectionist microstructure Perception is a highly context-dependent process. Individual underlies all aspects of human cognition, and a connectionist stimulus elements may be highly ambiguous, but when con- approach may well be necessary to capture the supreme sidered in light of other elements present in the input, achievements of human reasoning and problem solving, to together with knowledge of patterns of co-occurrence of ele- the extent that such achievements arise from sudden insight, ments, there may be a single, best interpretation. Such is the implicit reasoning, and/or imagining, as opposed to algorith- case with the famous Dalmatian dog figure, shown in figure mic derivation. See CONNECTIONISM, PHILOSOPHICAL ISSUES 1. An early connectionist model that captured the joint role of 138 Cognitive Modeling, Connectionist ing place through the interactions of simple processing units, just as in the case of perception. One can think of recall as a process of constructing a pattern of activation that is taken by the recaller to reflect not the present input to the senses, but some pattern previously experienced. Central to this view is the idea that recall is prone to a variety of influences that often help us fill in missing details but which are not always bound to fill in correct information. 
An early model of memory retrieval (McClelland 1981) showed how multiple items in memory can become partially activated, thereby filling in missing information, when memory is probed. The partial activation is based on similarity of the item in memory to the probe and to the information initially retrieved in response to the probe. Similarity-based generalization appears to be ubiquitous, and it can often lead to correct inference, but this is far from guaranteed, and indeed subsequent work by Nystrom and McClelland (1992) showed how connectionist networks can lead to blend errors in recall. Connectionist models have also been applied productively to aspects of concept learning (Gluck and Bower 1988), prototype formation (Knapp and Anderson 1984; McClelland and Rumelhart 1985) and the acquisition of conceptual representations of concepts (Rumelhart and Todd 1993). A crucial aspect of this latter work is the demonstration that connectionist models trained with back propagation can learn what basis to use for representations of concepts, so that similarity-based generalization can be based on deep or structural rather than superficial aspects of similarity (Hinton 1989; see McClelland 1994 for discussion).
Connectionist models also address the distinction between explicit and implicit memory. Implicit memory refers to an aftereffect of experience with an item in a task that does not require explicit reference to the prior occurrence of the item. These effects often occur without any recollection of having previously seen the item. Connectionist models account for such findings in terms of the adjustments of the strengths of the connections among the units in networks responsible for processing the stimuli (McClelland and Rumelhart 1985; Becker et al. 1997). Explicit memory for recent events and experiences may be profoundly impaired in individuals who show normal implicit learning (Squire 1992), suggesting a special brain system may be required for the formation of new explicit memories. A number of connectionist models have been proposed in an effort to explain how and why these effects occur (Murre 1997; Alvarez and Squire 1994; McClelland, McNaughton, and O'Reilly 1995).

Figure 2. A fraction of the units and connections from the interactive activation model of visual word perception. Reprinted with permission from figure 3 of McClelland and Rumelhart 1981.

stimulus and context information in perception was the interactive activation model (McClelland and Rumelhart 1981). This model contained units for familiar words, for letters in each position within the words, and for features of letters in each position (fig. 2). Mutually consistent units had mutually excitatory connections (e.g., the unit for T in the first position had a mutually excitatory connection with the units for words beginning with T, such as TIME, TAPE, etc.). Mutually inconsistent units had mutually inhibitory connections (e.g., there can only be one letter per position, so the units for all of the letters in a given position are mutually inhibitory). Simulations of perception as occurring through the excitatory and inhibitory interactions among these units have led to a detailed account of a large body of psychological evidence on the role of context in letter perception. The interactive activation model further addresses the fact that perceptual enhancement also occurs for novel, wordlike stimuli such as MAVE. The presentation of an item like MAVE produces partial activation of a number of word units (such as SAVE, GAVE, MAKE, MOVE, etc.). Each of these provides a small amount of feedback support to the units for the letters it contains, with the outcome that the letters in items like MAVE receive almost as much feedback as letters in actual words.
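The excitatory and inhibitory interactions described for the interactive activation model can be illustrated with a deliberately reduced sketch. It is not the published model: the feature level is omitted, letter units are not updated dynamically, and the six-word lexicon, the parameter values, and the function name `interactive_activation` are all assumptions chosen for illustration. It shows only the qualitative point made in the text, that a wordlike nonword such as MAVE partially activates several word units, which in turn send feedback to its letters.

```python
# Toy lexicon standing in for the model's word units (an assumption of this sketch).
WORDS = ["SAVE", "GAVE", "MAKE", "MOVE", "TIME", "TAPE"]


def interactive_activation(stimulus, steps=20, excite=0.07, inhibit=0.04,
                           word_inhibit=0.05, decay=0.1,
                           ceiling=1.0, floor=-0.2):
    """Settle word-unit activations for a four-letter stimulus.

    Returns (word_activation, letter_feedback), where letter_feedback[i] is the
    top-down support reaching the stimulus letter in position i.
    All parameter values are illustrative, not fitted to data.
    """
    act = {w: 0.0 for w in WORDS}
    for _ in range(steps):
        updated = {}
        for w in WORDS:
            matches = sum(1 for a, b in zip(w, stimulus) if a == b)
            mismatches = len(stimulus) - matches
            # Consistent letters excite the word unit; inconsistent ones inhibit it.
            bottom_up = excite * matches - inhibit * mismatches
            # Word units compete: each active rival sends inhibition.
            rivals = sum(max(act[v], 0.0) for v in WORDS if v != w)
            net = bottom_up - word_inhibit * rivals
            if net > 0:
                change = net * (ceiling - act[w])
            else:
                change = net * (act[w] - floor)
            # Activation also decays toward a resting level of zero.
            updated[w] = act[w] + change - decay * act[w]
        act = updated
    # Each active word feeds support back to the stimulus letters it contains.
    feedback = [
        sum(max(act[w], 0.0) for w in WORDS if w[i] == stimulus[i])
        for i in range(len(stimulus))
    ]
    return act, feedback


words, feedback = interactive_activation("MAVE")
for w in sorted(words, key=words.get, reverse=True):
    print(f"{w}: {words[w]:+.3f}")
print("feedback to M, A, V, E:", [round(f, 3) for f in feedback])
```

In this sketch every letter of MAVE receives some feedback even though MAVE is not in the lexicon, whereas a string that shares no letters with any word receives none.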
Stochastic versions of the interactive activation model overcome empirical shortcomings of the original version (McClelland 1991).
Other connectionist models have investigated issues in the perception of spoken language (McClelland and Elman 1986), in VISUAL OBJECT RECOGNITION, AI (Hummel and Biederman 1992), and in the interaction of perceptual and attentional processes (Phaf, van der Heijden, and Hudson 1990; Mozer 1987; Mozer and Behrmann 1990).
Memory and Learning
A fundamental assumption of connectionist models of MEMORY is that memory is inherently a constructive process, tak-

Language and Reading
Connectionist models have suggested a clear alternative to the notion that knowledge of language must be represented as a system of explicit (though inaccessible) rules, and have presented mechanisms of morphological inflection, spelling-sound conversion, and sentence processing and comprehension that account for important aspects of the psychological phenomena of language that have been ignored by traditional accounts. Key among the phenomena not captured by traditional, rule-based approaches have been the existence of quasi-regular structure, and the sensitivity of language behavior to varying degrees of frequency and consistency. While all approaches acknowledge the existence of exceptions, traditional approaches have failed to take account of the fact that the exceptions are far from a random list of completely arbitrary items. Exceptions to the regular past tense of English, for example, come in clusters that share phonological characteristics (e.g., weep-wept, sleep-slept, sweep-swept, creep-crept) and quite frequently have elements in common with the “regular” past tense (/d/ or /t/, like their “regular” counterparts). An early connectionist model of Rumelhart and McClelland (1986) showed that a network model that learned connection weights to generate the past tense of a word from its present tense could capture a number of aspects of the acquisition of the past tense. Critiques of aspects of this model (Pinker and Prince 1988; Lachter and Bever 1988) raised a number of objections, but subsequent modeling work (MacWhinney and Leinbach 1991; Plunkett and Marchman 1993) has addressed many of the criticisms. Debate still revolves around the need to assume that explicit, inaccessible rules arise at some point in the course of normal development (Pinker 1991). A similar debate has arisen in the domain of word reading (see Coltheart et al. 1993; Plaut et al. 1996).
Connectionist approaches have also been used to account for aspects of language comprehension and production. Connectionists suggest that language processing is a constraint-satisfaction process sensitive to semantic and contextual factors as well as syntactic constraints (Rumelhart 1977; McClelland 1987). Considerable evidence (Taraban and McClelland 1988; MacDonald, Pearlmutter, and Seidenberg 1994; Tanenhaus et al. 1995) now supports the constraint-satisfaction position, and a model that takes joint effects of content and sentence structure into account has been implemented (St. John and McClelland 1990). In production, evidence supporting a constraint-satisfaction approach to the generation of the sounds of a word has led to interactive connectionist models of word production (Dell 1986; Dell et al. forthcoming). Another, very important direction of connectionist work in the area of language centers on the learning of the grammatical structure of sentences in a class of connectionist networks known as the simple recurrent net (Elman 1990). Such networks could learn to become sensitive to long-distance dependencies characteristic of sentences with embedded clauses, suggesting that there may not be a need to posit explicit, inaccessible rules to account for human knowledge of syntax (Elman 1991; Servan-Schreiber, Cleeremans, and McClelland 1991; Rohde and Plaut forthcoming). However, existing models have been trained on very small “languages,” and successes with larger language corpora, as well as demonstrations of sensitivity to additional aspects of syntax, are needed.
Reasoning and Problem Solving
While connectionist models have had considerable success in many areas of cognition, their full promise for addressing higher level aspects of cognition, such as reasoning and problem solving, remains to be fully realized. A number of papers point toward the prospect of connectionist models in these areas (Rumelhart et al. 1986; Rumelhart 1989) without full implementations, perhaps in part because higher level cognition often has a temporally extended character, not easily captured in a single settling of a network to an attractor state. Though there have been promising developments in the use of RECURRENT NETWORKS to model temporally extended aspects of cognition, many researchers have opted for “hybrid” models. These models often rely on external, more traditional modeling frameworks to assign units and connections so that appropriate constraint-satisfaction processes can then be carried out in the connectionist component. This approach has been used in the domain of analogical reasoning (Holyoak and Thagard 1989). A slightly different approach, suggested by Rumelhart (1989), assumes that concepts are represented by distributed patterns of activity that capture both their superficial and their deeper conceptual and relational features. Discovering an analogy then consists of activating the conceptual and relational features of the source concept, which may then settle to an attractor state consisting of an analog in another domain that shares these same deep features but differs in superficial details.
Many researchers in this area view the “binding problem” (the assignment of arbitrary content to a slot in a structural description) as a fundamental problem to be solved in the implementation of connectionist models of reasoning, and several solutions have been proposed (Smolensky, Legendre, and Miyata forthcoming; Shastri and Ajjanagadde 1993; Hummel and Holyoak 1997). However, networks can learn to create their own slots so that they can carry out natural inferences in familiar content areas (St. John 1992). Whether learning mechanisms can yield a general enough implementation to capture people's ability to reason in unfamiliar domains remains to be determined.
See also COMPUTATION AND THE BRAIN; COMPUTATIONAL THEORY OF MIND; CONNECTIONIST APPROACHES TO LANGUAGE; IMPLICIT VS. EXPLICIT MEMORY; RULES AND REPRESENTATIONS; VISUAL WORD RECOGNITION
— James L. McClelland
References
Alvarez, P., and L. R. Squire. (1994). Memory consolidation and the medial temporal lobe: a simple network model. Proceedings of the National Academy of Sciences, USA 91: 7041–7045.
Becker, S., M. Moscovitch, M. Behrmann, and S. Joordens. (1997). Long-term semantic priming: a computational account and empirical evidence. Journal of Experimental Psychology: Learning, Memory, and Cognition 23: 1059–1082.
Coltheart, M., B. Curtis, P. Atkins, and M. Haller. (1993). Models of reading aloud: dual-route and parallel-distributed-processing approaches. Psychological Review 100: 589–608.
Dell, G. S. (1986). A spreading-activation theory of retrieval in sentence production. Psychological Review 93: 283–321.
Dell, G. S., M. F. Schwartz, N. Martin, E. M. Saffran, and D. A. Gagnon. (forthcoming). Lexical access in normal and aphasic speakers.
Elman, J. L. (1990). Finding structure in time. Cognitive Science 14: 179–211.
Elman, J. L. (1991). Distributed representations, simple recurrent networks, and grammatical structure. Machine Learning 7: 194–220.
Murre, J. M. (1997). Implicit and explicit memory in amnesia: some explanations and predictions of the tracelink model. Memory 5: 213–232.
Nystrom, L. E., and J. L. McClelland. (1992). Trace synthesis in cued recall. Journal of Memory and Language 31: 591–614.
Phaf, R. H., A. H. C. van der Heijden, and P. T. W. Hudson. (1990). SLAM: A connectionist model for attention in visual selection tasks. Cognitive Psychology 22: 273–341.
Pinker, S. (1991). Rules of language. Science 253: 530.
Pinker, S., and A. Prince. (1988). On language and connectionism: analysis of a parallel distributed processing model of language acquisition. Cognition 28: 73–193.
Plaut, D. C., J. L. McClelland, M. S. Seidenberg, and K. E. Patterson. (1996). Understanding normal and impaired word reading: computational principles in quasi-regular domains. Psychological Review 103: 56–115.
Gluck, M. A., and G. H. Bower. (1988). Evaluating an adaptive network model of human learning. Journal of Memory and Language 27: 166–195.
Hinton, G. E. (1989). Learning distributed representations of concepts. In R. G. M. Morris, Ed., Parallel Distributed Processing: Implications for Psychology and Neurobiology. Oxford, England: Clarendon Press, pp. 46–61.
Holyoak, K. J., and P. Thagard. (1989). Analogical mapping by constraint satisfaction. Cognitive Science 13: 295–356.
Hummel, J. E., and I. Biederman. (1992). Dynamic binding in a neural network for shape recognition. Psychological Review 99: 480–517.
Hummel, J. E., and K. J. Holyoak. (1997). Distributed representations of structure: a theory of analogical access and mapping. Psychological Review 104: 427–466.
Knapp, A., and J. A. Anderson. (1984). A signal averaging model
for concept formation. Journal of Experimental Psychology: Learning, Memory, and Cognition 10: 617–637.
Lachter, J., and T. G. Bever. (1988). The relation between linguistic structure and theories of language learning: a constructive critique of some connectionist learning models. Cognition 28: 195–247.
MacDonald, M. C., N. J. Pearlmutter, and M. S. Seidenberg. (1994). The lexical nature of syntactic ambiguity resolution. Psychological Review 101: 676–703.
MacWhinney, B., and J. Leinbach. (1991). Implementations are not conceptualizations: revising the verb learning model. Cognition 40: 121–153.
Marr, D. (1982). Vision. W. H. Freeman.
McClelland, J. L. (1981). Retrieving general and specific information from stored knowledge of specifics. In Proceedings of the Third Annual Conference of the Cognitive Science Society. Berkeley, CA, pp. 170–172.
McClelland, J. L. (1987). The case for interactionism in language processing. In M. Coltheart, Ed., Attention and Performance XII: The Psychology of Reading. London: Erlbaum, pp. 1–36.
McClelland, J. L. (1991). Stochastic interactive activation and the effect of context on perception. Cognitive Psychology 23: 1–44.
McClelland, J. L. (1994). The interaction of nature and nurture in development: a parallel distributed processing perspective. In P. Bertelson, P. Eelen, and G. D'Ydewalle, Eds., International Perspectives on Psychological Science, vol. 1: Leading Themes. Hillsdale, NJ: Erlbaum, pp. 57–88.
McClelland, J. L., and J. L. Elman. (1986). The TRACE model of speech perception. Cognitive Psychology 18: 1–86.
McClelland, J. L., B. L. McNaughton, and R. C. O'Reilly. (1995). Why there are complementary learning systems in the hippocampus and neocortex: insights from the successes and failures of connectionist models of learning and memory. Psychological Review 102: 419–457.
McClelland, J. L., and D. E. Rumelhart. (1981). An interactive activation model of context effects in letter perception: Part 1: an account of basic findings. Psychological Review 88: 375–407.
McClelland, J. L., and D. E. Rumelhart. (1985). Distributed memory and the representation of general and specific information. Journal of Experimental Psychology: General 114: 159–188.
Mozer, M. C. (1987). Early parallel processing in reading: a connectionist approach. In M. Coltheart, Ed., Attention and Performance XII: The Psychology of Reading. London: Erlbaum, pp. 83–104.
Mozer, M. C., and M. Behrmann. (1990). On the interaction of selective attention and lexical knowledge: a connectionist account of neglect dyslexia. Journal of Cognitive Neuroscience 2: 96–123.
Plunkett, K., and V. A. Marchman. (1993). From rote learning to system building: acquiring verb morphology in children and connectionist nets. Cognition 48: 21–69.
Rohde, D., and D. C. Plaut. (forthcoming). Simple Recurrent Networks and Natural Language: How Important is Starting Small? Pittsburgh, PA: Center for the Neural Basis of Cognition, Carnegie Mellon and the University of Pittsburgh.
Rumelhart, D. E. (1977). Toward an interactive model of reading. In S. Dornic, Ed., Attention and Performance VI. Hillsdale, NJ: Erlbaum.
Rumelhart, D. E. (1989). Toward a microstructural account of human reasoning. In S. Vosniadou and A. Ortony, Eds., Similarity and Analogical Reasoning. New York: Cambridge University Press, pp. 298–312.
Rumelhart, D. E., and J. L. McClelland. (1986). On learning the past tenses of English verbs. In J. L. McClelland, D. E. Rumelhart, and the PDP Research Group, Eds., Parallel Distributed Processing: Explorations in the Microstructure of Cognition, vol. 2. Cambridge, MA: MIT Press, pp. 216–271.
Rumelhart, D. E., P. Smolensky, J. L. McClelland, and G. E. Hinton. (1986). Schemata and sequential thought processes in PDP models. In J. L. McClelland, D. E. Rumelhart, and the PDP Research Group, Eds., Parallel Distributed Processing: Explorations in the Microstructure of Cognition, vol. 2. Cambridge, MA: MIT Press, pp. 7–57.
Rumelhart, D. E., and P. M. Todd. (1993). Learning and connectionist representations. In D. E. Meyer and S. Kornblum, Eds., Attention and Performance XIV: Synergies in Experimental Psychology, Artificial Intelligence and Cognitive Neuroscience. Cambridge, MA: MIT Press, pp. 3–30.
Servan-Schreiber, D., A. Cleeremans, and J. L. McClelland. (1991). Graded state machines: the representation of temporal contingencies in simple recurrent networks. Machine Learning 7: 161–193.
Shastri, L., and V. Ajjanagadde. (1993). From simple associations to systematic reasoning: a connectionist representation of rules, variables and dynamic bindings using temporal synchrony. Behavioral and Brain Sciences 16: 417–494.
Smolensky, P., G. Legendre, and Y. Miyata. (forthcoming). Principles for an Integrated Connectionist/Symbolic Theory of Higher Cognition. Hillsdale, NJ: Erlbaum. (Also available as Technical Report CU-CS-600-92. Boulder: Computer Science Department and Institute of Cognitive Science, University of Colorado at Boulder.)
Squire, L. R. (1992). Memory and the hippocampus: a synthesis from findings with rats, monkeys and humans. Psychological Review 99: 195–231.
St. John, M. F. (1992). The story gestalt: a model of knowledge-intensive processes in text comprehension. Cognitive Science 16: 271–306.
St. John, M. F., and J. L. McClelland. (1990). Learning and applying contextual constraints in sentence comprehension. Artificial Intelligence 46: 217–257.
Tanenhaus, M. K., M. J. Spivey-Knowlton, K. M. Eberhard, and J. C. Sedivy. (1995).

structures may be composed and interpreted, including structures that denote executable processes.
The extent to which symbolic processing is required for explaining cognition, and the extent to which connectionist
models have symbolic properties, has been the topic of ongoing debates in cognitive science (Fodor and Pylyshyn 1988; Rumelhart 1989; Simon 1996; see COGNITIVE MODELING, CONNECTIONIST). Much of the debate has turned on the question of whether or not particular connectionist systems are able to compose and interpret novel structures. In particular, Fodor and Pylyshyn argue that any valid cognitive theory must have the properties of productivity and systematicity. Productivity refers to the ability to produce and entertain an unbounded set of novel propositions with finite means. Systematicity is most easily seen in linguistic processing, and refers to the intrinsic connection between our ability to produce and comprehend certain linguistic forms. For example, no speaker of English can understand the utterance “John loves the girl” without also being able to understand “the girl loves John,” or any other utterance from the unbounded set of utterances of the form “X loves Y.” Both productivity and systematicity point to the need to posit underlying abstract structures that can be freely composed, instantiated with novel items, and interpreted on the basis of their structure.
A variety of empirical constraints may be brought to bear on cognitive models. These include: basic functionality requirements (a model must actually perform the task to some approximation if it is to be veridical); data from verbal protocols of human subjects thinking aloud while PROBLEM SOLVING (these reveal intermediate cognitive steps that may be aligned with the model’s behavior; Newell and Simon 1972; Ericsson and Simon 1984); chronometric data (such data can constrain a cognitive model once assumptions are made about the time course of the component computational processes; Newell 1990); eye movement data (eye fixation durations are a function of cognitive, as well as perceptual, complexity; Just and Carpenter 1987; Rayner 1977); error patterns; and data on learning rates and transfer of cognitive skill (such data constrain the increasing number of cognitive models that are able to change behavior over time; Singley and Anderson 1989).
Though the problem of under-constraining data is a universal issue in science, it is sometimes thought to be particularly acute in computational cognitive modeling, despite the variety of empirical constraints described above. There are two related sides to the problem. First, cognitive models are often seen as making many detailed commitments about aspects of processing for which no data distinguish among alternatives. Second, because of the universality of computational frameworks, an infinite number of programs can be created that mimic the desired behavior (Anderson 1978). Theorists have responded to these problems in a variety of ways.

Integration of visual and linguistic information in spoken language comprehension. Science 268: 1632–1634.
Taraban, R., and J. L. McClelland. (1988). Constituent attachment and thematic role assignment in sentence processing: influences of content-based expectations. Journal of Memory and Language 27: 597–632.

Cognitive Modeling, Symbolic
Symbolic cognitive models are theories of human cognition that take the form of working computer programs. A cognitive model is intended to be an explanation of how some aspect of cognition is accomplished by a set of primitive computational processes. A model performs a specific cognitive task or class of tasks and produces behavior that constitutes a set of predictions that can be compared to data from human performance. Task domains that have received considerable attention include problem solving, language comprehension, memory tasks, and human-device interaction.
The scientific questions cognitive modeling seeks to answer belong to cognitive psychology, and the computational techniques are often drawn from artificial intelligence. Cognitive modeling differs from other forms of theorizing in psychology in its focus on functionality and computational completeness. Cognitive modeling produces both a theory of human behavior on a task and a computational artifact that performs the task.
The theoretical foundation of cognitive modeling is the idea that cognition is a kind of COMPUTATION (see also COMPUTATIONAL THEORY OF MIND). The claim is that what the mind does, in part, is perform cognitive tasks by computing. (This does not mean that the computer is a metaphor for the mind, or that the architectures of modern digital computers can give us insights into human mental architecture.) If this is the case, then it must be possible to explain cognition as a dynamic unfolding of computational processes. A cognitive model cast as a computer program is a precise description of what those processes are and how they develop over time to realize some task.
A cognitive model is considered to be a symbolic cognitive model if it has the properties of a symbolic system in the technical sense of Newell and Simon’s (1976) physical symbol system hypothesis (PSSH). The PSSH provides a hypothesis about the necessary and sufficient conditions for a physical system to realize intelligence. It is a reformulation of Turing computation (see CHURCH-TURING THESIS) that identifies symbol processing as the key requirement for complex cognition. The requirement is that the system be
capable of manipulating and composing symbols and symbol structures—physical patterns with associated processes that give the patterns the power to denote either external entities or other internal symbol structures (Newell 1980; Newell 1990; Pylyshyn 1989; Simon 1996). One of the distinguishing characteristics of symbol systems is that novel

vides accounts of many phenomena surrounding the recognition and recall of verbal material (e.g., the fan effect), and regularities in problem-solving strategies (Anderson 1993; Anderson and Lebiere forthcoming). EPAM is one of the earliest computational models in psychology and accounts for a significant body of data in the learning and high-level perception of verbal material. It has been compared in some detail to related connectionist accounts (Richman and Simon 1989). SOAR is a learning architecture that has been applied to domains ranging from rapid, immediate tasks such as typing and video game interaction (John, Vera and Newell 1994) to long stretches of problem-solving behavior (Newell 1990), building on the earlier analyses by Newell and Simon (1972). SOAR has also served as the foundation for a detailed theory of sentence processing, which models both the rapid on-line effects of semantics and context, as well as subtle effects of syntactic structure on processing difficulty across several typologically distinct languages (Lewis 1996, forthcoming). EPIC is a recent architecture that combines a parallel production system with models of peripheral and motor components, and accounts for a substantial body of data in the performance of dual cognitive tasks (Meyer and Kieras 1997). CAPS is a good example of recent efforts in symbolic modeling to account for individual differences in cognitive behavior. CAPS explains differences in language

One way is to adopt different levels of abstraction in the theoretical statements: in short, not all the details of the computer model are part of the theory. NEWELL (Newell 1990; Newell et al. 1991), MARR (1982), PYLYSHYN (1984) and others have developed frameworks for specifying systems at multiple levels of abstraction. The weakest possible correspondence between a model and human cognition is at the level of input/output: if the model only responds to functionality constraints, it is intended only as a sufficiency demonstration and formal task definition (Pylyshyn 1984, 1989). The strongest kind of correspondence requires that the model execute the same algorithm (take the same intermediate computational steps) as human processing. (No theoretical interpretation of a cognitive model, not even the strongest, depends on the hardware details of the host machine.)
An important method for precisely specifying the intended level of abstraction is the use of programming languages designed for cognitive modeling, such as PRODUCTION SYSTEMS. Production systems were introduced by Newell (1973) as a flexible model of the control structure of human cognition. The flow of processing is not controlled by a fixed program or procedure laid out in advance, as is the case in standard procedural programming languages. Instead, production systems posit a set of independent production rules (condition-action pairs) that may fire any time their conditions are satisfied. The flow of control is therefore determined at run time, and is a function of the dynamically evolving contents of the working memory that triggers the productions. A cognitive model written in a production system makes theoretical commitments at the level of the production rules, and defines a computationally complete system at that level. The particular underlying implementation (e.g., LISP or Java) is theoretically irrelevant.
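The control regime described above, in which condition-action rules fire whenever working memory satisfies their conditions, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the mechanism of any architecture named in this entry: the `Production` class, the `run` function, the tea-making rules, and the conflict-resolution policy (fire the first listed rule that would add something new) are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Set, Tuple


@dataclass(frozen=True)
class Production:
    """A condition-action pair over a working memory of symbolic elements."""
    name: str
    condition: Callable[[Set[str]], bool]
    action: Callable[[Set[str]], Set[str]]  # returns the elements to add


def run(productions: List[Production], initial_memory: Set[str],
        max_cycles: int = 20) -> Tuple[Set[str], List[str]]:
    """Repeat the recognize-act cycle until no rule can add anything new."""
    memory = set(initial_memory)
    trace: List[str] = []
    for _ in range(max_cycles):
        for rule in productions:
            if rule.condition(memory):
                additions = rule.action(memory) - memory
                if additions:
                    # Assumed conflict resolution: first productive match fires.
                    memory |= additions
                    trace.append(rule.name)
                    break
        else:
            break  # quiescence: no rule changed working memory this cycle
    return memory, trace


# Rules are listed in an order unrelated to the order in which they will fire.
rules = [
    Production("drink", lambda m: "tea-brewed" in m, lambda m: {"goal-satisfied"}),
    Production("brew", lambda m: {"water-boiled", "have-teabag"} <= m,
               lambda m: {"tea-brewed"}),
    Production("boil", lambda m: {"goal-tea", "have-kettle"} <= m,
               lambda m: {"water-boiled"}),
]

final_memory, trace = run(rules, {"goal-tea", "have-kettle", "have-teabag"})
print(trace)  # ['boil', 'brew', 'drink']
```

The firing order is fixed by the evolving contents of working memory rather than by the order in which the rules are written, which is the sense in which control is determined at run time.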
A complementary approach to reducing theoretical degrees of freedom is to apply the same model with minimal variation to a wide range of tasks. Each new task is not an unrelated pool of data to be arbitrarily fitted with a new model or with new parameters. For example, a computational model of short-term memory that accounts for immediate serial recall should also apply, with minimal strategy variations, to free recall tasks and recognition tasks as well (Anderson and Matessa 1997).
Recent cognitive modeling research combines these approaches by building and working with cognitive architectures. A COGNITIVE ARCHITECTURE posits a fixed set of computational mechanisms and resources that putatively underlie a wide range of human cognition. As these cognitive architectures never correspond to the architectures of modern computers (for example, they may demand a higher degree of parallelism), the architectures must first be emulated on computers before cognitive models can be built within them for specific tasks. Such architectures, together with the variety of empirical constraints outlined above, place considerable constraint on task models.
Examples of the architectural approach include ACT-R (Anderson 1993), CAPS (Just and Carpenter 1992), SOAR (Newell 1990), EPAM (Feigenbaum and Simon 1984), and EPIC (Meyer and Kieras 1997). (All are production systems, with the exception of EPAM.) These architectures have collectively been applied to a broad set of phenomena in cognitive psychology. For example, Anderson and colleagues (Anderson 1993; Singley and Anderson 1989) have demonstrated that a production rule analysis of cognitive skill, along with the learning mechanisms posited in the ACT architecture, provide detailed and explanatory accounts of a range of regularities in cognitive skill acquisition in complex domains such as learning to program LISP. ACT also pro-

comprehension performance by appeal to differences in working memory capacity (Just and Carpenter 1992). Polk and Newell (1995) developed a constrained parametric model of individual differences in syllogistic reasoning that provides close fits to particular individuals by making different assumptions about the way they interpret certain linguistic forms (see also Johnson-Laird and Byrne 1991).
In short, modern symbolic cognitive modeling is characterized by detailed accounts of chronometric data and error patterns; explorations of the explanatory role of the same basic architectural components across a range of cognitive tasks; attempts to clearly distinguish the contributions of relatively fixed architecture and more plastic task strategies and background knowledge; and attempts to explicitly deal with the problem of theoretical degrees of freedom. The underlying goal of all these approaches is to produce more unified accounts of cognition explicitly embodied in computational mechanisms.
See also KNOWLEDGE-BASED SYSTEMS; KNOWLEDGE REPRESENTATION; RULES AND REPRESENTATIONS
—Richard L. Lewis
References
Anderson, J. R. (1978). Arguments concerning representations for mental imagery. Psychological Review 85: 249–277.
Anderson, J. R. (1993). The Adaptive Character of Thought. Hillsdale, NJ: Erlbaum.
Anderson, J. R., and M. Matessa. (1997). A production system theory of serial memory. Psychological Review 104(4): 728–748.
Anderson, J. R., and C. Lebiere. (Forthcoming). ACT: Atomic Components of Thought. Hillsdale, NJ: Erlbaum.
Ericsson, K. A., and H. A. Simon. (1984). Protocol Analysis: Verbal Reports as Data. Cambridge, MA: MIT Press.
Feigenbaum, E., and H. A. Simon. (1984). EPAM-like Models of Recognition and Learning. Cognitive Science 8: 305–336.
Fodor, J. A., and Z. W. Pylyshyn. (1988). Connectionism and cognitive architecture: A critical analysis. Cognition 28: 3–71.
John, B. E., A. H. Vera, and A. Newell. (1994). Toward real-time GOMS: A model of expert behavior in a highly interactive task. Behavior and Information Technology 13: 255–267.
Johnson-Laird, P. N., and R. M. J. Byrne. (1991). Deduction. Hillsdale, NJ: Erlbaum.
Just, M. A., and P. A. Carpenter. (1987). The Psychology of Reading and Language Comprehension. Boston: Allyn and Bacon.
Just, M. A., and P. A. Carpenter. (1992). A capacity theory of comprehension: Individual differences in working memory. Psychological Review 99 (1): 122–149.
Lewis, R. L. (1996). Interference in short-term memory: The magical number two (or three) in sentence processing. Journal of Psycholinguistic Research.
Lewis, R. L. (Forthcoming). Cognitive and Computational Foundations of Sentence Processing. Oxford: Oxford University Press.
Marr, D. (1982). Vision. New York: Freeman.
Meyer, D. E., and D. E. Kieras. (1997). A computational theory of executive cognitive processes and multiple-task performance: Part 1: basic mechanisms. Psychological Review 104(1): 3–65.
Newell, A. (1973). Production systems: models of control structures. In W. G. Chase, Ed., Visual Information Processing. San Diego, CA: Academic Press, pp. 463–526.
Newell, A. (1980). Physical symbol system. Cognitive Science 4: 135–183.
Newell, A. (1990). Unified Theories of Cognition. Cambridge, MA: Harvard University Press.
Newell, A., and H. A. Simon. (1972). Human Problem Solving. Englewood Cliffs, NJ: Prentice-Hall.
Newell, A., and H. A. Simon. (1976). Computer science as empirical inquiry: symbols and search. Communications of the ACM 19: 113–126.
Newell, A., G. Yost, J. E. Laird, P. S. Rosenbloom, and E. Altmann. (1991). Formulating the problem-space computational model. In R. F. Rashid, Ed., CMU Computer Science: A 25th Anniversary Commemorative. Reading, MA: Addison-Wesley.
Polk, T. A., and A. Newell. (1995). Deduction as verbal reasoning. Psychological Review 102 (3): 533–566.
Pylyshyn, Z. W. (1984). Computation and Cognition. Cambridge, MA: Bradford/MIT Press.
Pylyshyn, Z. W. (1989). Computing in cognitive science. In M. I. Posner, Ed., Foundations of Cognitive Science. Cambridge, MA: MIT Press.
Rayner, K. (1977). Visual attention in reading: eye movements reflect cognitive processes. Memory and Cognition 4: 443–448.
Richman, H. B., and H. A. Simon. (1989). Context effects in letter perception: comparison of two theories. Psychological Review 96 (3): 417–432.
Rumelhart, D. E. (1989). The architecture of mind: a connectionist approach. In M. I. Posner, Ed., Foundations of Cognitive Science. Cambridge, MA: MIT Press.
Simon, H. A. (1996). The patterned matter that is mind. In D. Steier and T. Mitchell, Eds., Mind Matters: Contributions to Cognitive and Computer Science in Honor of Allen Newell. Hillsdale, NJ: Erlbaum.
Singley, M. K., and J. R. Anderson. (1989). The transfer of cognitive skill. Cambridge, MA: Harvard University Press.

color categorization consists of the division of the color sensations into classes by the perceptual processes of an organism—human or nonhuman, adult or neonate, possessed of knowledge of a language or not so possessed. Conflict among views on the relationship of lexical to perceptual color categorization has prevailed for over a century. Nineteenth-century classicists, anthropologists, and ophthalmologists were aware that all languages do not reflect identical lexical classifications of color. The classicist (and statesman) William Gladstone concluded that differences in color lexicons reflect differences in perceptual abilities, for example, “that the organ of color and its impressions were but partially developed among the Greeks of the heroic age” (see Berlin and Kay 1969:135). The ophthalmologist Hugo Magnus recognized that failure to distinguish colors lexically need not indicate inability to distinguish them perceptually (see Berlin and Kay 1969: 144ff). These and other late nineteenth-century scholars strongly tended to view differences in color lexicons in evolutionary terms.
In the 1920s, 1930s, and 1940s, Edward SAPIR (e.g., 1921: 219) and B. L. Whorf (e.g., 1956 [1940]: 212ff) rejected evolutionism for the doctrine of radical linguistic and cultural relativity. The favorite field for the empirical establishment and rhetorical defense of the relativist view, which became established doctrine in the 1950s and 1960s, was the lexicon of color. With respect to color categorization, there have been two major traditions of research stemming from the relativity thesis: a within-language, correlational line of research and a cross-language, descriptive one.
Early work in the former tradition (e.g., Brown and Lenneberg 1954; Lenneberg and Roberts 1956) is primarily concerned with establishing a correlation between a linguistic variable distinguishing colors (for example, how easy different colors are to name or how easy they are to communicate about) and a nonlinguistic cognitive variable over colors: memorability. Discovery of such a correlation was interpreted as support for the Sapir-Whorf view that linguistic categorization can influence nonlinguistic perception/cognition. In the 1950s and 1960s, such correlations were reported within English and, to a limited extent, in other languages (Stefflre, Castillo Vales, and Morley 1966). Because it was assumed at the time that the linguistic variable (codability or communication accuracy) would vary across languages, correlation between a linguistic and nonlinguistic variable within a single language (almost always English) was taken to validate the doctrine that the coding systems of different languages induce differences in the nonlinguistic cognition of their speakers. Eleanor Rosch (e.g., Heider 1972) challenged this assumption on the basis of the apparent universal lexical salience of certain “focal” colors (iden-
tified by Berlin and Kay 1969). Rosch showed that universal perceptual salience determines both the nonlinguistic and the linguistic variables of the correlational approach, thus Color Categorization undercutting the logic of this line of research. Rosch’s view was criticized by Lucy and Shweder (1979), who also chal- Lexical color categorization consists of the division of lenged her experimental procedure; Lucy and Shweder’s color sensations into classes corresponding to the signifi- experimental procedure was in turn challenged by Kay and cata of the color words of a particular language. Perceptual Kempton (1984), who supported Rosch’s view of the matter. 144 Color Categorization (Kay and Kempton, using a noncorrelational, cross-linguis- References tic experimental procedure, showed that certain nonlinguis- Berlin, B., and E. A. Berlin. (1975). Aguaruna color categories. tic color classification judgments may be influenced by the American Ethnologist 2: 61–87. lexical classification of color in a language, while others are Berlin, B., and P. Kay. (1969). Basic Color Terms: Their Univer- not so influenced, thus re-establishing limited Whorfian sality and Evolution. Berkeley: University of California. effects in the color domain.) Brown, R. W. (1976). Reference. Cognition 4: 125–153. In the tradition of cross-language description, the studies Brown, R. W., and E. H. Lenneberg. (1954). A study of language of the 1950s and 1960s likewise reflected the dominance of and cognition. Journal of Abnormal and Social Psychology 49: radical linguistic relativism (Ray 1952; Conklin 1955; Glea- 454–462. son 1961: 4). These studies sought to discover and celebrate Collier, G. A. (1973). Review of Basic Color Terms. Language 49: 245–248. the differences among color lexicons. In 1969, using the orig- Conklin, H. C. (1955). Hanunóo color categories. Southwestern inal stimulus set of Lenneberg and Roberts (1956), Berlin Journal of Anthropology 11: 339–344. 
and Kay compared the denotation of basic color terms in De Valois, R. L., I. Abramov, and G. H. Jacobs. (1966). Analysis twenty languages and, based on these findings, examined of response patterns of LGN cells. Journal of the Optical Soci- descriptions of seventy-eight additional languages from the ety of America 56: 966–977. literature. They reported that there are universals in the Durbin, M. (1972). Review of Basic Color Terms. Semiotica 6: semantics of color: the major color terms of all languages are 257–278. focused on one of eleven landmark colors. Further, they pos- Gleason, H. A. (1961). An Introduction to Descriptive Linguistics. tulated an evolutionary sequence for the development of New York: Holt, Rinehart and Winston. color lexicons according to which black and white precede Heider, E. R. (1972). Universals in color naming and memory. Journal of Experimental Psychology 93: 1–20. red, red precedes green and yellow, green and yellow precede Hickerson, N. (1971). Review of Basic Color Terms. International blue, blue precedes brown, and brown precedes purple, pink, Journal of American Linguistics 37: 257–270. orange and gray. These results were challenged on experi- Kay, P. (1975). Synchronic variability and diachronic change in mental grounds, mostly by anthropologists (e.g., Hickerson basic color terms. Language in Society 4: 257–270. 1971; Durbin 1972; Collier 1973), and largely embraced by Kay, P., B. Berlin, L. Maffi, and W. Merrifield. (1997). Color nam- psychologists (e.g., Brown 1976; Miller and Johnson-Laird ing across languages. In C. L. Hardin and L. Maffi, Eds., Color 1976; Ratliff 1976). A number of field studies stimulated by Categories in Thought and Language. Cambridge: Cambridge Berlin and Kay tended to confirm the main lines of the uni- University Press. versal and evolutionary theory, while leading to reconceptu- Kay, P., and W. M. Kempton. (1984). 
What is the Sapir-Whorf alization of the encoding sequence (Berlin and Berlin 1975; hypothesis? American Anthropologist 86: 65–79. Kay, P., and C. K. McDaniel. (1978). The linguistic significance of Kay 1975). Based on earlier, unpublished work of Chad K. the meanings of basic color terms. Language 54: 610–646. McDaniel, which established the identity of some of the uni- Lenneberg, E. H., and J. M. Roberts. (1956). The language of versal semantic foci of Berlin and Kay with the psychophysi- experience: A study in methodology. Memoir 13 of Interna- cally determined unique hues (see also in this connection tional Journal of American Linguistics. Miller and Johnson-Laird 1976: 342–355; Ratliff 1976; Lucy, J. A. (1997). The linguistics of color. In C. L. Hardin and L. Zollinger 1979), Kay and McDaniel (1978) again reconcep- Maffi, Eds., Color Categories in Thought and Language. Cam- tualized the encoding sequence, introducing the notion of bridge: Cambridge University Press. fuzzy set into a formal model of color lexicons, and empha- Lucy, J. A., and R. A. Shweder. (1979). The effect of incidental sized the role in early systems of categories representing conversation on memory for focal colors. American Anthropol- fuzzy unions of Hering primaries. Kay and McDaniel also ogist 90: 923–931. MacLaury, R. E. (1997). Color and Cognition in Mesoamerica: related the universal semantics of color to the psychophysical Constructing Categories as Vantages. Austin: University of and neurophysiological results of Russel De Valois and his Texas Press. associates (e.g., De Valois, Abramov, and Jacobs 1966). Miller, G. A., and P. Johnson-Laird. (1976). Language and Percep- Since 1978, two important surveys of color lexicons have tion. Cambridge, MA: Harvard University Press. been conducted, both supporting the two broad Berlin and Ratliff, F. (1976). On the psychophysiological bases of universal Kay hypotheses of semantic universals and evolutionary color terms. 
Proceedings of the American Philosophical Soci- sequence in the lexical encoding of colors: the World Color ety 120: 311–330. Survey (Kay et al. 1997) and the Mesoamerican Color Sur- Ray, V. (1952). Techniques and problems in the study of human vey (MacLaury 1997). Relativist objection to the Berlin and color perception. Southwestern Journal of Anthropology 8: Kay paradigm of research on color categorization has con- 251–259. Sapir, E. (1921). Language. New York: Harcourt, Brace. tinued, emphasis shifting away from criticism of the rigor Saunders, B. A. C., and J. van Brakel. (1997). Are there non-trivial with which the Berlin and Kay procedures of mapping constraints on colour categorization? Brain and Behavioral Sci- words to colors were applied toward challenging the legiti- ences. macy of any such procedures (e.g., Lucy 1997; Saunders Stefflre, V., V. Castillo Vales, and L. Morley. (1966). Language and and van Brake l997) cognition in Yucatan: A cross-cultural replication. Journal of See also CATEGORIZATION; COLOR VISION; COLOR, NEU- Personality and Social Psychology 4: 112–115. ROPHYSIOLOGY OF; CULTURAL RELATIVISM; LINGUISTIC Whorf, B. L. (1956, 1940). Science and linguistics. In J. B. Carroll, RELATIVITY Ed., Language, Thought and Reality: The Collected Papers of Benjamin Lee Whorf. Cambridge, MA: MIT Press. Originally —Paul Kay published in Technology Review 42: 229–231, 247–248. Color, Neurophysiology of 145 Color, Neurophysiology of Zollinger, H. (1979). Correlations between the neurobiology of colour vision and the psycholinguistics of colour naming. Experientia 35: 1–8. Color vision is our ability to distinguish and classify lights Further Readings of different spectral distributions. The first requirement of color vision is the presence in the RETINA of different photo- Berlin, B., P. Kay, and W. R. Merrifield. (1985). Color term evolu- receptors with different spectral absorbances. The second tion: recent evidence from the World Color Survey. 
Paper pre- requirement is the presence in the retina of postreceptoral sented at the 84th Meeting of the American Anthropological neuronal mechanisms that process receptor outputs to pro- Association, Washington, D. C. duce suitable chromatic signals to be sent to the VISUAL Bornstein, M. H. (1973a). The psychophysiological component of cultural difference in color naming and illusion susceptibility. CORTEX. The third requirement is a central mechanism that Behavioral Science Notes 1: 41–101. transforms the incoming chromatic signals into the color Bornstein, M. H. (1973b). Color vision and color naming: A psy- space within which the normal observer maps his sensa- chophysiological hypothesis of cultural difference. Psychologi- tions. A good deal is known of the neurophysiology and cal Bulletin 80: 257–285. anatomy of the first two requirements, but very little is Burnham, R. W., and J. R. Clark. (1955). A test of hue memory. known of the third. Journal of Applied Psychology 39: 164–172. The visible range of light extends over a wavelength Collier, G. A. (1976). Further evidence for universal color catego- band from 400 to 700 nanometers (nm), from violet to red. ries. Language 52: 884–890. Color vision has evolved in a number of species, including De Valois, R. L., and G. H. Jacobs. (1968). Primate color vision. bees, fish, and birds, but among mammals only primates Science 162: 533–540. De Valois, R. L., H. C. Morgan, M. C. Polson, W. R. Mead, and E. appear to use color as a major tool in their processing of M. Hull. (1974). Psychophysical studies of monkey vision. I: visual scenes. Most natural colors, as opposed to those in Macaque luminosity and color vision tests. Vision Research 14: the laboratory, are spectrally broad-band. There are many 53–67. systems for specifying their spectral composition, ranging Dougherty, J. W. D. (1975). A Universalist Analysis of Variation from simply the physical distribution across the spectrum, and Change in Color Semantics. 
Ph.D. diss., University of Cal- through industry standards based on descriptions of spectral ifornia, Berkeley. mixtures as functions of three variables (for the three differ- Dougherty, J. W. D. (1977). Color categorization in West Futunese: ent photopigments in the human eye), to color spaces that variability and change. In B.G. Blount and M. Sanches, Eds., attempt to reproduce the way in which we perceptually Sociocultural Dimensions of Language Change. New York: order colors (Wyszecki and Stiles 1982). For a cognitive Plenum, pp. 133–148. Hage, P., and K. Hawkes. (1975). Binumarin color categories. Eth- psychologist, there are two different aspects of color vision: nology 24: 287–300. we are not only very good at distinguishing and classifying Hardin, C. L. (1988). Color for Philosophers. Indianapolis: colors (under optimal conditions wavelength differences of Hackett. just a few nanometers can be detected), but also color vision Hardin, C. L., and L. Maffi, Eds. (1997). Color Categories in is very important in visual segmentation and OBJECT REC- Thought and Language. Cambridge: Cambridge University OGNITION. Press. Humans and Old World primates possess three different Heider, E. R. (1972). Probabilities, sampling and the ethno- photoreceptor, or cone, types in the retina. A fourth type of graphic method: The case of Dani colour names. Man 7: 448– photoreceptor, rods, is concerned with vision at low light lev- 466. els and plays little role in color vision. The idea of three dif- Heider, E. R., and D. C. Olivier. (1972). The structure of the color space for naming and memory in two languages. Cognitive Psy- ferent cone types in human retina was derived from the chology 3: 337–354. empirical finding that any color can be matched by a mixture Kay, P., B. Berlin, and W. Merrifield. (1991). Biocultural implica- of three others. It was first proposed by Thomas Young in tions of systems of color naming. 
Journal of Linguistic Anthro- 1801, and taken up by Hermann von HELMHOLTZ later in the pology 1: 12–25. nineteenth century. The Young-Helmholtz theory came to be Kuschel, R., and T. Monberg. (1974). “We don’t talk much about generally accepted, and the spectral absorbances of the three colour here”: A study of colour semantics on Bellona Island. photopigments were determined through psychophysical Man 9: 213–242. measurements. However, it is only in recent years that it has Landar, H. J., S. M. Ervin, and A. E. Horrowitz. (1960). Navajo become possible to directly demonstrate the presence of the color categories. Language 36: 368–382. three cones, either by measuring their spectral absorptions Lenneberg, E. H. (1961). Color naming, color recognition, color discrimination: A reappraisal. Perceptual and Motor Skills 12: (Bowmaker 1991) or their electrical responses (Baylor, 375–382. Nunn, and Schnapf 1987). The spectral absorbances peak MacLaury, R. E. (1987). Coextensive semantic ranges: different close to 430, 535, and 565 nm, and preferred designations names for distinct vantages of one category. Papers from the are short (S), middle (M), or long (L) wavelength cones, 23rd Annual Meeting of the Chicago Linguistic Society, Part I, rather than color names such as blue, green, and red. It is 268–282. important to realize that a single cone cannot deconfound Shepard, R. (1992). The perceptual organization of colors. In J. intensity and wavelength; in order to distinguish colors, it is Barkow, L. Cosmides, and J. Tooby, Eds., The Adapted Mind. necessary to compare the outputs of two or more cone types. Oxford: Oxford University Press. There has been much recent interest in the molecular Zollinger, H. (1976). A linguistic approach to the cognition of genetics of the photopigments in the cones, especially in the colour vision. Folia Linguistica 9: 265–293. 
146 Color, Neurophysiology of reason why about 10 percent of the male population suffer opponent space has different spectral properties compared from some degree of red-green color deficiency (Mollon to the cone difference signals that leave the retina, and that 1997). The amino acids in the photopigment molecule that we use to distinguish small color differences. Put another are responsible for pigment spectral tuning have been iden- way, the unique hues do not map directly onto the cone- tified. A change in just a single amino acid can be detectable opponent signals emanating from the retina. Transformation in an individual’s color matches; it is rare for such a small of retinal signals to produce our perceptual, opponent color molecular change to give rise to a measurable behavioral space can be modeled (Valberg and Seim 1991; DeValois effect. and DeValois 1993), but how this occurs neurophysiologi- The spectral absorbances of the photopigments are cally is unknown; it is even uncertain whether we should broad, and this means that the signals from the different look for single cells in the cortex that code for Hering’s cones are highly correlated, especially with the broad-band opponent processes or unique hues, or whether these are an spectra of the natural environment. This correlation gives emergent property of the cortical cell network (Mollon and rise to redundancy in the pattern of cone signals, and so Jordan 1997). postreceptoral mechanisms in the retina add or subtract It is generally thought that a high concentration of color- cone outputs to code spectral distributions more efficiently, specific neurons are found in the cytochrome oxidase blobs before signals are passed up the optic tract to the CEREBRAL in primary visual cortex (see Merigan and Maunsell 1993 for review). Some of these cells resemble retinal ganglion CORTEX (Buchsbaum and Gottschalk 1983). 
Again, the first cells in their spectral properties; others do not (Lennie, suggestion of such mechanisms came in the nineteenth cen- Krauskopf, and Sclar 1990). These color-specific signals are tury, from Oswald Hering. He proposed that there were then passed on to the temporal visual processing stream. black-white, red-green, and blue-yellow opponent pro- Color signals do not seem to flow into the parietal visual cesses in human vision, based on the impossibility of con- processing stream, which has much to do with motion per- ceiving of reddish-green or bluish-yellow hues, whereas ception and is dominated by the magnocellular pathway. reddish-yellow is an acceptable color. Red, green, blue, and Many cognitive aspects of COLOR VISION emerge at the yellow are called the unique hues; it is possible to pick a wavelength that is uniquely yellow, without a reddish or cortical level. For example, so-called surface colors depend greenish component. on the context in which they are seen; the color brown Three main VISUAL PROCESSING STREAMS leave the pri- depends on a given spectral composition being set in a mate retina to reach the cortex via the lateral geniculate brighter surround. If the same spectral composition were nucleus. Each of them is associated with a very specific reti- viewed in a black surround, it would appear yellowish. The nal circuitry, with signals being added or subtracted very limited range of colors that can be seen in isolation, in a dark soon after they have left the cones (Lee and Dacey 1997). surround, are called aperture colors. Another important One carries summed signals of the M- and L-cones; it is emergent feature is color constancy, our ability to correctly thought to form the basis of a luminance channel of PSYCHO- identify the color of objects despite wide changes in the PHYSICS and is heavily involved in flicker and MOTION, PER- spectral composition of illumination, which of course CEPTION OF. 
It originates in a specific ganglion cell class, the changes the spectral reflectance of light from a surface. parasol cells, and passes through the magnocellular layers of Color constancy is not perfect, and is a complex function of the lateral geniculate on the way to the cortex. A second the spectral and spatial characteristics of a scene (Pokorny, channel carries the difference signal between the M- and L- Shevell, and Smith 1991). It has been proposed that color cones, and forms the basis of a red-green detection channel constancy emerges in a secondary visual cortical area, V4 of psychophysics. It begins in the midget ganglion cells of (Zeki 1980, 1983), but this remains controversial. Lastly, the retina and passes through the parvocellular layers of the color also plays an important role in many higher order lateral geniculate. The third channel carries the difference visual functions, such as SURFACE PERCEPTION or object signal between the S-cones and a sum of the other two, and it identification. The neurophysiological correlates in the cor- is the basis of a blue-yellow detection mechanism. It begins tex of these higher order color vision functions are unknown, in the small bistratified ganglion cells and passes through and are likely to be very difficult to ascertain. It seems prob- intercalated layers in the lateral geniculate. These red-green able that many of these functions are distributed through cor- and blue-yellow mechanisms account well for our ability to tical neuronal networks, rather than existing as specific and detect small differences between colors. Neurophysiology well-defined chromatic channels, as in the retina. and anatomy of these peripheral chromatic mechanisms have See also COLOR CATEGORIZATION; ILLUSIONS; PHYSI- been established in some detail (Kaplan, Lee, and Shapley CALISM; QUALIA 1990; Lee 1996). It is thought that the S-cone and blue-yel- —Barry B. 
Lee low system is phylogenetically more ancient than the red- green one, and it is present in most mammals (Mollon 1991). References Separate M- and L-cones and the red-green mechanism have evolved only in primates. Baylor, D. A., B. J. Nunn, and J. L. Schnapf. (1987). Spectral sen- The lateral geniculate nucleus is not thought to substan- sitivity of cones of the monkey Macaca Fascicularis. Journal of tially modify color signals from the retina, but what happens Physiology 390: 145–160. in the cerebral cortex is uncertain. It is possible to determine Bowmaker, J. K. (1991). Visual pigments and color vision in pri- the spectral sensitivity of the opponent processes of Hering mates. In A. Valberg and B. B. Lee, Eds., From Pigments to Perception. New York: Plenum, pp. 1–10. (Hurvich 1981), and it turns out that our perceptual color Color Vision 147 In everyday use, we describe color appearance using Buchsbaum, G., and A. Gottschalk. (1983). Trichromacy, oppo- nent colours coding and optimum colour information transmis- simple color names, such as “red,” “green,” and “blue.” sion in the retina. Proceedings of the Royal Society of London B There is good agreement across observers and cultures 220: 89–113. about the appropriate color name for most stimuli (see DeValois, R. L., and K. K. DeValois. (1993). A multi-stage color COLOR CATEGORIZATION). Technical and scientific use model. Vision Research 33: 1053–1065. requires more precise terms. The purpose of a color order Hurvich, L. M. (1981). Color Vision. Sunderland, MA: Sinauer system is to connect physical descriptions of color stimuli Associates. with their appearance (see Derefeldt 1991; Wyszecki and Kaplan, E., B. B. Lee, and R. M. Shapley. (1990). New views of pri- Stiles 1982). Each color order system specifies a lexicon for mate retinal function. Progress in Retinal Research 9: 273–336. color. Observers then scale (see PSYCHOPHYSICS) a large Lee, B. B. (1996). Receptive fields in primate retina. 
Vision number of calibrated color stimuli using this lexicon, and Research 36: 631–644. Lee, B. B., and D. M. Dacey. (1997). Structure and function in pri- from such data the relation between stimuli and names is mate retina. In C.R. Cavonius, Ed., Color Vision Deficiencies determined. Examples of color order systems include the XIII. Dordrecht, Holland: Kluwer, pp. 107–118. Munsell Book of Color and the Swedish Natural Color Sys- Lennie, P., J. Krauskopf, and G. Sclar. (1990). Chromatic mecha- tem (NCS). Note that color order systems do not address the nisms in striate cortex of macaque. Journal of Neuroscience 10: problem of how context affects color appearance. Rather, 649–669. each system specifies a particular configuration in which Merigan, W. H., and J. H. R. Maunsell. (1993). How parallel are stimuli should be viewed if the system is to apply. the primate visual pathways? Annual Review of Neuroscience People are sensitive to wavelengths between 400 nano- 16: 369–402. meters and 700 nanometers (nm); hence this region is called Mollon, J. D. (1991). Uses and evolutionary origins of primate the visible spectrum. The chromatic properties of light are color vision. In J. R. Cronly-Dillon and R. L. Gregory, Eds., Evolution of the Eye and Visual System. London: Macmillan, specified by how much energy the light contains at every pp. 306–319. wavelength in the visible spectrum. This specification is Mollon, J. D. (1997). . . . aus dreyerly Arten von Membranen oder called the light’s spectral power distribution. Color vision is Molekülen: George Palmer’s legacy. In C. R. Cavonius, Ed., mediated by three classes of cone photoreceptor. Each class Color Vision Deficiencies XIII. Dordrecht, Holland: Kluwer. of cones is characterized by its spectral sensitivity, which Mollon, J. D., and G. Jordan. (1997). On the nature of unique hues. specifies how strongly that class responds to light energy at In C. Dickinson, I. Murray, and D. 
Carden, Eds., John Dalton’s different wavelengths; the three classes of cones have their Colour Vision Legacy. London: Taylor and Francis, pp. 381– peak sensitivities in different regions of the visible spec- 392. trum, roughly at long (L), middle (M), and short (S) wave- Pokorny, J., S. K. Shevell, and V. C. Smith. (1991). Colour appear- lengths. How the cones encode information about the ance and colour constancy. In P. Gouras, Ed., Vision and Visual Dysfunction, vol. 6: The Perception of Colour. London: Mac- spectral power distribution of light is discussed in RETINA millan, pp. 43–61. and COLOR, NEUROPHYSIOLOGY OF (see also Wandell 1995). Valberg, A., and T. Seim. (1991). On the physiological basis of Physically different lights can produce identical re- higher color metrics. In A. Valberg and B. B. Lee, Eds., From sponses in all three classes of cones. Such lights, called Pigments to Perception. New York: Plenum, pp. 425–436. metamers, are indistinguishable to the visual system (see Wyszecki, G., and W. S. Stiles. (1982). Color Science—Concepts Wyszecki and Stiles 1982; Brainard 1995). It is possible to and Methods, Quantitative Data and Formulae. Second edition. construct metamers by choosing three primary lights and al- New York: Wiley. lowing an observer to mix them in various proportions. If Zeki, S. (1980). The representation of colors in the cerbral cortex. the primaries are well chosen, essentially any other light can Nature 284: 412–418. be matched through their mixture. This fact is known as the Zeki, S. (1983). Colour coding in the cerebral cortex: The reaction of cells in monkey visual cortex to wavelengths and colors. trichromacy of human color vision (see Wyszecki and Stiles Neuroscience 9: 741–765. 1982; Wandell 1995; Kaiser and Boynton 1996). Trichro- macy facilitates most color reproduction technologies. 
Color Vision

Color vision is the ability to detect and analyze changes in the wavelength composition of light. As we admire a rainbow, we perceive different colors because the light varies in wavelength across the width of the bow.

An important goal of color science is to develop PSYCHOLOGICAL LAWS that allow prediction of color appearance from a physical description of stimuli. General laws have been elusive because the color appearance of a stimulus is strongly affected by the context in which it is seen. One well-known example is simultaneous color contrast. Color plates illustrating the color contrast can be found in most perception textbooks (e.g., Wandell 1995; Goldstein 1996; see also Evans 1948; Albers 1975; Wyszecki 1986).

Color television, for example, produces colors by mixing light emitted by just three phosphors, and color printers use only a small number of inks (see Hunt 1987).

Some people are color blind, usually because they lack one or more classes of cone. Individuals missing one class of cone are called dichromats. Consider a pair of lights which produce the same responses in the M- and S-cones but a different response in the L-cones. A normal trichromat will have no difficulty distinguishing the lights because of their different L-cone response. To a dichromat with no L-cones, however, the two lights will appear identical. Thus dichromats confuse lights that trichromats distinguish. In the most common forms of dichromacy, either the L- or M-cones are missing. This is often called red-green color blindness because of consequent confusions between reds and greens. Red-green dichromacy occurs more frequently in males (about 2% of Caucasian males) than females (about 0.03% of Caucasian females). The rate in females is lower because the genes that code the L- and M-cone photopigments are on the X-chromosome. There are other forms of color blindness and anomalous color vision (see Pokorny et al. 1979).

Different species code color differently. Many mammals are dichromats with correspondingly less acute color vision than most humans. In addition, for two species with the same number of cones, color vision can differ because the cones of each species have different spectral sensitivities. Bees, for example, are trichromatic but have cones sensitive to ultraviolet light (Menzel and Backhaus 1991). Thus bees are sensitive to differences in spectral power distributions that humans cannot perceive, and vice versa. Note that the color categories perceived by humans are unlikely to match those of other species. Behavioral studies indicate that humans and pigeons group regions of the visible spectrum quite differently (see Jacobs 1981).

A key idea about postreceptoral color vision is the idea that signals from the separate cone classes are compared in color opponent channels, one signaling redness and greenness and a second signaling blueness and yellowness (Wandell 1995; Kaiser and Boynton 1996). An informal observation that supports the idea of opponency is that we rarely experience a single stimulus as being simultaneously red and green or simultaneously blue and yellow. Quantitative behavioral and physiological evidence also supports the idea of color opponency.

Why is color vision useful? With some notable exceptions (e.g., rainbows and signal lights), we rarely use color to describe the properties of lights per se. Rather, color vision informs us about objects in the environment. First, color helps us distinguish objects from clutter: ripe fruit is easy to find because of color contrast between the fruit and leaves. Second, color tells us about the properties of objects: we avoid eating green bananas because their color provides a cue to their ripeness. Finally, color helps us identify objects: we find our car in a crowded parking lot because we know its color (for more extended discussions, see Jacobs 1981; Mollon 1989).

The spectral power distribution of the color signal reflected from an object to an observer depends on two physical factors. The first factor is the spectral power distribution of the illuminant incident on the object. This varies considerably over the course of a day, with weather conditions, and between natural and artificial illumination. The second factor is the object’s surface reflectance function, which specifies, at each wavelength, what fraction of the incident illuminant is reflected. For color to be a reliable indicator about object properties and identity, the visual system must separate the influence of the illuminant on the color signal from the influence of the surface reflectance. To the extent that the human visual system does so, we say that it is color constant.

The problem of color constancy is analogous to the problem of lightness constancy (see LIGHTNESS PERCEPTION). More generally, color constancy embodies an ambiguity that is at the core of many perceptual problems: multiple physical configurations can produce the same image (see COMPUTATIONAL VISION). Human vision has long been believed to exhibit approximate color constancy (e.g., Boring 1942; Evans 1948). Developing a quantitative account of human color constancy and understanding the computations that underlie it is an area of active current research (see Wandell 1995; Kaiser and Boynton 1996).

See also ILLUSIONS; PERCEPTUAL DEVELOPMENT; VISUAL ANATOMY AND PHYSIOLOGY

—David Brainard

References

Albers, J. (1975). Interaction of Color. New Haven: Yale University Press.
Boring, E. G. (1942). Sensation and Perception in the History of Experimental Psychology. New York: D. Appleton Century.
Brainard, D. H. (1995). Colorimetry. In M. Bass, E. Van Stryland, and D. Williams, Eds., Handbook of Optics: vol. 1. Fundamentals, Techniques, and Design, Second edition. New York: McGraw-Hill, pp. 26.1–26.54.
Derefeldt, G. (1991). Colour appearance systems. In P. Gouras, Ed., Vision and Visual Dysfunction, vol. 6: The Perception of Colour. London: Macmillan, pp. 218–261.
Evans, R. M. (1948). An Introduction to Color. New York: Wiley.
Goldstein, E. B. (1996). Sensation and Perception. Pacific Grove, CA: Brooks/Cole Publishing Company.
Hunt, R. W. G. (1987). The Reproduction of Colour. Tolworth, England: Fountain Press.
Jacobs, G. H. (1981). Comparative Color Vision. New York: Academic Press.
Kaiser, P. K., and R. M. Boynton. (1996). Human Color Vision. Washington, DC: Optical Society of America.
Menzel, R., and W. Backhaus. (1991). Colour vision in insects. In P. Gouras, Ed., Vision and Visual Dysfunction, vol. 6: The Perception of Color. London: Macmillan, pp. 262–293.
Mollon, J. D. (1989). Tho’ she kneel’d in that place where they grew. The uses and origins of primate color vision. Journal of Experimental Biology 146: 21–38.
Pokorny, J., V. C. Smith, G. Verriest, and A. J. L. G. Pinckers. (1979). Congenital and Acquired Color Vision Defects. New York: Grune and Stratton.
Wandell, B. A. (1995). Foundations of Vision. Sunderland, MA: Sinauer.
Wyszecki, G. (1986). Color appearance. In Handbook of Perception and Human Performance. New York: Wiley, pp. 9.1–9.56.
Wyszecki, G., and W. S. Stiles. (1982). Color Science—Concepts and Methods, Quantitative Data and Formulae. New York: Wiley.

Columns and Modules

The CEREBRAL CORTEX sits atop the white matter of the brain, its 2 mm thickness subdivided into about six layers. Neurons with similar interests tend to cluster. Columns are usually subdivisions at the submillimeter scale, and modules are thought to occupy the intermediate millimeter scale, between maps and columns. Empirically, a column is simply a submillimeter region where many (but not all) neurons seem to have functional properties in common. They come in two sizes, with separate organizational principles. Minicolumns are about 23–65 µm across, and there are hundreds of them inside any given 0.4–1.0 mm macrocolumn.

Each cerebral hemisphere has about 52 “areas” distinguished on the basis of differences between the thickness of their layers; on average, a human cortical area is about half the size of a business card. Though area 17 seems to be a consistent functional unit, other areas prove to contain a half-dozen distinct physiological subdivisions (“maps”) on the centimeter scale.

Both columns and modules may be regarded as the outcomes of a self-organizing tendency during development, patterns that emerge as surely as the hexagons of a honeycomb arise from the pounding of so many hemispherical bees’ heads on the soft wax of tunnel walls. The hexagonal shape is an emergent property of such competition for territory; similar competition in the cortex may continue throughout adult life, maintaining a shifting mosaic of cortical columns.

A column functionally ties together all six layers. Layers II and III can usually be lumped together, as when one talks of the superficial pyramidal neurons. But layer IV has had to be repeatedly subdivided in the visual cortex (IVa, IVb, IVcα, IVcβ). Layer IV neurons send most of their outputs up to II and III. Some superficial neurons send messages down to V and VI, though their most prominent connections (either laterally in the layers or via U-fibers in white matter) are within their own layers. Layer VI sends messages back down to the THALAMUS via the white matter, while V sends signals to other deep and distant neural structures, sometimes even the spinal cord.

For any column of cortex, the bottom layers are like a subcortical outgoing mailbox, the middle layer like an inbox, and the superficial layers somewhat like an interoffice mailbox spanning the columns and reaching out to other cortical areas (Calvin and Ojemann 1994). Indeed, Diamond (1979) argues that the “motor cortex” isn't restricted to the motor strip but is the fifth layer of the entire cerebral cortex. That's because V, whatever the area, contains neurons that at some stage of development send their outputs down to the spinal cord, with copies to the brain stem, BASAL GANGLIA, and hypothalamus. Likewise the fourth layer everywhere is the “sensory cortex,” and the second and third layers everywhere are the true “association cortex” in this view.

Minicolumns appear to be formed from dendritic bundles. Ramón y CAJAL saw connect-the-dots clusters of cell bodies, running from white matter to the cortical surface; these hair-thin columns are about 30 µm apart in human cortex. It now appears that a column is like a stalk of celery, a vertical bundle containing axons and apical dendrites from about 100 neurons (Peters and Yilmaz 1993) and their internal microcircuitry.

Macrocolumns may, in contrast, reflect an organization of the input wiring, for example, corticocortical terminations from different areas often terminate in interdigitating zones about the width of a thin pencil lead. In 1957, Mountcastle and his coworkers discovered a tendency for somatosensory strip neurons responsive to skin stimulation (hair, light touch) to alternate with those specializing in joint and muscle receptors about every 0.5 mm. It now appears that there is a mosaic organization of similar dimensions, the neurons within each macrocolumn (or “segregate”) having a receptive field optimized for the same patch of body surface (Favorov and Kelly 1994).

Hubel and Wiesel, recording in monkey visual cortex, saw curtainlike clusters (“ocular dominance columns”) that specialized in the left eye, with an adjacent cluster about 0.4 mm away specializing in the right eye. As seems appropriate for an outcome of self-organization, average size varies among individuals over 0.4–0.7 mm; those with smaller ocular dominance columns have more of them (Horton and Hocking 1996).

Orientation columns are of minicolumn dimensions, within which the neurons prefer lines and edges that are tilted about the same angle from the vertical; there are many such minicolumns specializing in various angles within an ocular dominance macrocolumn (Hubel and Wiesel 1977).

The relationships between minicolumns and macrocolumns are best seen in VISUAL CORTEX, though it may be hazardous to generalize from this because ocular dominance columns themselves are less than universal; for instance, they are not a typical feature of New World monkeys.

Color blobs are clusters of COLOR-sensitive neurons in the cortex at macrocolumnar spacing but involving only neurons of the superficial layers and not extending throughout all cortical layers, as in a proper column. Recently, recurrent excitation in the superficial layers has been identified as a coordinating (and perhaps self-organizing) principle among distant minicolumns (Calvin 1995). The superficial pyramids send myelinated axons out of the cortical layers into the white matter; their eventual targets are typically the superficial layers of other cortical areas when of the “feedback” type; when “feedforward” they terminate in IV and deep III.

But superficial pyramidal neurons also send unmyelinated collaterals sideways, with an unusual patterning that suggests a columnar organizing principle. Like an express train that skips intermediate stops, the collateral axon travels a characteristic lateral distance without giving off any terminal branches; then it produces a tight terminal cluster (see fig. 5 in Gilbert 1993). The axon may continue for many millimeters, repeating such clusters about every 0.43 mm in primary visual cortex, 0.65 mm in the secondary visual areas, 0.73 mm in sensory strip, and 0.85 mm in motor cortex of monkeys (Lund, Yoshioka, and Levitt 1993). Because of this local standard for axon length, mutual re-excitation becomes probable among some cell pairs. Macrocolumns of similar emphasis are seen to be connected by such synchronizing excitation. Calvin (1996) argues that these express connections could implement a Darwinian copying competition among Hebbian cell-assemblies on the time scale of thought and action, providing one aspect of CONSCIOUSNESS.

Though COMPUTATIONAL NEUROANATOMY has proved more complex, it has been widely expected that cerebral cortex will turn out to have circuits which, in different cortical patches, are merely repeats of a standard “modular” pattern, something like modular kitchen cabinets. Columns, barrels, blobs, and stripes have all been called modules, and the term is loosely applied to any segmentation or repeated patchiness (Purves, Riddle, and LaMantia 1992) and to a wide range of functional or anatomical collectives. The best candidate for a true module was the “hypercolumn” (Hubel and Wiesel 1977): two adjacent ocular dominance columns, each containing a full set of orientation columns, suggested similar internal wiring, whatever the patch of visual field being represented. However, newer mapping techniques have shown that ocular dominance repeats are somewhat independent of orientation column repeats (Blasdel 1992), making adjacent hypercolumns internally nonidentical, that is, not iterated circuitry. Module remains a fuzzy term for anything larger than a macrocolumn but smaller than a map—though one increasingly sees it used as a trendy word denoting any cortical specialization, for example, modules as the foundation for “multiple intelligences.”

See also CONSCIOUSNESS, NEUROBIOLOGY OF; NEURON; SELF-ORGANIZING SYSTEMS; VISUAL CORTEX, CELL TYPES AND CONNECTIONS IN; VISUAL PROCESSING STREAMS

—William H. Calvin

References

Bartfeld, E., and A. Grinvald. (1992). Relationships between orientation-preference pinwheels, cytochrome oxidase blobs, and ocular-dominance columns in primate striate cortex. Proc. Natl. Acad. Sci. USA 89: 11905–11909.
Blasdel, G. G. (1992). Orientation selectivity, preference, and continuity in monkey striate cortex. J. Neurosci. 12: 3139–3161.
Bullock, T. H. (1980). Reassessment of neural connectivity and its specification. In H. M. Pinsker and W. D. Willis, Jr., Eds., Information Processing in the Nervous System. New York: Raven Press, pp. 199–220.
Calvin, W. H. (1995). Cortical columns, modules, and Hebbian cell assemblies. In Michael A. Arbib, Ed., The Handbook of Brain Theory and Neural Networks. Cambridge, MA: Bradford Books/MIT Press, pp. 269–272.
Calvin, W. H. (1996). The Cerebral Code: Thinking a Thought in the Mosaics of the Mind. Cambridge, MA: MIT Press.
Calvin, W. H., and G. A. Ojemann. (1994). Conversations with Neil's Brain: The Neural Nature of Thought and Language. Reading, MA: Addison-Wesley.
Diamond, I. (1979). The subdivisions of neocortex: A proposal to revise the traditional view of sensory, motor, and association areas. In J. M. Sprague and A. N. Epstein, Eds., Progress in Psychobiology and Physiological Psychology 8. New York: Academic Press, pp. 1–43.
Favorov, O. V., and D. G. Kelly. (1994). Minicolumnar organization within somatosensory cortical segregates: I. Development of afferent connections. Cerebral Cortex 4: 408–427.
Gilbert, C. D. (1993). Circuitry, architecture, and functional dynamics of visual cortex. Cerebral Cortex 3: 373–386.
Goldman-Rakic, P. (1990). Parallel systems in the cerebral cortex: The topography of cognition. In M. A. Arbib and J. A. Robinson, Eds., Natural and Artificial Parallel Computation. Cambridge, MA: MIT Press, pp. 155–176.
Horton, J. C., and D. R. Hocking. (1996). Intrinsic variability of ocular dominance column periodicity in normal macaque monkeys. J. Neurosci. 16 (22): 7228–7239.
Hubel, D. H., and T. N. Wiesel. (1977). Functional architecture of macaque visual cortex. Proc. Roy. Soc. (London) 198B: 1–59.
Katz, L. C., and E. M. Callaway. (1992). Development of local circuits in mammalian visual cortex. Ann. Rev. Neurosci. 15: 31–56.
Livingstone, M. S. (1996). Oscillatory firing and interneuronal correlations in squirrel monkey striate cortex. J. Neurophysiol. 75: 2467–2485.
Livingstone, M. S., and D. H. Hubel. (1988). Segregation of form, color, movement, and depth: Anatomy, physiology, and perception. Science 240: 740–749.
Lund, J. S., T. Yoshioka, and J. B. Levitt. (1993). Comparison of intrinsic connectivity in different areas of macaque monkey cerebral cortex. Cerebral Cortex 3: 148–162.
Mountcastle, V. B. (1979). An organizing principle for cerebral function: The unit module and the distributed system. In F. O. Schmitt and F. G. Worden, Eds., The Neurosciences Fourth Study Program. Cambridge, MA: MIT Press, pp. 21–42.
Peters, A., and E. Yilmaz. (1993). Neuronal organization in area 17 of cat visual cortex. Cerebral Cortex 3: 49–68.
Purves, D., D. R. Riddle, and A-S. LaMantia. (1992). Iterated patterns of brain circuitry (or how the cortex gets its spots). Trends in the Neurosciences 15: 362–368 (see letters in 16: 178–181).
Shaw, G. L., E. Harth, and A. B. Scheibel. (1982). Cooperativity in brain function: Assemblies of approximately 30 neurons. Exp. Neurol. 77: 324–358.
White, E. L. (1989). Cortical Circuits. Boston: Birkhauser.
Yuste, R., and D. Simons. (1996). Barrels in the desert: the Sde Boker workshop on neocortical circuits. Neuron 19: 231–237.

Communication

See ANIMAL COMMUNICATION; GRICE, H. PAUL; LANGUAGE AND COMMUNICATION

Comparative Psychology

The comparative study of animal and human cognition should be an important part of cognitive science. The field of comparative psychology, however, emerged from the paradigm of BEHAVIORISM and so has not contributed greatly toward this end. The reasons for this are telling and help to explicate the main directions of modern evolutionary thinking about behavior and cognition.

The general program of comparative psychology began with Charles DARWIN’s Origin of Species (1859). Darwin believed that the comparative study of animal behavior and cognition was crucial both for reconstructing the phylogenies of extant species (behavioral comparisons thus supplementing morphological comparisons) and for situating the behavior and cognition of particular species, including humans, in their appropriate evolutionary contexts. Toward these ends, Darwin (1871, 1872) reported some informal comparisons between the behavior of humans and nonhuman animals, as did his disciples Spencer (1894), Hobhouse (1901), and Romanes (1882, 1883). The goal was thus clear: to shed light on human cognition through a study of its evolutionary roots as embodied in extant animal species.

Arising as a reaction to some of the anthropomorphic excesses of this tradition was behaviorism. During the early and middle parts of the century, researchers such as Watson, Thorndike, and Tolman espoused the view that the psychology of nonhuman animals was best studied not informally or anecdotally, but experimentally in the laboratory. Within this tradition, some psychologists became interested in comparing the learning skills of different animal species in a quantitative manner, and this procedure came to be known as comparative psychology. One especially well-known series of studies was summarized by Bitterman (1965), who compared several species of insect, fish, and mammal on such things as speed to learn a simple perceptual discrimination, speed to learn a reversal of contingencies, and other discrimination learning skills. An implicit assumption of much of this work was that just as morphology became ever more complex from insect to fish to mammals to humans, so behavior should show this same “progression” (see Rumbaugh 1970 and Roitblat 1987 for more modern versions of this approach).

Comparative psychology came under attack from its inception by researchers who felt that studying animals outside of their natural ecologies, on experimental tasks for which they were not naturally adapted, was a futile, indeed a misguided, enterprise (e.g., Beach 1950; Hodos and Campbell 1969). They charged that studies such as Bitterman’s smacked of a scala naturae in which some animals were “higher” or “more intelligent” than others, with, of course, humans atop the heap. That is, many of the comparative studies of learning implicitly assumed that nonhuman animals represented primitive steps on the way to humans as evolutionary telos. This contradicted the established Darwinian fact of treelike branching evolution in which no living species was a primitive version of any other living species, but rather each species was its own telos.

Another blow to comparative psychology came from experiments such as those of Garcia and Koelling (1966), which demonstrated that different species were evolutionarily prepared to learn qualitatively different things from their species-typical environments. More generally, many studies emanating from the traditions of ETHOLOGY and behavioral ecology at this same time demonstrated that different animal species were adapted to very different aspects of the environment and therefore that comparisons along any single behavioral dimension, such as learning or intelligence, were hopelessly simplistic and missed the essential richness of the behavioral ecology of organism-environment interactions (see Eibl-Eibesfeldt 1970 for a review). Ethologists and behavioral ecologists were much less interested in finding general processes or principles that spanned all animal species than were comparative psychologists, and they were much less inclined to treat human beings as any kind of special species in the evolutionary scheme of things.

Today, most scientists who study animal behavior have incorporated the insights of the ethologists and behavioral ecologists into their thinking so that it would currently be difficult to locate any individuals who call themselves comparative psychologists in the classic meaning of the term (see Dewsbury 1984a, 1984b for a slightly different perspective). However, there does exist a journal called the Journal of Comparative Psychology, and many important studies of animal behavior are published there—mostly experimental studies of captive animals (as opposed to ethological studies, which are more often naturalistic). In contrast to the classic, behavioristic form of comparative psychology, modern comparative studies pay much more attention to the particular cognitive skills of particular species and how these are adapted to particular aspects of specific ecological niches. This enterprise is sometimes called COGNITIVE ETHOLOGY.

For these same reasons, modern comparative studies typically compare only species that are fairly closely related to one another phylogenetically—thus assuring at least some commonalities of ecology and adaptation based on their relatively short times as distinct species. As one example, in the modern study of primate cognition there are currently debates over possible differences between Old World monkeys and apes, whose common ancestor lived about 20 to 30 million years ago. Some researchers claim that monkeys live in an exclusively sensori-motor world of the here-and-now and that only apes have cognitive representations of a humanlike nature (Byrne 1995). Other researchers claim that all nonhuman primates cognitively represent their worlds for purposes of foraging and social interaction, but that only humans employ the forms of symbolic representation that depend on culture, intersubjectivity, and language (Tomasello and Call 1997). These kinds of theoretical debates and the research they generate employ the comparative method, but they do so in much more ecologically and evolutionarily sensitive ways than most of the debates and research in classical comparative psychology.

Comparative studies, in the broad sense of the term, are important for cognitive science in general because: (1) they document something of the range of cognitive skills that have evolved in the natural world and how these work; (2) they help to identify the functions for which particular cognitive skills have evolved, thus specifying an important dimension of their nature; and (3) they situate the cognition of particular species, including humans, in their appropriate evolutionary contexts, which speaks directly to such crucial questions as the ontogenetic mechanisms by which cognitive skills develop in individuals.

See also ADAPTATION AND ADAPTATIONISM; ECOLOGICAL VALIDITY; EVOLUTIONARY PSYCHOLOGY; PRIMATE COGNITION

—Michael Tomasello

References

Beach, F. (1950). The snark was a boojum. American Psychologist 5: 115–124.
Bitterman, M. (1965). Phyletic differences in learning. American Psychologist 20: 396–410.
Byrne, R. W. (1995). The Thinking Ape. Oxford: Oxford University Press.
Darwin, C. (1859). On the Origin of Species by Means of Natural Selection. London: John Murray.
Darwin, C. (1871). The Descent of Man and Selection in Relation to Sex. London: John Murray.
Darwin, C. (1872). The Expression of Emotions in Man and Animals. London: John Murray.
Dewsbury, D., Ed. (1984a). Foundations of Comparative Psychology. New York: Van Nostrand.
Dewsbury, D. (1984b). Comparative Psychology in the Twentieth Century. Stroudsburg, PA: Hutchinson Ross.
Eibl-Eibesfeldt, I. (1970). Ethology: The Biology of Behavior. New York: Holt, Rinehart, Winston.
Garcia, J., and R. Koelling. (1966). The relation of cue to consequent in avoidance learning. Psychonomic Science 4: 123–124.
Hobhouse, L. T. (1901). Mind in Evolution. London: Macmillan.
Hodos, W., and C. B. G. Campbell. (1969). Scala naturae: Why there is no theory in comparative psychology. Psychological Review 76: 337–350.
Roitblat, H. L. (1987). Introduction to Comparative Cognition. New York: W. H. Freeman and Company.
Romanes, G. J. (1882). Animal Intelligence. London: Kegan Paul, Trench and Co.
Romanes, G. J. (1883). Mental Evolution in Animals. London: Kegan Paul, Trench and Co.
Rumbaugh, D. M. (1970). Learning skills of anthropoids. In L. A. Rosenblum, Ed., Primate Behavior: Developments in Field and Laboratory Research. New York: Academic Press, pp. 1–70.
Spencer, H. (1894). Principles of Psychology. London: Macmillan.
Tomasello, M., and J. Call. (1997). Primate Cognition. Oxford University Press.

Competence/Performance Distinction

See INTRODUCTION: LINGUISTICS AND LANGUAGE; LINGUISTICS, PHILOSOPHICAL ISSUES; PARAMETER-SETTING APPROACHES TO ACQUISITION, CREOLIZATION, AND DIACHRONY

Competition

See COOPERATION AND COMPETITION; GAME THEORY

Competitive Learning

See UNSUPERVISED LEARNING

Compliant Control

See CONTROL THEORY; MANIPULATION AND GRASPING

Compositionality

Compositionality, a guiding principle in research on the SYNTAX-SEMANTICS INTERFACE of natural languages, is typically stated as follows: “The meaning of a complex expression is a function of the meanings of its immediate syntactic parts and the way in which they are combined.” It says, for example, that the meaning of the sentence

[S [NP Zuzana] [VP [V owns] [NP a schnauzer]]],

where the commonly assumed syntactic structure is indicated by brackets, can be derived from the meanings of the NP Zuzana and the VP owns a schnauzer, and the fact that this NP and VP are combined to form a sentence. In turn, the meaning of owns a schnauzer can be derived from the meanings of owns and a schnauzer and the fact that they form a VP; hence, the principle of compositionality applies recursively. The principle is implicit in the work of Gottlob FREGE (1848–1920), and was explicitly assumed by Katz and Fodor (1963) and in the work of Richard Montague and his followers (cf. Dowty, Wall, and Peters 1981).

In some form, compositionality is a virtually necessary principle, given the fact that natural languages can express an infinity of meanings and can be learned by humans with finite resources. Essentially, humans have to learn the meanings of basic expressions, the words in the LEXICON (in the magnitude of 10⁵), and the meaning effects of syntactic combinations (in the magnitude of 10²; see SYNTAX). With that they are ready to understand an infinite number of syntactically well-formed expressions. Thus, compositionality is necessary if we see the language faculty, with Wilhelm von Humboldt, as making infinite use of finite means. But compositionality also embodies the claim that semantic interpretation is local, or modular. In order to find out what a (possibly complex) expression A means, we just have to look at A, and not at the context in which A occurs. In its strict version, this claim is clearly wrong, and defenders of compositionality have to account for the context sensitivity of interpretation in one way or other.

There are certain exceptions to compositionality in the form stated above. Idioms and compounds are syntactically complex but come with a meaning that cannot be derived from their parts, like kick the bucket or blackbird. They have to be learned just like basic words. But compositionality does allow for cases in which the resulting meaning is due to a syntactic construction, as in the comparative construction The higher they rise, the deeper they fall. Also, it allows for constructionally ambiguous expressions like French teacher: French can be combined with teacher as a modifier (“teacher from France”), or as an argument (“teacher of French”). Even though the constituents are arguably the same, the syntactic rules by which they are combined differ, a difference that incidentally shows up in stress (see STRESS, LINGUISTIC).

A hidden assumption in the formulation of the principle of compositionality is that the ways in which meanings are combined are, in some difficult-to-define sense, “natural.” Even an idiom like red herring would be compositional if we allowed for unnatural interpretation rules like “The meaning of a complex noun consisting of an adjective and a noun is the set of objects that fall both under the meaning of the adjective and the meaning of the noun, except if the adjective is red and the noun is herring, in which case it may also denote something that distracts from the real issue.” But often we need quite similar rules for apparently compositional expressions. For example, red hair seems to be compositional, but if we just work with the usual meaning of red (say, “of the color of blood”), then it would mean something like “hair of the color of blood.” Red hair can mean that (think of a punk's hair dyed red), but typically is understood differently. Some researchers have questioned compositionality because of such context-dependent interpretations (cf. Langacker 1987). But a certain amount of context sensitivity can be built into the meaning of lexical items. For example, the context-sensitive interpretation of red can be given as: “When combined with a noun meaning N, it singles out those objects in N that appear closest to the color of blood for the human eye.” This would identify ordinary red hair when combined with hair. Of course, prototypical red hair is not prototypically red; see Kamp and Partee (1995) for a discussion of compositionality and prototype theory.

Another type of context-sensitive expression that constitutes a potential problem for compositionality is pronouns. A sentence like She owns a schnauzer may mean different things in different contexts, depending on the antecedent of she. The common solution is to bring context into the formulation of the principle, usually by assuming that “meanings” are devices that change contexts by adding new information (as in models of DYNAMIC SEMANTICS, cf. Heim 1982; Groenendijk and Stokhof 1991). In general, compositionality has led to more refined ways of understanding MEANING (cf. e.g., FOCUS).

In the form stated above, compositionality imposes a homomorphism between syntactic structure and semantic interpretation: syntactic structure and semantic interpretation go hand in hand. This has led to a sophisticated analysis of the meaning of simple expressions. For example, while logic textbooks will give as a translation of John and Mary came a formula like C(j) ∧ C(m), it is obvious that the structures of these expressions are quite different—the syntactic constituent John and Mary does not correspond to any constituent in the formula. But we can analyze John and Mary as a QUANTIFIER, λX[X(j) ∧ X(m)], that is applied to came, C, and thus gain a structure that is similar to the English sentence. On the other hand, compositionality may impose certain restrictions on syntactic structure. For example, it favors the analysis of relative clauses as noun modifiers, [every [girl who came]], over the analysis as NP modifiers [[every girl] [who came]], as only the first allows for a straightforward compositional interpretation (cf. von Stechow 1980).

Compositionality arguments became important in deciding between theories of interpretation. In general, semantic theories that work with a representation language that allows for unconstrained symbolic manipulation (such as Discourse Representation Theory—Kamp 1981; Dynamic Semantics, or Conceptual Semantics—Jackendoff 1990) give up the ideal of compositional interpretation. But typically, compositional reformulations of such analyses are possible.

See also DISCOURSE; SEMANTICS

—Manfred Krifka

References

Dowty, D., R. E. Wall, and S. Peters. (1981). Introduction to Montague Semantics. Dordrecht: Reidel.
Groenendijk, J., and M. Stokhof. (1991). Dynamic predicate logic. Linguistics and Philosophy 14: 39–100.
Heim, I. (1982). The Semantics of Definite and Indefinite Noun Phrases. Ph.D. diss., University of Massachusetts at Amherst. Published by Garland, New York, 1989.
Jackendoff, R. (1990). Semantic Structures. Cambridge, MA: MIT Press.
Janssen, T. M. V. (1993). Compositionality of meaning. In R. E. Asher, Ed., The Encyclopedia of Language and Linguistics. Oxford: Pergamon Press, pp. 650–656.
Janssen, T. M. V. (1997). Compositionality. In J. van Benthem and A. ter Meulen, Eds., Handbook of Logic and Language. Amsterdam: Elsevier.
Kamp, H. (1981). A theory of truth and semantic representation. In J. Groenendijk et al., Eds., Formal Methods in the Study of Language. Amsterdam: Mathematical Centre.
Kamp, H., and B. Partee. (1995). Prototype theory and compositionality. Cognition 57: 129–191.
Kamp, H., and U. Reyle. (1993). From Discourse to Logic. Dordrecht: Kluwer.
Katz, J., and J. Fodor. (1963). The structure of semantic theory. Language 39: 120–210.
Landman, F., and I. Moerdijk. (1983). Compositionality and the analysis of anaphora. Linguistics and Philosophy 6: 89–114.
Langacker, R. (1987). Foundations of Cognitive Grammar. Stanford, CA: Stanford University Press.
Partee, B. H. (1984). Compositionality. In F. Landman and F. Veltman, Eds., Varieties of Formal Semantics. Dordrecht: Foris.
Partee, B. H. (1995). Lexical semantics and compositionality. In L. R. Gleitman and M. Liberman, Eds., Language. An Invitation to Cognitive Science, vol. 1. Cambridge, MA: MIT Press, pp. 311–360.
von Stechow, A. (1980). Modification of noun phrases: A challenge for compositional semantics. Theoretical Linguistics 7: 57–110.

Computation

No idea in the history of studying mind has fueled such enthusiasm—or such criticism—as the idea that the mind is a computer. This idea, known as “cognitivism” (Haugeland 1981/1998) or the COMPUTATIONAL THEORY OF MIND, claims not merely that our minds, like the weather, can be modeled on a computer, but more strongly that, at an appropriate level of abstraction, we are computers.

Whether cognitivism is true—even what it means—depends on what computation is. One strategy for answering that question is to defer to practice: to take as computational whatever society calls computational. Cognitivism’s central tenet, in such a view, would be the thesis that people share with computers whatever constitutive or essential property binds computers into a coherent class. From such a vantage point, a theory of computation would be empirical, subject to experimental evidence. That is, a theory of computation would succeed or fail to the extent that it was able to account for the machines that made Silicon Valley famous: the devices we design, build, sell, use, and maintain.

Within cognitive science, however, cognitivism is usually understood in a more specific, theory-laden way: as building in one or more substantive claims about the nature of computing. Of these, the most influential (especially in cognitive science and artificial intelligence) has been the claim that computers are formal symbol manipulators (i.e., actively embodied FORMAL SYSTEMS). In this three-part characterization, the term “symbol” is taken to refer to any causally efficacious internal token of a concept, name, word, idea, representation, image, data structure, or other ingredient that represents or carries information about something else (see INTENTIONALITY). The predicate “formal” is understood in two simultaneous ways: positively, as something like shape, form, or syntax, and negatively, as independent of semantic properties, such as reference or truth. Manipulation refers to the fact that computation is an active, embodied process—something that takes place in time. Together, they characterize computation as involving the active manipulation of semantic or intentional ingredients in a way that depends only on their formal properties. Given two data structures, for example, one encoding the fact that Socrates is a man, the other that all men are mortal, a computer (in this view) could draw the conclusion that Socrates is mortal formally

bility is primarily concerned with input-output behavior, whereas formal symbol manipulation focuses on internal mechanisms and ways of working.
The proposals are dis- (i.e., would honor what Fodor 1975 calls the “formality con- tinct in extension as well, as continuous Turing machines dition”) if and only if it could do so by reacting to the form can solve problems unsolvable by digital (discrete) comput- or shape of the implicated data structures, without regard to ers; not all digital machines (such as Lincoln Log houses) their truth, reference, or MEANING. are symbol manipulators; and information, especially in its In spite of its historical importance, however, it turns out semantic reading as a counterfactual-supporting correlation, that the formal symbol manipulation construal of computing seems unrestricted by the causal locality that constrains the is only one of half a dozen different ideas at play in present- notion of effectiveness implicit in Turing machines. day intellectual discourse. It is not only at the level of basic theory or model that Among alternatives, most significant is the body of computation affects cognitive science. Whatever their work carried on in theoretical computer science under the underlying theoretical status, it is undeniable that com- label theory of computation. This largely automata-theoretic puters have had an unprecedented effect on the human conception is based on Turing’s famous construction of a lives that cognitive science studies—both technological “Turing machine”: a finite state controller able to read, (e-mail, electronic commerce, virtual reality, on-line write, and move sequentially back and forth along an infi- communities, etc.) and intellectual (in artificial intelli- nitely long tape that is inscribed with a finite but indefinite gence projects, attempts to fuse computation and quan- number of tokens or marks. It is this Turing-based theory tum mechanics, and the like). 
It is not yet clear, however, that explains the notion of a universal computer on which how these higher level developments relate to technical the CHURCH-TURING THESIS rests (that anything that can be debates within cognitive science itself. By far the major- done algorithmically at all can be done by a computer). ity of real-world systems are written in such high-level Although the architecture of Turing machines is specific, programming languages as Fortran, C++, and Java, for Turing’s original paper (1936/1965) introduced the now example. The total variety of software architectures built ubiquitous idea of implementing one (virtual) architecture in such languages is staggering—including data and con- on top of another, and so the Church-Turing thesis is not trol structures of seemingly unlimited variety (parallel considered to be architecturally bound. The formal theory and serial, local and distributed, declarative and proce- of Turing machines is relatively abstract, and used prima- dural, sentential and imagistic, etc.). To date, however, rily for theoretical purposes: to show what can and cannot this architectural profusion has not been fully recognized be computed, to demonstrate complexity results (how within cognitive science. Internal to cognitive science, much work is required to solve certain classes of prob- the label “computational” is often associated with one lems), to assign formal semantics to programming lan- rather specific class of architectures: serial systems based guages, etc. Turing machines have also figured in cognitive on relatively fixed, symbolic, explicit, discrete, high-level science at a more imaginative level, for example in the representations, exemplified by axiomatic inference sys- classic formulation of the Turing test: a proposal that a tems, KNOWLEDGE REPRESENTATION systems, KNOWL- computer can be counted as intelligent if it is able to mimic EDGE-BASED SYSTEMS, etc. (van Gelder 1995). 
a person answering a series of typed questions sufficiently This classic symbolic architecture is defended, for exam- well to fool an observer. ple by Fodor and Pylyshyn (1988), because of its claimed Because of their prominence in theoretical computer sci- superiority in dealing with the systematicity, productivity, ence, Turing machines are often thought to capture the and compositionality of high-level thought. Its very speci- essential nature of computing—even though, interestingly, ficity, however, especially in contrast with the wide variety they are not explicitly characterized in terms of either SYN- of architectures increasingly deployed in computational TAX or formality. Nor do these two models, formal systems practice, seems responsible for a variety of self-described and Turing machines, exhaust extant ideas about the funda- “non-” or even “anti-” computational movements that have mental nature of computing. One popular third alternative is sprung up in cognitive science over the last two decades: that computation is information processing—a broad intu- connectionist or neurally inspired architectures (see COGNI- ition that can be broken down into a variety of subreadings, TIVE MODELING, CONNECTIONIST and CONNECTIONISM, including syntactic or quantitative variants (à la Shannon, PHILOSOPHICAL ISSUES); DYNAMIC APPROACHES TO COGNI- Kolmogorov, Chaitin, etc.), semantic versions (à la Dretske, TION; shifts in emphasis from computational models to cog- Barwise and Perry, Halpern, Rosenschein, etc.), and histori- nitive neuroscience; embodied and situated architectures cal or socio-technical readings, such as the notion of infor- that exploit various forms of environmental context depen- mation that undergirds public conceptions of the Internet. dence (see SITUATEDNESS/EMBEDDEDNESS); so-called com- Yet another idea is that the essence of being a computer is to plex adaptive systems and models of ARTIFICIAL LIFE, be a digital state machine. 
In spite of various proofs of motivated in part by such biological phenomena as random broad behavioral equivalence among these proposals, for mutation and evolutionary selection; interactive or even purposes of cognitive science it is essential to recognize that merely reactive robotics, in the style of Brooks (1997), Agre these accounts are conceptually distinct. The notion of for- and Chapman (1987); and other movements. These alterna- mal symbol manipulation makes implicit reference to both tives differ from the classical model along a number of syntax and semantics, for example, whereas the notion of a dimensions: (i) they are more likely to be parallel than digital machine does neither; the theory of Turing computa- serial; (ii) their ingredient structures, particularly at the Computation and the Brain 155 design stage, are unlikely to be assigned representational or Haugeland, J. (1981/1998). The nature and plausibility of cogni- tivism. Behavioral and Brain Sciences I: 215–226. Reprinted semantic content; (iii) such content or interpretation as is in J. Haugeland, Ed., Having Thought: Essays in the Metaphys- ultimately assigned is typically based on use or experience ics of Mind. Cambridge, MA: Harvard University Press, 1988, rather than pure denotation or description, and to be “non- pp. 9–45. conceptual” in flavor, rather than “conceptualist” or “sym- Turing, A. M. (1936/1965). On computable numbers, with an bolic”; (iv) they are more likely to be used to model action, application to the Entscheidungsproblem. Proceedings of the perception, navigation, motor control, bodily movement, London Mathematical Society 2nd ser. 42: 230–265. Reprinted and other forms of coupled engagement with the world, in M. Davis, Ed., The Undecidable: Basic Papers on Undecid- instead of the traditional emphasis on detached and even able Propositions, Unsolvable Problems and Computable deductive ratiocination; (v) they are much more likely to be Functions. 
Hewlett, NY: Raven Press, 1965, pp. 116–154. “real time,” in the sense of requiring a close coupling van Gelder, T. J. (1995). What might cognition be, if not computa- tion? Journal of Philosophy 91: 345–381. between the temporality of the computational process and the temporality of the subject domain; and (vi) a higher pri- ority is typically accorded to flexibility and the ability to Computation and the Brain cope with unexpected richness and environmental variation rather than to deep reasoning or purely deductive skills. Two very different insights motivate characterizing the brain Taken collectively, these changes represent a sea change as a computer. The first and more fundamental assumes that in theoretical orientation within cognitive science—often the defining function of nervous systems is representa- described in contrast with traditional “computational” tional; that is, brain states represent states of some other approaches. Without a richer and more widely agreed upon system—the outside world or the body itself—where transi- theory of computation, however, able to account for the tions between states can be explained as computational variety of real-world computing, it is unclear that these operations on representations. The second insight derives newly proposed systems should be counted as genuinely from a domain of mathematical theory that defines comput- “noncomputational.” No one would argue that these new ability in a highly abstract sense. proposals escape the COMPUTATIONAL COMPLEXITY limits The mathematical approach is based on the idea of a Tur- established by computer science’s core mathematical the- ing machine. Not an actual machine, the Turing machine is a ory, for example. Nor can one deny that real-world comput- conceptual way of saying that any well-defined function ing systems are moving quickly in many of the same could be executed, step by step, according to simple “if you directions (toward parallelism, real-time interaction, etc.). 
are in state P and have input Q then do R” rules, given enough Indeed, the evolutionary pace of real-world computing is so time (maybe infinite time; see COMPUTATION). Insofar as the relentless that it seems a safe bet to suppose that, overall, brain is a device whose input and output can be characterized society’s understanding of “computation” will adapt, as nec- in terms of some mathematical function— however compli- essary, to subsume all these recent developments within its cated—then in that very abstract sense, it can be mimicked by ever-expanding scope. a Turing machine. Because neurobiological data indicate that See also ALGORITHM; COMPUTATION AND THE BRAIN; brains are indeed cause-effect machines, brains are, in this COMPUTATIONAL NEUROSCIENCE; CHINESE ROOM ARGU- formal sense, equivalent to a Turing machine (see CHURCH- MENT; INFORMATION THEORY; TURING TURING THESIS). Significant though this result is mathemati- —Brian Cantwell Smith cally, it reveals nothing specific about the nature of mind- brain representation and computation. It does not even imply that the best explanation of brain function will actually be in References computational/representational terms. For in this abstract Agre, P., and D. Chapman. (1987). Pengi: An implementation of a sense, livers, stomachs, and brains—not to mention sieves theory of activity. In The Proceedings of the Sixth National and the solar system—all compute. What is believed to make Conference on Artificial Intelligence. American Association for brains unique, however, is their evolved capacity to represent Artificial Intelligence. Seattle: Morgan Kaufmann, pp. 268– the brain’s body and its world, and by virtue of computation, 272. to produce coherent, adaptive motor behavior in real time. Barwise, J., and J. Perry. (1983). Situations and Attitudes. Cam- Precisely what properties enable brains to do this requires bridge, MA: MIT Press. empirical, not just mathematical, investigation. Brooks, R. A. (1997). 
Intelligence without representation. Artifi- Broadly speaking, there are two main approaches to cial Intelligence 47: 139-159. In J. Haugeland, Ed., Mind Design II: Philosophy, Psychology, Artificial Intelligence. 2nd addressing the substantive question of how in fact brains rep- ed., revised and enlarged. Cambridge, MA: MIT Press, pp. resent and compute. One exploits the model of the familiar 395–420. serial, digital computer, where representations are symbols, Dretske, F. (1981). Knowledge and the Flow of Information. Cam- in somewhat the way sentences are symbols, and computa- bridge, MA: MIT Press. tions are formal rules (algorithms) that operate on symbols, Fodor, J. (1975). The Language of Thought. Cambridge, MA: MIT rather like the way that “if-then” rules can be deployed in Press. formal logic and circuit design. The second approach is Fodor, J., and Z. Pylyshyn. (1988). Connectionism and cognitive rooted in neuroscience, drawing on data concerning how the architecture: A critical analysis. In S. Pinker and J. Mehler, cells of the brain (neurons) respond to outside signals such as Eds., Connections and Symbols. Cambridge, MA: MIT Press, light and sound, how they integrate signals to extract high- pp. 3–71. 156 Computation and the Brain Figure 1. Diagram showing the major levels of organization of the nervous system. order information, and how later stage neurons interact to Those who attacked problems of cognition from the per- yield decisions and motor commands. Although both spective of neurobiology argued that the neural architecture approaches ultimately seek to reproduce input-output behav- imposes powerful constraints on the nature and range of ior, the first is more “top-down,” relying heavily on com- computations that can be performed in real time. 
They sug- puter science principles, whereas the second tends to be gested that implementation and computation were much more “bottom-up,” aiming to reflect relevant neurobiological more interdependent than Marr’s analysis presumed. For constraints. A variety of terms are commonly used in distin- example a visual pattern recognition task can be performed in guishing the two: algorithmic computation versus signal pro- about 300 milliseconds (msec), but it takes about 5–10 msec cessing; classical artificial intelligence (AI) versus for a neuron to receive, integrate, and propagate a signal to connectionism; AI modeling versus neural net modeling. another neuron. This means that there is time for no more For some problems the two approaches can complement than about 20–30 neuronal steps from signal input to motor each other. There are, however, major differences in basic output. Because a serial model of the task would require assumptions that result in quite different models and theoret- many thousands of steps, the time constraints imply that the ical foci. A crucial difference concerns the idea of levels. In parallel architecture of the brain is critical, not irrelevant. 1982, David MARR characterized three levels in nervous sys- Marr’s tripartite division itself was challenged on tems. That analysis became influential for thinking about grounds that nervous systems display not a single level of computation. Based on working assumptions in computer “implementation,” but many levels of structured organiza- science, Marr’s proposal delineated (1) the computational tion, from molecules to synapses, neurons, networks, and so level of abstract problem analysis wherein the task is decom- forth (Churchland and Sejnowski 1992; fig. 1). 
Evidence posed according to plausible engineering principles; (2) the indicates that various structural levels have important func- level of the ALGORITHM, specifying a formal procedure to tional capacities, and that computation might be carried out perform the task, so that for a given input, the correct output not only at the level of the neuron, but also at a finer grain, results; and (3) the level of physical implementation, which namely the dendrite, as well as at a larger grain, namely the is relevant to constructing a working device using a particu- network. From the perspective of neuroscience, the hard- lar technology. An important aspect of Marr’s view was the ware/software distinction did not fall gracefully onto brains. claim that a higher level question was independent of levels What, in neural terms, are representations? Whereas the below it, and hence that problems of levels 1 and 2 could be AI approach equates representations with symbols, a term addressed independently of considering details of the imple- well defined in the context of conventional computers, con- mentation (the neuronal architecture). Consequently, many nectionists realized that “symbol” is essentially undefined in projects in AI were undertaken on the expectation that the neurobiological contexts. They therefore aimed to develop a known parallel, analog, continuously adapting, “messy” new theory of representation suitable to neurobiology. Thus architecture of the brain could be ignored as irrelevant to they hypothesized that occurrent representations (those hap- modeling mind/brain function. pening now) are patterns of activation across the units in a Computation and the Brain 157 neural net, characterized as a vector, , where The main reason for adhering to a framework with com- each element in the vector specifies the level of activity in a putational resources derives from the observation that neu- unit. 
Stored representations, by contrast, are believed to rons represent various nonneural parameters, such as head depend on the configuration of weights between units. In velocity or muscle tension or visual motion, and that com- neural terms, these weights are the strength of synaptic con- plex neuronal representations have to be constructed from nections between neurons. simpler ones. Recall the example of neurons in area 7a. Despite considerable progress, exactly how brains repre- Their response profiles indicate that they represent the posi- sent and compute remains an unsolved problem. This is tion of the visual stimulus in head-centered coordinates. mainly because many questions about how neurons code and Describing causal interactions between these cells and their decode information are still unresolved. New techniques in input signals without specifying anything about representa- neuroscience have revealed that timing of neuronal spikes is tional role masks their function in the animal’s visual capac- important in coding, but exactly how this works or how tem- ity. It omits explaining how these cells come to represent porally structured signals are decoded is not understood. what they do. Note that connectionist models can be dynam- In exploring the properties of nervous systems, artificial ical when they include back projections, time constants for NEURAL NETWORKS (ANNs) have generally been more use- signal propagation, channel open times, as well as mecha- ful to neuroscience than AI models. A useful strategy for nisms for adding units and connections, and so forth. investigating the functional role of an actual neural network In principle, dynamical models could be supplemented is to train an ANN to perform a similar information process- with representational resources in order to achieve more ing task, then to analyze its properties, and then compare revealing explanations. For instance, it is possible to treat them to the real system. 
For example, consider certain neu- certain parameter settings as inputs, and the resultant attrac- rons in the parietal cortex (area 7a) of the brain whose tor as an output, each carrying some representational con- response properties are correlated with the position of the tent. Furthermore, dynamical systems theory easily handles visual stimulus relative to head-centered coordinates. Since cases where the “output” is not a single static state (the the receptor sheets (RETINA, eye muscles) cannot provide result of a computation), but is rather a trajectory or limit that information directly, it has to be computed from various cycle. Another approach is to specify dynamical subsystems input signals. Two sets of neurons project to these cells: within the larger cognitive system that function as emulators some represent the position of the stimulus on the retina, of external domains, such as the task environment (see some represent the position of the eyeball in the head. Mod- Grush 1997). This approach embraces both the representa- eling these relationships via an artificial neural net shows tional characterization of the inner emulator (it represents how the eyeball/retinal position can be used to compute the the external domain), as well as a dynamical system’s char- position of the stimulus relative to the head (see OCULOMO- acterization of the brain’s overall function. TOR CONTROL). Once trained, the network’s structure can be See also AUTOMATA; COGNITIVE MODELING, CONNEC- analyzed to determine how the computation was achieved, TIONIST; COGNITIVE MODELING, SYMBOLIC; COMPUTA- and this may suggest neural experiments (Andersen 1995; TIONAL THEORY OF MIND; MENTAL REPRESENTATION see also COMPUTATIONAL NEUROSCIENCE). —Patricia S. Churchland and Rick Grush How biologically realistic to make an ANN depends on the purposes at hand, and different models are useful for dif- References ferent purposes. 
At certain levels and for certain purposes, abstract, simplifying models are precisely what is needed. Andersen, R. A. (1995). Coordinate transformations and motor Such a model will be more useful than a model slavishly planning in posterior parietal cortex. In M. Gazzaniga, Ed., The realistic with respect to every level, even the biochemical. Cognitive Neurosciences. Cambridge, MA: MIT Press. Churchland, P. S., and T. J. Sejnowski. (1992). The Computational Excessive realism may mean that the model is too compli- Brain. Cambridge, MA: MIT Press. cated to analyze or understand or run on the available com- Grush, R. (1997). The architecture of representation. Philosophical puters. For some projects such as modeling language Psychology 10 (1): 5–25. comprehension, less neural detail is required than for other Marr, D. (1982). Vision. New York: Freeman. projects, such as investigating dendritic spine dynamics. Port, R., and T. van Gelder. (1995). Mind as Motion: Explorations Although the assumption that nervous systems compute in the Dynamics of Cognition. Cambridge, MA: MIT Press. and represent seems reasonable, the assumption is not Thelen, E., and L. B. Smith. (1994). A Dynamical Systems proved and has been challenged. Stressing the interactive Approach to the Development of Cognition and Action. Cam- and time-dependent nature of nervous systems, some bridge, MA: MIT Press. researchers see the brain together with its body and environ- Further Readings ment as dynamical systems, best characterized by systems of differential equations describing the temporal evolution Abeles, M. (1991). Corticonics: Neural Circuits of the Cerebral of states of the brain (see DYNAMIC APPROACHES TO COGNI- Cortex. Cambridge: Cambridge University Press. TION, and Port and van Gelder 1995). In this view both the Arbib, A. M. (1995). The Handbook of Brain Theory and Neural brain and the liver can have their conduct adequately Networks. Cambridge, MA: MIT Press. 
described by systems of differential equations. Especially in Boden, M. (1988). Computer Models of the Mind. Cambridge: trying to explain the development of perceptual motor skills Cambridge University Press. in neonates, a dynamical systems approach has shown con- Churchland, P. (1995). The Engine of Reason, the Seat of the Soul. siderable promise (Thelen and Smith 1994). Cambridge, MA: MIT Press. 158 Computational Complexity exponentially is to say that there will be inputs that require Koch, C., and I. Segev. (1997). Methods in Neuronal Modeling: From Synapses to Networks. 2nd ed. Cambridge, MA: MIT exponential effort. However, these could turn out to be Press. extremely rare in real life, and the vast majority may very Sejnowski, T. (1997). Computational neuroscience. Encyclopedia well be unproblematic from a resource standpoint. Thus the of Neuroscience. Amsterdam: Elsevier Science Publishers. actual tractability of a task depends crucially on the range of inputs that need to be processed. But these considerations do not undermine the impor- Computational Complexity tance of complexity. To return to the visual interpretation example, we clearly ought not to be satisfied with a theory How is it possible for a biological system to perform activi- that claims that the brain computes P, but that in certain ties such as language comprehension, LEARNING, ANALOGY, extremely rare cases, it could be busy for hours doing so. PLANNING, or visual interpretation? The COMPUTATIONAL Either such cases cannot occur at all for some reason, or the THEORY OF MIND suggests that it is possible the same way it brain is able to deal with them: it gives up, it uses heuristics, is possible for an electronic computer to sort a list of num- it makes do. 
bers, simulate a weather system, or control an elevator: the brain, it is claimed, is an organ capable of certain forms of COMPUTATION, and mental activities such as those listed above are computational ones. But it is not enough to say that a mental activity is computational to account for its physical or biological realizability; it must also be the sort of task that can be performed by the brain in a plausible amount of time.

Computational complexity is the part of computer science that studies the resource requirements in time and memory of various computational tasks (Papadimitriou 1994). Typically, these requirements are formulated as functions of the size of the input to be processed. The central tenet of computability theory (one form of the CHURCH-TURING THESIS) is that any computational task that can be performed by a physical system of whatever form, can be performed by a very simple device called a Turing machine. The central tenet of (sequential) complexity theory, then, is that any computational task that can be performed by any physical system in F(n) steps, where n is the size of the input, can be performed by a Turing machine in G(n) steps, where F and G differ by at most a polynomial. A consequence of this is that any task requiring exponential time on a Turing machine would not be computable in polynomial time by anything physical whatsoever. Because exponentials grow so rapidly, such a task would be physically infeasible except for very small inputs.

How might this argument be relevant to cognitive science? Consider a simple form of visual interpretation, for example. Suppose it is hypothesized that part of how vision works is that the brain determines some property P of the scene, given a visual grid provided by the retina as input. Let us further suppose that the task is indeed computable, but an analysis shows that a Turing machine would require an exponential number of steps to do so. This means that to compute P quickly enough on all but very small grids, the brain would have to perform an unreasonably large number of elementary operations per second. The hypothesis, then, becomes untenable: it fails to explain how the brain could perform the visual task within reasonable time bounds.

Unfortunately, this putative example is highly idealized. For one thing, computational complexity is concerned with how the demand for resources scales as a function of the size of the input. If the required input does not get very large (as with individual sentences of English, say, with a few notable exceptions), complexity theory has little to contribute. For another, to say that the resource demands scale

Either way, what the brain is actually doing is no longer computing P, but some close approximation to P that needs to be shown adequate for vision, and whose complexity in turn ought to be considered.

Another complication that should be noted is that it has turned out to be extremely difficult to establish that a computational task requires an exponential number of steps. Instead, what has emerged is the theory of NP-completeness (Garey and Johnson 1979). This starts with a specific computational task called SAT (Cook 1971), which involves determining whether an input formula of propositional logic can be made true. While SAT is not known to require exponential time, all currently known methods do require an exponential number of steps on some inputs. A computational task is NP-complete when it is as difficult as SAT: an efficient way of performing it would also lead to an efficient method for SAT, and vice versa. Thus, all NP-complete tasks, including SAT itself, and including a very large number seemingly related to assorted mental activities, are strongly believed (but are not known) to require exponential time.

The constraint of computational tractability has ended up being a powerful forcing function in much of the technical work in artificial intelligence. For example, research in KNOWLEDGE REPRESENTATION can be profitably understood as investigating reasoning tasks that are not only semantically motivated and broadly applicable, but are also computationally tractable (Levesque 1986). In many cases, this has involved looking for plausible approximations, restrictions, or deviations from the elegant but thoroughly intractable models of inference based on classical LOGIC (Levesque 1988). Indeed, practical KNOWLEDGE-BASED SYSTEMS have invariably been built around restricted forms of logical or probabilistic inference. Similar considerations have led researchers away from classical decision theory to models of BOUNDED RATIONALITY (Russell and Wefald 1991). In the area of COMPUTATIONAL VISION, it has been suggested that visual attention is the mechanism used by the brain to tame otherwise intractable tasks (Tsotsos 1995). Because the resource demands of a task are so dependent on the range of inputs, recent work has attempted to understand what it is about certain inputs that makes NP-complete tasks problematic (Hogg, Huberman, and Williams 1996). For a less technical discussion of this whole issue, see Cherniak 1986.

See also AUTOMATA; COMPUTATION AND THE BRAIN; RATIONAL AGENCY; TURING, ALAN

—Hector Levesque

References

Cherniak, C. (1986). Minimal Rationality. Cambridge, MA: MIT Press.
Cook, S. A. (1971). The complexity of theorem-proving procedures. Proceedings of the 3rd Annual ACM Symposium on the Theory of Computing, pp. 151–158.
Garey, M., and D. Johnson. (1979). Computers and Intractability: A Guide to the Theory of NP-Completeness. New York: Wiley, Freeman and Co.
Hogg, T., B. Huberman, and C. Williams. (1996). Frontiers in problem solving: Phase transitions and complexity. Artificial Intelligence 81 (1–2).
Levesque, H. (1986). Knowledge representation and reasoning. Annual Review of Computer Science 1: 255–287.
Levesque, H. (1988). Logic and the complexity of reasoning. Journal of Philosophical Logic 17: 355–389.
Papadimitriou, C. (1994). Computational Complexity. New York: Addison-Wesley.
Russell, S., and E. Wefald. (1991). Do the Right Thing. Cambridge, MA: MIT Press.
Tsotsos, J. (1995). Behaviorist intelligence and the scaling problem. Artificial Intelligence 75 (2): 135–160.

tured by the fact that the distribution P is arbitrary, and the function class F is typically too large to permit an exhaustive search for a good match to the observed data. A typical example sets F to be the class of all linear-threshold functions (perceptrons) over n-dimensional real inputs. In this case, the model would ask whether there is a learning algorithm that, for any input dimension n and any desired error ε > 0, requires a sample size and execution time bounded by fixed polynomials in n and 1/ε, and produces (with high probability) a hypothesis function h such that the probability that h(x) ≠ f(x) is smaller than ε under P. Note that we demand that the hypothesis function h generalize to unseen data (as represented by the distribution P), and not simply fit the observed training data.

The last decade has yielded a wealth of results and analyses in the model sketched above and related variants. Many of the early papers demonstrated that simple and natural learning problems can be computationally difficult for a variety of interesting reasons. For instance, Pitt and Valiant (1988) showed learning problems for which the “natural” choice for the form of the hypothesis function h leads to an NP-hard problem, but for which a more general hypothesis representation leads to an efficient algorithm.
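The learning criterion for linear-threshold functions can be made concrete with a small simulation. The following Python sketch is an illustration only and is not drawn from the works cited in this entry. It assumes a target linear-threshold function through the origin, takes P to be uniform on [-1, 1]^n, uses the classic perceptron update rule as the learner, and estimates the probability that h(x) ≠ f(x) on fresh draws from P. The names make_target, draw, perceptron, and estimate_error, and the values n = 5 and m = 400, are hypothetical choices made for the example.

```python
import random


def make_target(n, rng):
    """Hypothetical unknown target f: a random linear-threshold function."""
    w = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return lambda x: 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1


def draw(n, rng):
    """One draw from the distribution P (assumed uniform on [-1, 1]^n)."""
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]


def perceptron(sample, n, max_epochs=200):
    """Perceptron learner: add y*x to the weights after each mistake."""
    w = [0.0] * n
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in sample:
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1
            if pred != y:
                w = [wi + y * xi for wi, xi in zip(w, x)]
                mistakes += 1
        if mistakes == 0:  # the hypothesis now fits the whole sample
            break
    return lambda x: 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1


def estimate_error(h, f, n, rng, trials=5000):
    """Estimate Pr[h(x) != f(x)] under P from fresh, unseen draws."""
    wrong = sum(h(x) != f(x) for x in (draw(n, rng) for _ in range(trials)))
    return wrong / trials


rng = random.Random(0)  # fixed seed so the run is repeatable
n, m = 5, 400           # input dimension and sample size (assumed values)
f = make_target(n, rng)
sample = [(x, f(x)) for x in (draw(n, rng) for _ in range(m))]
h = perceptron(sample, n)
err = estimate_error(h, f, n, rng)
print(f"estimated error on unseen data: {err:.3f}")
```

The error is measured on points the learner never saw, which is the sense in which the hypothesis is asked to generalize rather than merely fit the training sample.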
Computational Learning Theory

Computational learning theory is the study of a collection of mathematical models of machine learning, and has among its goals the development of new algorithms for learning from data, and the elucidation of computational and information-theoretic barriers to learning. As such, it is closely related to disciplines with similar aims, such as statistics and MACHINE LEARNING (see also NEURAL NETWORKS; PATTERN RECOGNITION AND FEEDFORWARD NETWORKS; and BAYESIAN LEARNING). However, it is perhaps coarsely distinguished from these other areas by an explicit and central concern for computational efficiency and the COMPUTATIONAL COMPLEXITY of learning problems.

Most of the problems receiving the greatest scrutiny to date can be traced back to the seminal paper of L. G. Valiant (1984), where a simple and appealing model is proposed that emphasizes three important notions. First, learning is probabilistic: a finite sample of data is generated randomly according to an unknown process, thus necessitating that we tolerate some error, hopefully quantifiable, in the hypothesis output by a learning algorithm. Second, learning algorithms must be computationally efficient, in the standard complexity-theoretic sense: given a sample of m observations from the unknown random process, the execution time of a successful learning algorithm will be bounded by a fixed polynomial in m. Third, learning algorithms should be appropriately general: they should process the finite sample to obtain a hypothesis with good generalization ability under a reasonably large set of circumstances.

In Valiant’s original paper, the random process consisted of an unknown distribution or density P over an input space X, and an unknown Boolean (two-valued) target function f over X, chosen from a known class F of such functions. The finite sample given to the learning algorithm consists of pairs ⟨x, y⟩, where x is distributed according to P and y = f(x). The demand that learning algorithms be general is cap-

Kearns and Valiant (1994) exhibited close connections between hard learning problems and cryptography by showing that several natural problems, including learning finite automata and Boolean formulae, are computationally difficult regardless of the method used to represent the hypothesis.

These results demonstrated that powerful learning algorithms were unlikely to be developed within Valiant’s original framework without some modifications, and researchers turned to a number of reasonable relaxations. One fruitful variant has been to supplement the random sample given to the learning algorithm with a mechanism to answer queries—that is, rather than simply passively receiving pairs drawn from some distribution, the algorithm may now actively request the classification of any desired x under the unknown target function f. With this additional mechanism, a number of influential and elegant algorithms have been discovered, including for finite automata by Angluin (1987, 1988; to be contrasted with the intractability without the query mechanism, mentioned above) and for decision trees by Kushelevitz and Mansour (1993).

One drawback to the query mechanism is the difficulty of simulating such a mechanism in real machine learning applications, where typically only passive observations of the kind posited in Valiant’s original model are available. An alternative but perhaps more widely applicable relaxation of this model is known as the weak learning or boosting model. Here it is assumed that we already have possession of an algorithm that is efficient, but meets only very weak (but still nontrivial) generalization guarantees. This is formalized by asserting that the weak learning algorithm always outputs a hypothesis whose error with respect to the unknown target function is slightly better than “random guessing.” The goal of a boosting algorithm is then to combine the many mediocre hypotheses output by several executions of the weak learning algorithm into a single hypothesis that is much better than random guessing. This is made possible by assuming that the weak learning algorithm will perform better than random guessing on many different distributions on the inputs. Initially proposed to settle some rather theoretical questions in Valiant’s original model, the boosting framework has recently resulted in new learning algorithms enjoying widespread experimental success and influence (Freund and Schapire 1997), as well as new analyses of some classic machine learning heuristics, such as the CART and C4.5 decision tree programs (Kearns and Mansour 1996).

Although computational considerations are the primary distinguishing feature of computational learning theory, and have been the emphasis here, it should be noted that a significant fraction of the work and interest in the field is devoted to questions of a primarily statistical or information-theoretic nature. Thus, characterizations of the number of observations required by any algorithm (computationally efficient or otherwise) for good generalization have been the subject of intense and prolonged scrutiny, building on the foundational work of Vapnik (1982; see STATISTICAL LEARNING THEORY). Recent work has also sought similar characterizations without probabilistic assumptions on the generation of observations (Littlestone 1988), and has led to a series of related and experimentally successful algorithms.

A more extensive bibliography and an introduction to some of the central topics of computational learning theory is contained in Kearns and Vazirani 1994.

See also INDUCTION THEORY; INFORMATION THEORY; LEARNING SYSTEMS; STATISTICAL TECHNIQUES IN NATURAL LANGUAGE PROCESSING

—Michael Kearns

References

Angluin, D. (1987). Learning regular sets from queries and counterexamples. Information and Computation 75 (2): 87–106.
Angluin, D. (1988). Queries and concept learning. Machine Learning 2 (4): 319–342.
Blumer, A., A. Ehrenfeucht, D. Haussler, and M. Warmuth. (1989). Learnability and the Vapnik-Chervonenkis dimension. Journal of the ACM 36 (4): 929–965.
Freund, Y., and R. Schapire. (1997). A decision–theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences 55 (1): 119–139.
Kearns, M., and Y. Mansour. (1996). On the boosting ability of top-down decision tree learning algorithms. In Proceedings of the 28th Annual ACM Symposium on the Theory of Computing. Forthcoming in Journal of Computer and Systems Sciences.
Kearns, M., and L. G. Valiant. (1994). Cryptographic limitations on learning Boolean formulae and finite automata. Journal of the ACM 41 (1): 67–95.
Kearns, M., and U. Vazirani. (1994). An Introduction to Computational Learning Theory. Cambridge: MIT Press.
Kushelevitz, E., and Y. Mansour. (1993). Learning decision trees using the Fourier Spectrum. SIAM Journal on Computing 22 (6): 1331–1348.
Littlestone, N. (1988). Learning when irrelevant attributes abound: A new linear-threshold algorithm. Machine Learning 2: 285–318.
Pitt, L., and L. G. Valiant. (1988). Computational limitations on learning from examples. Journal of the ACM 35: 965–984.
Schapire, R. (1990). The strength of weak learnability. Machine Learning 5 (2): 197–227.
Valiant, L. G. (1984). A theory of the learnable. Communications of the ACM 27 (11): 1134–1142.
Vapnik, V. N. (1982). Estimation of Dependences Based on Empirical Data. New York: Springer-Verlag.

Computational Lexicons

A computational lexicon has traditionally been viewed as a repository of lexical information for specific tasks, such as parsing, generation, or translation. From this viewpoint, it must contain two types of knowledge: (1) knowledge needed for syntactic analysis and synthesis, and (2) knowledge needed for semantic interpretation. More recently, the definition of a computational lexicon has undergone major revision as the fields of COMPUTATIONAL LINGUISTICS and semantics have matured. In particular, two new trends have driven the design concerns of researchers:

• Attempts at closer integration of compositional semantic operations with the lexical information structures that bear them
• A serious concern with how lexical types reflect the underlying ontological categories of the systems being modeled

Two new approaches to modeling the structure of the LEXICON have recently emerged in computational linguistics: (1) theoretical studies of how computations take place in the mental lexicon; (2) developments of computational models of information as structured in lexical databases. The differences between the computational study of the lexicon and more traditional linguistic approaches can be summarized as follows:

Lexical representations must be explicit. The knowledge contained in them must be sufficiently detailed to support one or more processing applications.

The global structure of the lexicon must be modeled. Real lexicons are complex knowledge bases, and hence the structures relating entire words are as important as those relating components of words. Furthermore, lexical entries consisting of more than one orthographic word (collocations, idioms, and compounds) must also be represented.

The lexicon must provide sufficient coverage of its domain. Real lexicons can typically contain up to 400,000 entries. For example, a typical configuration might be: verbs (5K), nouns (30K), adjectives (5K), adverbs (<1K), logical terms (<1K), rhetorical terms (<1K), compounds (2K), proper names (300K), and various sublanguage terms.

Computational lexicons must be evaluable. Computational lexicons are typically evaluated in terms of: (i) coverage: both breadth of the lexicon and depth of lexical information; (ii) extensibility: how easily can information be added to the lexical entry? How readily is new information made consistent with the other lexical structures? (iii) utility: how useful are the lexical entries for specific tasks and applications?

Viewed independently of any specific application and evaluated in terms of its relevance to cognitive science, the recent work on computational lexicons makes several important points. The first is that the lexical and interlexical structures employed in computational studies have provided some of the most complete descriptions of the lexical bases of natural languages. Besides the broad descriptive coverage of these lexicons, the architectural decisions involved in these systems have important linguistic and psychological consequences. For example, the legitimacy and usefulness of many theoretical constructions and abstract descriptions can be tested and verified by attempting to instantiate them in as complete and robust a lexicon as possible. Of course, completeness doesn't ensure correctness nor does it ensure a particularly interesting lexicon from a theoretical point of view, but explicit representations do reveal the limitations of a given analytical framework.

Content of a Single Lexical Entry

Although there are many competing views on the exact structure of lexical entries, there are some important common assumptions about the content of a lexical entry. It is generally agreed that there are three necessary components to the structure of a lexical item: orthographic and morphological information; i.e., how the word is spelled and what forms it appears in; syntactic information; for instance, what part of speech the word is; and semantic information; i.e., what representation the word translates to.

Syntactic information may be divided into the subtypes of category and subcategory. Category information includes traditional categories such as noun, verb, adjective, adverb, and preposition. While most systems agree on these “major” categories, there are often great differences in the ways they classify “minor” categories, such as conjunctions, quantifier elements, determiners, etc.

Subcategory information is information that divides syntactic categories into subclasses. This sort of information may be usefully separated into two types, contextual features and inherent features. The former are features that may be defined in terms of the contexts in which a given lexical entry may occur. Subcategorization information marks the local legitimate context for a word to appear in a syntactic structure. For example, the verb devour is never intransitive in English and requires a direct object; hence the lexicon tags the verb with a subcategorization requiring an NP object. Another type of context encoding is collocational information, where patterns that are not fully productive in the grammar can be tagged. For example, the adjective heavy as applied to drinker and smoker is collocational and not freely productive in nature.

Inherent features are features of lexical entries that cannot, or cannot easily, be reduced to a contextual definition. They include such features as count/mass (e.g., pebble vs. water), abstract, animate, human, and so on.

Semantic information can also be separated into two subcategories, base semantic typing and selectional typing. While the former identifies the broad semantic class that a lexical item belongs to (such as event, proposition, predicate), the latter class specifies the semantic features of arguments and adjuncts to the lexical item.

Global Structure of the Lexicon

From the discussion above, the entries in a lexicon would appear to encode only concepts such as category information, selectional restrictions, number, type and case roles of arguments, and so forth. While the utility of this kind of information is beyond doubt, the emphasis on the individual entry misses out on the issue of global lexical organization. This is not to dismiss ongoing work that does focus precisely on this issue; for instance, attempts to relate grammatical alternations with semantic classes (e.g., Levin 1993).

One obvious way to organize lexical knowledge is by means of lexical inheritance mechanisms. In fact, much recent work has focused on how to provide shared data structures for syntactic and morphological knowledge (Flickinger, Pollard, and Wasow 1985). Evans and Gazdar (1990) provide a formal characterization of how to perform inferences in a language for multiple and default inheritance of linguistic knowledge. The language developed for that purpose, DATR, uses value-terminated attribute trees to encode lexical information. Taking a different approach, Briscoe, de Paiva, and Copestake (1993) describe a rich system of types for allowing default mechanisms into lexical type descriptions.

Along a similar line, Pustejovsky and Boguraev (1993) describe a theory of shared semantic information based on orthogonal typed inheritance principles, where there are several distinct levels of semantic description for a lexical item. In particular, a set of semantic roles called qualia structure is relevant to just this issue. These roles specify the purpose (telic), origin (agentive), basic form (formal), and constitution (const) of the lexical item. In this view, a lexical item inherits information according to the qualia structure it carries. In this view, multiple inheritance can be largely avoided because the qualia constrain the types of concepts that can be put together. For example, the predicates cat and pet refer to formal and telic qualia, respectively.

The Computational Lexicon as Knowledge Base

The interplay of the lexical needs of current language processing frameworks and contemporary lexical semantic theories very much influences the direction of computational dictionary analysis research for lexical acquisition. Given the increasingly prominent place the lexicon is assigned—in linguistic theories, in language processing technology, and in domain descriptions—it is no accident, nor is it mere rhetoric, that the term “lexical knowledge base” has become a widely accepted one. Researchers use it to refer to a large-scale repository of lexical information, which incorporates more than just “static” descriptions of words, for example, clusters of properties and associated values. A lexical knowledge base should state constraints on word behavior, dependence of word interpretation on context, and distribution of linguistic generalizations.

A lexicon is essentially a dynamic object, as it incorporates, in addition to its information types, the ability to perform inference over them and thus induce word meaning in context. This is what a computational lexicon is: a theoretically sound and computationally useful resource for real application tasks and for gaining insights into human cognitive abilities.

See also COMPUTATIONAL PSYCHOLINGUISTICS; CONCEPTS; KNOWLEDGE REPRESENTATION; NATURAL LANGUAGE GENERATION; NATURAL LANGUAGE PROCESSING; STATISTICAL TECHNIQUES IN NATURAL LANGUAGE PROCESSING

—James Pustejovsky

References

Briscoe, T., V. de Paiva, and A. Copestake, Eds. (1993). Inheritance, Defaults, and the Lexicon. Cambridge: Cambridge University Press.
Evans, R., and G. Gazdar. (1990). Inference in DATR. Proceedings of the Fourth European ACL Conference, April 10–12, 1989, Manchester, England.
Flickinger, D., C. Pollard, and T. Wasow. (1985). Structure–sharing in lexical representation. Proceedings of 23rd Annual Meeting of the ACL, Chicago, IL, pp. 262–267.
Grimshaw, J. (1990). Argument Structure. Cambridge, MA: MIT Press.
Guthrie, L., J. Pustejovsky, Y. Wilks, and B. Slator. (1996). The role of lexicons in natural language processing. Communications of the ACM 39:1.
Levin, B. (1993). Towards a Lexical Organization of English Verbs. Chicago: University of Chicago Press.
Miller, G. WordNet: an on-line lexical database. International Journal of Lexicography 3: 235–312.
Pollard, C., and I. Sag. (1987). Information–Based Syntax and Semantics. CSLI Lecture Notes Number 13. Stanford, CA: CSLI.
Pustejovsky, J., and P. Boguraev. (1993). Lexical knowledge representation and natural language processing. Artificial Intelligence 63: 193–223.

Further Readings

Atkins, B. (1990). Building a lexicon: Reconciling anisomorphic sense differentiations in machine-readable dictionaries. Paper presented at BBN Symposium: Natural Language in the 90s—Language and Action in the World, Cambridge, MA.
Boguraev, B., and E. Briscoe. (1989). Computational Lexicography for Natural Language Processing. Longman, Harlow and London.
Boguraev, B., and J. Pustejovsky. (1996). Corpus Processing for Lexical Acquisition. Cambridge, MA: Bradford Books/MIT Press.
Briscoe, E., A. Copestake, and B. Boguraev. (1990). Enjoy the paper: Lexical semantics via lexicology. Proceedings of 13th International Conference on Computational Linguistics, Helsinki, Finland, pp. 42–47.
Calzolari, N. (1992). Acquiring and representing semantic information in a lexical knowledge base. In J. Pustejovsky and S. Bergler, Eds., Lexical Semantics and Knowledge Representation. New York: Springer Verlag.
Copestake, A., and E. Briscoe. (1992). Lexical operations in a unification–based framework. In J. Pustejovsky and S. Bergler, Eds., Lexical Semantics and Knowledge Representation. New York: Springer Verlag.
Evens, M. (1987). Relational Models of the Lexicon. Cambridge: Cambridge University Press.
Grishman, R., and J. Sterling. (1992). Acquisition of selectional patterns. Proceedings of the 14th International Conf. on Computational Linguistics (COLING 92), Nantes, France.
Hirst, G. (1987). Semantic Interpretation and the Resolution of Ambiguity. Cambridge: Cambridge University Press.
Hobbs, J., W. Croft, T. Davies, D. Edwards, and K. Laws. (1987). Commonsense metaphysics and lexical semantics. Computational Linguistics 13: 241–250.
Ingria, R., B. Boguraev, and J. Pustejovsky. (1992). Dictionary/Lexicon. In Stuart Shapiro, Ed., Encyclopedia of Artificial Intelligence. 2nd ed. New York: Wiley.
Miller, G. (1991). The Science of Words. Scientific American Library.
Pustejovsky, J. (1992). Lexical semantics. In Stuart Shapiro, Ed., Encyclopedia of Artificial Intelligence. 2nd ed. New York: Wiley.
Pustejovsky, J. (1995). The Generative Lexicon. Cambridge, MA: MIT Press.
Pustejovsky, J., S. Bergler, and P. Anick. (1993). Lexical semantic techniques for corpus analysis. Computational Linguistics 19 (2).
Salton, G. (1991). Developments in automatic text retrieval. Science 253: 974.
Weinreich, U. (1972). Explorations in Semantic Theory. The Hague: Mouton.
Wilks, Y. (1975). An intelligent analyzer and understander for English. Communications of the ACM 18: 264–274.
Wilks, Y., D. Fass, C-M. Guo, J. McDonald, T. Plate, and B. Slator. (1989). A tractable machine dictionary as a resource for computational semantics. In B. Boguraev and E. Briscoe, Eds., Computational Lexicography for Natural Language Processing. Longman, Harlow and London, pp. 193–228.

Computational Linguistics

Computational linguistics (CL; also called natural language processing, or NLP) is concerned with (1) the study of computational models of the structure and function of language, its use, and its acquisition; and (2) the design, development, and implementation of a wide range of systems such as SPEECH RECOGNITION, language understanding and NATURAL LANGUAGE GENERATION. CL applications include interfaces to databases, text processing, and message understanding, multilingual interfaces as aids for foreign language correspondences, web pages, and speech-to-speech translation in limited domains. On the theoretical side, CL uses computational modeling to investigate syntax, semantics, pragmatics (that is, certain aspects of the relationship of the speaker and the hearer, or of the user and the system in the case of a CL system), and discourse aspects of language. These investigations are interdisciplinary and involve concepts from artificial intelligence (AI), linguistics, logic, and psychology. By connecting the closely interrelated fields of computer science and linguistics, CL plays a key role in cognitive science.

Because it is impossible to cover the whole range of theoretical and practical issues in CL in the limited space available, only a few related topics are discussed here in some detail, to give the reader a sense of important issues. Fortunately, there exists a comprehensive source, documenting all the major aspects of CL (Survey of the State of Art in Human Language Technology forthcoming).

Grammars and Parsers

Almost every NLP system has a grammar and an associated parser. A grammar is a finite specification of a potentially infinite number of sentences, and a parser for the grammar is an ALGORITHM that analyzes a sentence and, if possible, assigns one or more grammar-based structural descriptions to the sentence. The structural descriptions are necessary for further processing—for example, for semantic interpretation.

Many CL systems are based on context-free grammars (CFGs). One such grammar, G, consists of a finite set of nonterminals (for example, S: sentence; NP: noun phrase; VP: verb phrase; V: verb; ADV: adverb), a finite set of terminals, or lexical items, and a finite set of rewrite rules of the form A → W, where A is a nonterminal and W is a string of nonterminals and terminals. S is a special nonterminal called the “start symbol.”

CFGs are inadequate and need to be augmented for a variety of reasons. For one, the information associated with a phrase (a string of terminals) is not just the atomic symbols used as nonterminals, but a complex bundle of information (sets of attribute-value pairs, called “feature structures”) that needs to be associated with strings. Moreover, appropriate structures and operations for combining them are needed, together with a CFG skeleton. For another, the string-combining operation in a CFG is concatenation (for example, if u and v are strings, v concatenated with u gives the string w = uv, that is, u followed by v), and more complex string-combining as well as tree-combining operations are needed to describe various linguistic phenomena. Finally, the categories in a grammar need to be augmented by associating them with feature structures, a set of attribute-value pairs that are then combined in an operation called “unification.” A variety of grammars such as generalized phrase structure grammar (GPSG), HEAD-DRIVEN PHRASE STRUCTURE GRAMMAR (HPSG), and LEXICAL FUNCTIONAL GRAMMAR (LFG) are essentially CFG-based unification grammars (Bresnan and Kaplan 1983; Gazdar et al. 1985; Pollard and Sag 1994).

Computational grammars need to describe a wide range of dependencies among the different elements in the grammar. Some of these dependencies involve agreement features, such as person, number, and gender. For example, in English, the verb agrees with the subject in person and number. Others involve verb subcategorization, in which each verb specifies one (or more) subcategorization frames for its complements. For instance, sleep (as in Harry sleeps) does not require any complement, while like (as in Harry likes peanuts) requires one complement; give (as in Harry gives Susan a flower) requires two complements; and so forth. Sometimes the dependent elements do not appear in their normal positions. In Who_i did John invite e_i, where e_i is a stand-in for who_i, who_i is the filler for the gap e_i (see WH-MOVEMENT). The filler and the gap need not be at a fixed distance. Thus in Who_i did Bill ask John to invite e_i, the filler and the gap are more distant than in the previous sentence. Sometimes the dependencies are nested. In German, for example, one could have Hans_i Peter_j Marie_k schwimmen_k lassen_j sah_i (Hans saw Peter make Marie swim), where the nouns (arguments) and verbs are in nested order, as the subscripts indicate. And sometimes they are crossed.

more complex patterns. Precise statements of such dependencies and the domains over which they operate constitute the major activity in the specification of a grammar. Computational modeling of these dependencies is one of the key areas in CL. Many (for example, the crossed dependencies discussed above) cannot be described by context-free grammars and require grammars with larger domains of locality. One such class is constituted by the so-called mildly context-sensitive grammars (Joshi and Schabes 1997). Two grammar formalisms in this class are tree-adjoining grammars (TAGs) and combinatory categorial grammars (CCGs; Joshi and Schabes 1997; Steedman 1997; see also CATEGORIAL GRAMMAR). TAGs are also unification-based grammars.

Parsing sentences according to different grammars is another important research area in CL. Indeed, parsing algorithms are known for almost all context-free grammars used in CL. The time required to parse a sentence of length n is at most Kn³, where K depends on the size of the grammar and can become very large in the worst case. Most parsers perform much better than the worst case on typical sentences, however, even though there are no computational results, as yet, to characterize their behavior on typical sentences. Grammars that are more powerful than CFGs are, of course, harder to parse, as far as the worst case is concerned. Like CFGs, mildly context-sensitive grammars can all be parsed in polynomial time, although the exponent for n is 6 instead of 3. A crucial problem in parsing is not just to get all possible parses for a sentence but to rank the parses according to some criteria. A grammar can be combined with statistical information to provide this ranking.

Thus far, we have assumed that the parser only handles complete sentences and either succeeds or fails to find the parses. In practice, we want the parser to be flexible. It should be able to handle fragments of sentences, or if it fails, it should fail gracefully, providing as much analysis as possible for as many fragments of the sentence as possible, even if it cannot glue all the pieces together.

Finally, even though the actual grammars in major CL systems are large, their coverage is not adequate. Building a grammar by hand soon reaches its limit in coping with free text (say, text from a newspaper). Increasing attention is being paid both to automatically acquiring grammars from a large corpus and to statistical parsers that produce parses directly based on the training data of the parsed annotated corpora.

Statistical Approaches

Although the idea of somehow combining structural and statistical information was already suggested in the late 1950s, it is only now, when we have formal frameworks suitable for combining structural and statistical information in a principled manner and can use very large corpora to reliably estimate the statistics needed to deduce linguistic structure, that this idea has blossomed in CL.
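The difference between nested and crossed dependencies can be checked mechanically once each argument is linked to its verb. The following Python sketch is an illustration only, under the assumption that a sentence is reduced to its sequence of subscripts (i, j, k), with each subscript occurring exactly twice; the function names are hypothetical. Two links cross when one begins inside the other and ends outside it.

```python
from itertools import combinations


def arcs_from_labels(labels):
    """Link the two positions that share a subscript, e.g. i ... i -> (0, 5)."""
    first_seen = {}
    arcs = []
    for pos, label in enumerate(labels):
        if label in first_seen:
            arcs.append((first_seen.pop(label), pos))
        else:
            first_seen[label] = pos
    return arcs


def crosses(a, b):
    """True if the arcs interleave: the later one starts inside and ends outside."""
    (s1, e1), (s2, e2) = sorted([a, b])
    return s1 < s2 < e1 < e2


def nests(a, b):
    """True if one arc lies wholly inside the other."""
    (s1, e1), (s2, e2) = sorted([a, b])
    return s1 < s2 and e2 < e1


def pattern(labels):
    """Classify a subscript sequence as 'crossed', 'nested', or 'flat'."""
    pairs = list(combinations(arcs_from_labels(labels), 2))
    if any(crosses(a, b) for a, b in pairs):
        return "crossed"
    if any(nests(a, b) for a, b in pairs):
        return "nested"
    return "flat"


# Subscripts of the German and Dutch word orders given in this entry.
german = ["i", "j", "k", "k", "j", "i"]
dutch = ["i", "j", "k", "i", "j", "k"]
print(pattern(german))
print(pattern(dutch))
```

Applied to the subscript sequences of the German and Dutch word orders given in this entry, the sketch reports a nested and a crossed pattern, respectively.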
In STATISTICAL TECHNIQUES IN NATURAL LANGUAGE PRO- CESSING in conjunction with large corpora (raw texts or texts Dutch, for example, one could have Jani Pietj Mariek zagi annotated in various ways) have also been used to automati- latenj zwemmenk (Jan saw Piet make Marie swim). There cally acquire linguistic information about MORPHOLOGY are, of course, situations where the dependencies have 164 Computational Neuroanatomy (prefixes, suffixes, inflected forms), subcategorization, supraneuronal continuum, or “field theory,” approach to semantic classes (classification of nouns based on what neural structure and function, to the “particulate” approach predicates they go with; compound nouns such as jet associated with the properties of single neurons. In particu- engines, stock market prices; classification of verbs, for lar, the details of spatial neuroanatomy of the nervous sys- example, to know, or events, for example, to look; and so tem are viewed as computational, rather than as mere on), and, of course, grammatical structure itself. Such results “packaging” (see Schwartz 1994 for a comprehensive have opened up a new direction of research in CL, often review of the structural and functional correlates of neu- described as “corpus-based CL.” roanatomy). It should be clear from the previous discussion that, for Global Map Structure the development of corpus-based CL, very large quantities of data are required (the Brown corpus from the 1960s is It is widely accepted that the cortical magnification factor in about 1 million words). Researchers estimate that about 100 primates is approximately inverse linear, at least for the cen- million words will be required for some tasks. The technol- tral twenty degrees of field (e.g., Tootell et al. 
1985; van ogies that will benefit from corpus-based NLP include Essen, Newsome, and Maunsell 1984; Dow, Vautin, and speech recognition, SPEECH SYNTHESIS, MACHINE TRANSLA- Bauer 1985), and preliminary results from functional MAG- TION, full-text information retrieval, and message under- NETIC RESONANCE IMAGING (fMRI) suggest roughly the standing, among others. The need for establishing very large same for humans. The simple mathematical argument show- text and speech databases, annotated in various ways, is now ing there is only one complex analytic two-dimensional map well understood. One such database is the Linguistic Data function that has this property, namely, the complex loga- Consortium (LDC), a national and international resource rithm (Schwartz 1977, 1994), suggested an experiment. supported by federal and industrial grants, and useful also Determine the point correspondence data from a 2DG (2- for psycholinguistic research (especially in experimental deoxyglucose) experiment, accurately measure the flattened design for controlling lexical and grammatical frequency cortical surface, and check the validity of the Reimann map- biases; see http://www.ldc.upenn.edu). ping theorem prediction of primary visual cortex (V1) See also COMPUTATIONAL COMPLEXITY; COMPUTA- topography, or, equivalently, of the hypothesis that global TIONAL LEXICONS; COMPUTATIONAL PSYCHOLINGUISTICS; topography in V1 is a generalized conformal map. The NATURAL LANGUAGE PROCESSING; PSYCHOLINGUISTICS; results of this experiment (Schwartz, Munsif, and Albright SENTENCE PROCESSING 1989; Schwartz 1994) confirmed that cortical topography is —Aravind K. Joshi in strong agreement with the conformal mapping hypothe- sis, up to an error estimated to be roughly 20 percent. References Local Map Structure Bresnan, J., and R. M. Kaplan. (1983). The Mental Representation of Grammatical Relations. Cambridge, MA: MIT Press. A large number of NEURAL NETWORK models have been Gazdar, G., E. Klein, G. K. 
Pullum, and I. A. Sag. (1985). General- constructed to address the generation of ocular dominance ized Phrase Structure Grammar. Cambridge, MA: Harvard and orientation columns in cat and monkey VISUAL COR- University Press. TEX. Surprisingly, the common element in all these models, Joshi, A. K., and Y. Schabes. (1997). Tree-adjoining grammars. In often not explicitly stated, is the use of a spatial filter A. Salomma and G. Rozenberg, Eds., Handbook of Formal applied to spatial white noise (Rojer and Schwartz 1990a). Languages and Automata. Berlin: Springer. Pollard, C., and I. A. Sag. (1994). Head-Driven Phrase Structure Thus ocular dominance columns are the result of band-pass Grammar. Chicago: University of Chicago Press. filtering a scalar white noise variable (ocularity). Orienta- Steedman, M. (1997). Surface Structure and Interpretation. Cam- tion columns, as originally pointed out by Rojer and bridge, MA: MIT Press. Schwartz (1990a), could be understood as the result of Survey of the State of Art in Human Language Technology. (Forth- applying a band-pass filter to vector noise, that is, to a vec- coming). Cambridge: Cambridge University Press. A survey tor quantity whose magnitude represented strength of tun- sponsored by the National Science Foundation, the Directorate ing, and whose argument represented orientation. One XIII-E of the Commission of the European Communities, and recent result of this analysis is that the zero-crossings of the the Center for Spoken Language Understanding, Oregon Grad- cortical orientation map were predicted, on topological uate Institute, Corvalis. Currently available at http:// grounds, to provide a coordinate system in which left- and www.cse.ogi.edu/CSLU/HLTsurvey. right-handed orientation vortices should alternate in hand- Computational Neuroanatomy edness (i.e., clockwise or counterclockwise orientation change). 
This prediction was tested with optical recording data on primate visual cortex orientation maps and found to The term computational anatomy was introduced in be in perfect agreement with the data (Tal and Schwartz Schwartz 1980, which suggested that the observables of 1997). functional anatomy, such as columnar and topographic structure, be made the basis of the state variables for percep- Unified Global and Local Map Structure tual computations (rather than, as is universally assumed, some combination of single neuronal response properties). A joint map structure to express the global conformal topo- The goal of computational neuroanatomy is to construct a graphic structure, and, at the same time, the local orientation Computational Neuroanatomy 165 column and ocular dominance column structure of primate (1974), suggested the hypothesis that a local analysis of V1, was introduced by Landau and Schwartz (1994), mak- shape, in terms of periodic changes in orientation of a stim- ing use of a new construct in computational geometry called ulus outline, might provide a basis for shape analysis the “protocolumn.” (Schwartz 1984). A parametric set of shape descriptors, based on shapes whose boundary curvature varied sinusoi- dally, was used as a probe for the response properties of Global Map Function neurons in infero-temporal cortex, which is one of the final One obvious functional advantage of using strongly space- targets for V1, and which is widely believed to be an impor- variant (e.g., foveal) architecture in vision is data compres- tant site for shape recognition. This work found that a subset sion. 
It has been estimated that a constant-resolution version of the infero-temporal neurons examined were tuned to of visual cortex, were it to retain the full human visual field stimuli with sinusoidal curvature variation (so-called Fou- and maximum human visual resolution, would require rier descriptors), and that these responses showed a signifi- roughly 104 as many cells as our actual cortex (and would cant amount of size, rotation, and shift invariance (Schwartz weigh, by inference, roughly 15,000 pounds; Rojer and et al. 1983). Schwartz 1990b). The problem of viewing a wide-angle See also COLUMNS AND MODULES; COMPUTATIONAL work space at high resolution would seem to be best per- NEUROSCIENCE; COMPUTATIONAL VISION; COMPUTING IN formed with space-variant visual architectures, an important SINGLE NEURONS; STEREO AND MOTION PERCEPTION; theme in MACHINE VISION (Schwartz, Greve, and Bonmas- VISUAL ANATOMY AND PHYSIOLOGY sar 1995). The complex logarithmic mapping has special —Eric Schwartz properties with respect to size and rotation invariance. For a given fixation point, changing the size or rotating a stimulus References causes its cortical representation to shift, but to otherwise remain invariant (Schwartz 1977). This symmetry property Ballard, D., T. Becker, and C. Brown. (1988). The Rochester robot. provides an excellent example of computational neuroanat- Tech. Report University of Rochester Dept. of Computer Sci- omy: simply by virtue of the spatial properties of cortical ence 257: 1–65. topography, size and rotation symmetries may be converted Bonmassar, G., and E. Schwartz. (Forthcoming). Space-variant into the simpler symmetry of shift. One obvious problem Fourier analysis: The exponential chirp transform. IEEE Pat- with this idea is that it only works for a given fixation direc- tern Analysis and Machine Vision. tion. As the eye scans an image, translation invariance is Cavanagh, P. (1978). Size and position invariance in the visual sys- badly broken. 
Recently, a computational solution to this tem. Perception 7: 167–177. Dow, B., R. G. Vautin, and R. Bauer. (1985). The mapping of problem has been found, by generalizing the Fourier trans- visual space onto foveal striate cortex in the macaque monkey. form to complex logarithmic coordinate systems, resulting J. Neuroscience 5: 890–902. in a new form of spatial transform, called the “exponential Hubel, D. H., and T. N. Wiesel. (1974). Sequence regularity and chirp transform” (Bonmassar and Schwartz forthcoming.) geometry of orientation columns in the monkey striate cortex. The exponential chirp transform, unlike earlier attempts to J. Comp. Neurol. 158: 267–293. incorporate Fourier analysis in the context of human vision Landau, P., and E. L. Schwartz. (1994). Subset warping: Rubber (e.g., Cavanagh 1978), provides size, rotation, and shift sheeting with cuts. Computer Vision, Graphics and Image Pro- invariance properties, while retaining the fundamental cessing 56: 247–266. space-variant structure of the visual field. Rojer, A., and E. L. Schwartz. (1990a). Cat and monkey cortical columnar patterns modeled by bandpass-filtered 2D white noise. Biological Cybernetics 62: 381-391. Local Map Function Rojer, A., and E. L. Schwartz. (1990b). Design considerations for a space-variant visual sensor with complex-logarithmic geome- The ocular dominance column presents a binocular view of try. In 10th International Conference on Pattern Recognition, the visual world in the form of thin “stripes,” alternating vol. 2. pp. 278–285. between left- and right-eye representations. One question Schwartz, E. L. (1977). Spatial mapping in primate sensory projec- that immediately arises is how this aspect of cortical anat- tion: Analytic structure and relevance to perception. Biological omy functionally relates to binocular stereopsis. Yeshurun Cybernetics 25: 181–194. and Schwartz (1989) constructed a computational stereo Schwartz, E. L. (1980). 
Computational anatomy and functional algorithm based on the assumption that the ocular domi- architecture of striate cortex: A spatial mapping approach to nance column structure is a direct representation, as an ana- perceptual coding. Vision Research 20: 645–669. Schwartz, E. L. (1984). Anatomical and physiological correlates of tomical pattern, of the stereo percept. It was shown that the human visual perception. IEEE Trans. Systems, Man and power spectrum of the log power spectrum (also known as Cybernetics 14: 257-271. the “cepstrum”) of the interlaced cortical “image” provided Schwartz, E. L. (1994). Computational studies of the spatial archi- a simple and direct measure of stereo disparity of objects in tecture of primate visual cortex: Columns, maps, and proto- the visual scene. This idea has been subsequently used in a maps. In A. Peters and K. Rocklund, Eds., Primary Visual successful machine vision algorithm for stereo vision (Bal- Cortex in Primates, Vol. 10 of Cerebral Cortex. New York: Ple- lard, Becker, and Brown 1988), and provides another excel- num Press. lent illustration of computational neuroanatomy. Schwartz, E. L., R. Desimone, T. Albright, and C. G. Gross. The regular local spatial map of orientation response in (1983). Shape recognition and inferior temporal neurons. Pro- cat and monkey, originally described by Hubel and Wiesel ceedings of the National Academy of Sciences 80: 5776–5778. 166 Computational Neuroscience to the algorithms implemented, and the computational func- Schwartz, E. L., D. Greve, and G. Bonmassar. (1995). Space- variant active vision: Definition, overview and examples. Neu- tion of that area (figure 1), which might not be apparent in a ral Networks 8: 1297–1308. general-purpose computer whose function depends on soft- Schwartz, E. L., A. Munsif, and T. D. Albright. (1989). The topo- ware. 
graphic map of macaque V1 measured via 3D computer recon- Another major difference between the brain and a struction of serial sections, numerical flattening of cortex, and general-purpose digital computer is that the connectivity conformal image modeling. Investigative Opthalmol. Supple- between neurons and their properties are shaped by the ment, p. 298. environment during development and remain plastic even in Tal, D., and E. L. Schwartz. (1997). Topological singularities in adulthood. Thus, as the brain processes information, it cortical orientation maps: The sign theorem correctly predicts changes its own structure in response to the information orientation column patterns in primate striate cortex. Network: being processed. Adaptation and learning are important Computation Neural Sys. 8: 229–238. Tootell, R. B., M. S. Silverman, E. Switkes, and R. deValois. mechanisms that allow brains to respond flexibly as the (1985). Deoxyglucose, retinotopic mapping and the complex world changes on a wide range of time scales, from seconds log model in striate cortex. Science 227: 1066. to years. The flexibility of the brain has survival advantages van Essen, D. C., W. T. Newsome, and J. H. R. Maunsell. (1984). when the environment is nonstationary and the evolution of The visual representation in striate cortex of the macaque mon- cognitive skills may deeply depend on genetic processes key: Asymmetries, anisotropies, and individual variability. that have extended the time scales for brain plasticity. Vision Research 24: 429–448. Brains are complex, dynamic systems, and brain models Yeshurun, Y., and E. L. Schwartz. (1989). Cepstral filtering on a provide intuition about the possible behaviors of such sys- columnar image architecture: A fast algorithm for binocular tems, especially when they are nonlinear and have feedback stereo segmentation. IEEE Trans. Pattern Analysis and loops. The predictions of a model make explicit the conse- Machine Intelligence 11(7): 759–767. 
quences of the underlying assumptions, and comparison with experimental results can lead to new insights and dis- Computational Neuroscience coveries. Emergent properties of neural systems, such as oscillatory behaviors, depend on both the intrinsic properties The goal of computational neuroscience is to explain in of the neurons and the pattern of connectivity between them. computational terms how brains generate behaviors. Com- Perhaps the most successful model at the level of the putational models of the brain explore how populations of NEURON has been the classic Hodgkin-Huxley (1952) model highly interconnected neurons are formed during develop- of the action potential in the giant axon of the squid (Koch ment and how they come to represent, process, store, act on, and Segev 1998). Data were collected under a variety of and be altered by information present in the environment conditions, and a model later constructed to integrate the (Churchland and Sejnowski 1992). Techniques from com- data into a unified framework. Because most of the vari- puter science and mathematics are used to simulate and ana- ables in the model are measured experimentally, only a few lyze these computational models to provide links between unknown parameters need to be fit to the experimental data. the widely ranging levels of investigation, from the molecu- Detailed models can be used to choose among experiments lar to the systems levels. Only a few key aspects of compu- that could be used to distinguish between different explana- tational neuroscience are covered here (see Arbib 1995 for a tions of the data. comprehensive handbook of brain theory and neural net- The classic model of a neuron, in which information works). flows from the dendrites, where synaptic signals are inte- The term computational refers both to the techniques grated, to the soma of the neuron, where action potentials used in computational neuroscience and to the way brains process information. 
Many different types of physical sys- tems can solve computational problems, including slide rules and optical analog analyzers as well as digital comput- ers, which are analog at the level of transistors and must set- tle into a stable state on each clock cycle. What these have in common is an underlying correspondence between an abstract computational description of a problem, an algo- rithm that can solve it, and the states of the physical system that implement it (figure 1). This is a broader approach to COMPUTATION than one based purely on symbol processing. There is an important distinction between general- purpose computers, which can be programmed to solve many different types of algorithms, and special-purpose computers, which are designed to solve only a limited range of problems. Most neural systems are specialized for partic- ular tasks, such as the RETINA, which is dedicated to visual transduction and image processing. As a consequence of the Figure 1. Levels of analysis (Marr 1982). The two-way arrows close coupling between structure and function in a brain indicate that constraints between levels can be used to gain insights area, anatomy and physiology can provide important clues in both directions. Computational Neuroscience 167 are initiated and carried to other neurons through long frames are used in the cortex for performing sensorimotor axons, views dendrites as passive cables. Recently, however, transformations, the model makes predictions for experi- voltage-dependent sodium, calcium, and potassium chan- ments performed on patients with lesions of the parietal cor- nels have been observed in the dendrites of cortical neurons, tex who display spatial neglect. which greatly increases the complexity of synaptic integra- Conceptual models can be helpful in organizing experi- tion. Experiments and models have shown that these active mental facts. 
Although thalamic neurons that project to the currents can carry information in a retrograde direction cortex are called “relay cells,” they almost surely have addi- from the cell body back to the distal synapse tree (see also tional functions because the visual cortex makes massive COMPUTING IN SINGLE NEURONS). Thus it is possible for feedback projections back to them. Francis Crick (1994) has spikes in the soma to affect synaptic plasticity through proposed that the relay cells in the THALAMUS may be mechanisms that were suggested by Donald HEBB in 1949. involved in ATTENTION, and has provided an explanation for Realistic models with several thousand cortical neurons how this could be accomplished based on the anatomy of can be explored on the current generation of workstation. the thalamus. His searchlight model of attention and other The first model for the orientation specificity of neurons in hypotheses for the function of the thalamus are being the VISUAL CORTEX was the feedforward model proposed by explored with computational models and new experimental Hubel and Wiesel (1962), which assumed that the orienta- techniques. Detailed models of thalamocortical networks tion preference of cortical cells was determined primarily by can already reproduce the low-frequency oscillations converging inputs from thalamic relay neurons. Although observed during SLEEP states, when feedback connections to solid experimental evidence supports this model, local corti- the thalamus affect the spatial organization of the rhythms. cal circuits have been shown to be important in amplifying These sleep rhythms may be important for memory consoli- weak signals and suppressing noise as well as performing dation (Sejnowski 1995). gain control to extend the dynamic range. These models are Finally, small neural systems have been analyzed with governed by the type of attractor dynamics analyzed by dynamic systems theory. 
This approach is feasible when the John Hopfield (1982), who provided a conceptual frame- numbers of parameters and variables are small. Most mod- work for the dynamics of feedback networks (Churchland els of neural networks involve a large number of variables, and Sejnowski 1992). such as membrane potentials, firing rates, and concentra- Although the spike train of cortical neurons is highly tions of ions, with an even greater number of unknown irregular, and is typically treated statistically, information parameters, such as synaptic strengths, rate constants, and may be contained in the timing of the spikes in addition to ionic conductances. In the limit that the number of neurons the average firing rate. This has already been established for and parameters is very large, techniques from statistical a variety of sensory systems in invertebrates and for periph- physics can be applied to predict the average behavior of eral sensory systems in mammals (Rieke et al. 1996). large systems. There is a midrange of systems where neither Whether spike timing carries information in cortical neu- type of limiting analysis is possible, but where simulations rons remains, however, an open research issue (Ritz and can be performed. One danger of relying solely on computer Sejnowski 1997). In addition to representing information, simulations is that they may be as complex and difficult to spike timing could also be used to control synaptic plasticity interpret as the biological systems themselves. through Hebbian mechanisms for synaptic plasticity. To better understand the higher cognitive functions, we Other models have been used to analyze experimental will need to scale up simulations from thousands to millions data in order to determine whether they are consistent with a of neurons. While parallel computers are available that per- particular computational assumption. 
For example, Aposto- mit massively parallel simulations, the difficulty of pro- los Georgopoulos has used a “vector-averaging” technique to gramming these computers has limited their usefulness. A compute the direction of arm motion from the responses of new approach to massively parallel models has been intro- cortical neurons, and William Newsome and his colleagues duced by Carver Mead (1989), who builds subthreshold (Newsome, Britten, and Movshon 1989) have used SIGNAL complementary metal-oxide semiconductor Very-Large- DETECTION THEORY to analyze the information from cortical Scale Integrated (CMOS VLSI) circuits with components neurons responding to visual motion stimuli (Churchland that directly mimic the analog computational operations in and Sejnowski 1992). In these examples, the computational neurons. Several large silicon chips have been built that model was used to explore the information in the data but mimic the visual processing found in retinas. Analog VLSI was not meant to be a model for the actual cortical mecha- cochleas have also been built that can analyze sound in real nisms. Nonetheless, these models have been highly influen- time. These chips use analog voltages and currents to repre- tial and have provided new ideas for how the cortex may sent the signals, and are extremely efficient in their use of represent sensory information and motor commands. power compared to digital VLSI chips. A new branch of A NEURAL NETWORK model that simplifies the intrinsic engineering called “neuromorphic engineering” has arisen properties of neurons can help us understand the informa- to exploit this technology. tion contained in populations of neurons and the computa- Recently, analog VLSI chips have been designed and built tional consequences. 
An example of this approach is a that mimic the detailed biophysical properties of neurons, recent model of parietal cortex (Pouget and Sejnowski including dendritic processing and synaptic conductances 1997) based on the response properties of cortical neurons, (Douglas, Mahowald, and Mead 1995), which has opened which are involved in representing spatial location of the possibility of building a “silicon cortex.” Protocols are objects in the environment. Examining which reference being designed for long-distance communication between 168 Computational Psycholinguistics analog VLSI chips using the equivalent of all-or-none spikes, Newsome, W. T., K. H. Britten, and J. A. Movshon. (1989). Neu- ronal correlates of a perceptual decision. Nature 341: 52–54 to mimic long-distance communication between neurons. Pouget, A., and T. J. Sejnowski. (1997). A new view of hemine- Many of the design issues that govern the evolution of glect based on the response properties of parietal neurons. biological systems also arise in neuromorphic systems, such Philosophical Transactions of the Royal Society 352: 1449– as the trade-off in cost between short-range connections and 1459. expensive long-range communication. Computational mod- Rieke, F., D. Warland, R. de Ruyter van Steveninck, and W. Bialek. els that quantify this trade-off and apply a minimization pro- (1996). Spikes: Exploring the Neural Code. Cambridge, MA: cedure can predict the overall organization of topographical MIT Press. maps and columnar organization of the CEREBRAL CORTEX. Ritz, R., and T. J. Sejnowski. (1997). Synchronous oscillatory Although brain models are now routinely used as tools activity in sensory systems: New vistas on mechanisms. Cur- for interpreting data and generating hypotheses, we are still rent Opinion in Neurobiology 7: 536–546. Sejnowski, T. J. (1995). Sleep and memory. Current Biology 5: a long way from having explanatory theories of brain func- 832–834. tion. 
For example, despite the relatively stereotyped ana- tomical structure of the CEREBELLUM, we still do not understand its computational functions. Recent evidence Computational Psycholinguistics from functional imaging of the cerebellum suggests that it is involved in higher cognitive functions, and not just a motor In PSYCHOLINGUISTICS, computational models are becom- controller. Modeling studies may help to sort out competing ing increasingly important both for helping us understand hypotheses. This has already occurred in the oculomotor and develop our theories and for deriving empirical predic- system, which has a long tradition of using control theory tions from those theories. How a theory of language pro- models to guide experimental studies. cessing behaves usually depends not just on the mechanics Computational neuroscience is a relatively young, rap- of the model itself, but also on the properties of the linguis- idly growing discipline. Although we can now simulate only tic input. Even when the theory is conceptually simple, the small parts of neural systems, as digital computers continue interaction between theory and language is often too com- to increase in speed, it should become possible to approach plex to be explored without the benefit of computer simula- more complex problems. Most of the models developed thus tions. It is no surprise then that computational models have far have been aimed at interpreting experimental data and been at the center of some of the most significant recent providing a conceptual framework for the dynamic proper- developments in psycholinguistics. ties of neural systems. A more comprehensive theory of The main area of contact with empirical data has been brain function should arise as we gain a broader understand- made by models operating roughly at the level of the word. ing of the computational resources of nervous systems at all Although there are active and productive efforts underway levels of organization. 
See also COMPUTATION AND THE BRAIN; COMPUTATIONAL NEUROANATOMY; COMPUTATIONAL THEORY OF MIND

—Terrence J. Sejnowski

References

Arbib, M. A. (1995). The Handbook of Brain Theory and Neural Networks. Cambridge, MA: MIT Press.
Churchland, P. S., and T. J. Sejnowski. (1992). The Computational Brain. Cambridge, MA: MIT Press.
Crick, F. H. C. (1994). The Astonishing Hypothesis: The Scientific Search for the Soul. New York: Scribner.
Douglas, R., M. Mahowald, and C. Mead. (1995). Neuromorphic analogue VLSI. Annual Review of Neuroscience 18: 255–281.
Hebb, D. O. (1949). Organization of Behavior. New York: Wiley.
Hodgkin, A. L., and A. F. Huxley. (1952). A quantitative description of membrane current and its application to conduction and excitation in nerve. Journal of Physiology 117: 500–544.
Hopfield, J. J. (1982). Neural networks and physical systems with emergent collective computational abilities. Proceedings of the National Academy of Sciences USA 79: 2554–2558.
Hubel, D., and T. Wiesel. (1962). Receptive fields, binocular interaction and functional architecture in the cat's visual cortex. Journal of Physiology 160: 106–154.
Koch, C., and I. Segev. (1998). Methods in Neuronal Modeling: From Synapses to Networks. Second edition. Cambridge, MA: MIT Press.
Marr, D. (1982). Vision. San Francisco: Freeman.
Mead, C., and M. Ismail, Eds. (1989). Analog VLSI Implementation of Neural Systems. Boston: Kluwer Academic Publishers.

to develop models of higher-level processes such as syntactic parsing (Kempen and Vosse 1989; McRoy and Hirst 1990; Marcus 1980) and discourse (Kintsch and van Dijk 1978; Kintsch 1988; Sharkey 1990), the complexity of these processes makes it harder to derive detailed experimental predictions. The neighborhood activation model (Luce, Pisoni, and Goldinger 1990) gives a computational account of isolated word recognition, but only TRACE (McClelland and Elman 1986) and Shortlist (Norris 1994a) have been applied to the more difficult problem of how words can be recognized in continuous speech, where the input may contain no reliable cues to indicate where one word ends and another begins. Both of these models are descendants of McClelland and Rumelhart’s (1981) connectionist interactive activation model (IAM) of VISUAL WORD RECOGNITION. The central principle of both models is that the input can activate multiple word candidates, represented by nodes in a network, and that these candidates then compete with each other by means of inhibitory links between overlapping candidates. Thus the spoken input “get in” might activate “tin” as well as “get” and “in,” but “tin” would be inhibited by the other two overlapping words. TRACE and Shortlist represent opposite positions in the debate over whether SPOKEN WORD RECOGNITION is an interactive process. In TRACE there is continuous interaction between the lexical and phonemic levels of representation, whereas Shortlist has a completely bottom-up, modular architecture. They also differ in their solution to the problem of how to recognize words beginning at different points in time. TRACE uses a permanent set of complete lexical networks beginning at each point where a word might begin. In Shortlist the network performing the lexical competition is created dynamically and contains only those candidates identified by a bottom-up analysis of the input. On a purely practical level, at least, this has the advantage of enabling Shortlist to perform simulations with realistically sized lexicons of twenty or thirty thousand words. Shortlist has also been extended (Norris, McQueen, and Cutler 1995) to incorporate the metrical segmentation strategy of Cutler and Norris (1988), which enables the model to make use of metrical cues to word boundaries.

The most significant nonconnectionist model of spoken word recognition has been Oden and Massaro’s (1978) fuzzy logical model of perception (FLMP). FLMP differs from TRACE and Shortlist in that it can be seen as a generic account of how decisions are made on the basis of information from different sources. FLMP itself has nothing to say, for example, about the competition process vital for the recognition of words in continuous speech in both TRACE and Shortlist. Comparisons between FLMP and TRACE in terms of their treatment of the relationship between lexical and phonemic information led to a major revision of the IAM framework and the development of the stochastic interactive activation model (Massaro 1989; McClelland 1991).

IAMs and spreading activation models have also been predominant in the area of LANGUAGE PRODUCTION (Dell 1986, 1988; Harley 1993; Roelofs 1992; see also Houghton 1990). Dell’s model is designed to account for the nature and distribution of speech errors. It takes as its input an ordered set of word units (lemmas), representing the speaker’s intended production, and produces as its output a string of phonemes that may be corrupted or misordered. In Dell (1988) the model consists of a lexical network in which word nodes are connected to their constituent phonemes by reciprocal links and by a word shape network that reads out successive phonemes in the appropriate syllable structure. The main effect of the reciprocal links from phonemes to words is to give the model a tendency for its errors to form real rather than nonsense words. Whether the production system really does contain these feedback links has been the topic of extensive debate between Dell and Levelt. Levelt, Roelofs, and Meyer (forthcoming) describe the latest computational implementation of the WEAVER++ model, which is a noninteractive spreading activation model designed to account for an extensive body of response time (RT) data on production as well as speech error data.

The most controversial connectionist model of language has been the Rumelhart and McClelland (1986) model for acquisition of the past tense of verbs. Conventional linguistic accounts of quasi-regular systems such as the past tense assume that the proper explanation is in terms of rules and a list of exceptions. Rumelhart and McClelland modeled the acquisition of the past tense using a simple pattern associator that mapped the phonology of verb roots (e.g., kill, run) onto their past tense forms (killed, ran). They claimed not only that their model explained important facts about the acquisition of verbs, but that it did so without using linguistic rules. The model was fiercely criticized by Pinker and Prince (1988). Later models by MacWhinney and Leinbach (1991) and Plunkett and Marchman (1991, 1993), using backpropagation with hidden units, rectified some of the technical deficiencies of the original model, and claimed to give a more accurate account of the developmental data, but the debate between the connectionist and symbolic camps continues. Recently reported neuropsychological and neuroimaging data suggest a neuroanatomical distinction between mechanisms underlying the rule-based and non-rule-based processes (e.g., Marslen-Wilson and Tyler forthcoming).

A parallel set of arguments has surrounded models of reading aloud. The relationship between spelling and sound is another example of a quasi-regular system where backpropagation networks have been used to give a unitary account of the READING process rather than incorporating spelling-to-sound rules and a list of exception words (e.g., yacht, choir) not pronounced according to the rules (Seidenberg and McClelland 1989; Plaut et al. 1996). The more traditional two-process view is represented by Coltheart et al. (1993), while an interactive activation model by Norris (1994b) takes an intermediate stance.

See also COGNITIVE MODELING, CONNECTIONIST; COMPUTATIONAL LINGUISTICS; CONNECTIONIST APPROACHES TO LANGUAGE; PROSODY AND INTONATION, PROCESSING ISSUES; SENTENCE PROCESSING

—Dennis Norris

References

Coltheart, M., B. Curtis, P. Atkins, and M. Haller. (1993). Models of reading aloud: Dual-route and parallel-distributed-processing approaches. Psychological Review 100: 589–608.
Cutler, A., and D. Norris. (1988). The role of strong syllables in segmentation for lexical access. Journal of Experimental Psychology: Human Perception and Performance 14: 113–121.
Dell, G. S. (1986). A spreading-activation theory of retrieval in sentence production. Psychological Review 82: 407–428.
Dell, G. S. (1988). The retrieval of phonological forms in production: Tests of predictions from a connectionist model. Journal of Memory and Language 27: 124–142.
Dijkstra, T., and K. de Smedt, Eds. (1996). Computational Psycholinguistics. London: Taylor and Francis.
Harley, T. A. (1993). Phonological activation of semantic competitors during lexical access in speech production. Language and Cognitive Processes 8: 291–309.
Houghton, G. (1990). The problem of serial order: A neural network model of sequence learning and recall. In R. Dale, C. Mellish, and M. Zock, Eds., Current Research in Natural Language Generation. London: Academic Press, pp. 289–319.
Kempen, G., and T. Vosse. (1989). Incremental syntactic tree formation in human sentence processing: A cognitive architecture based on activation decay and simulated annealing. Connection Science 1: 273–290.
Kintsch, W. (1988). The role of knowledge in discourse comprehension: A construction-integration model. Psychological Review 95: 163–182.
Kintsch, W., and T. A. van Dijk. (1978). Towards a model of text comprehension and production. Psychological Review 85: 363–394.
Levelt, W. J. M., A. Roelofs, and A. S. Meyer. (Forthcoming). A theory of lexical access in speech production. Behavioral and Brain Sciences.
Luce, P. A., D. B. Pisoni, and S. D. Goldinger. (1990). Similarity neighborhoods of spoken words. In G. T. M. Altmann, Ed., Cognitive Models of Speech Processing: Psycholinguistic and Computational Perspectives. Cambridge, MA: MIT Press, pp. 122–147.
MacWhinney, B., and J. Leinbach. (1991). Implementations are not conceptualizations: Revising the verb learning model. Cognition 40: 121–157.
Marcus, M. P. (1980). A Theory of Syntactic Recognition for Natural Language. Cambridge, MA: MIT Press.
Marslen-Wilson, W. M., and L. K. Tyler. (Forthcoming). Dissociating types of mental computation. Nature.
Massaro, D. W. (1989). Testing between the TRACE model and the Fuzzy Logical Model of Perception. Cognitive Psychology 21: 398–421.
McClelland, J. L. (1991). Stochastic interactive processes and the effects of context on perception. Cognitive Psychology 23: 1–44.
McClelland, J. L., and J. L. Elman. (1986). The TRACE model of speech perception. Cognitive Psychology 18: 1–86.
McClelland, J. L., and D. E. Rumelhart. (1981). An interactive activation model of context effects in letter perception: 1. An account of the basic findings. Psychological Review 88: 375–407.
McRoy, S. W., and G. Hirst. (1990). Race-based parsing and syntactic disambiguation. Cognitive Science 14: 313–353.
Norris, D. (1994a). Shortlist: A connectionist model of continuous speech recognition. Cognition 52: 189–234.
Norris, D. (1994b). A quantitative multiple-levels model of reading aloud. Journal of Experimental Psychology: Human Perception and Performance 20(6): 1212–1232.
Norris, D., J. M. McQueen, and A. Cutler. (1995). Competition and segmentation in spoken word recognition. Journal of Experimental Psychology: Learning, Memory, and Cognition 21: 1209–1228.
Oden, G. C., and D. W. Massaro. (1978). Integration of featural information in speech perception. Psychological Review 85: 172–191.
Pinker, S., and A. Prince. (1988). On language and connectionism: Analysis of a parallel distributed processing model of language acquisition. Cognition 28: 73–193.
Plaut, D. C., J. L. McClelland, M. Seidenberg, and K. E. Patterson. (1996). Understanding normal and impaired word reading: Computational principles in quasi-regular domains. Psychological Review 103: 56–115.
Plunkett, K., and V. Marchman. (1991). U-shaped learning and frequency effects in a multi-layered perceptron: Implications for child language acquisition. Cognition 38: 43–102.
Plunkett, K., and V. Marchman. (1993). From rote learning to system building: Acquiring verb morphology in children and connectionist nets. Cognition 48: 21–69.
Roelofs, A. (1992). A spreading-activation theory of lemma retrieval in speaking. Cognition 42: 107–142.
Roelofs, A. (1996). Serial order in planning the production of successive morphemes of a word. Journal of Memory and Language 35: 854–876.
Rumelhart, D. E., and J. L. McClelland. (1986). On learning the past tense of English verbs. In McClelland and Rumelhart, Eds., Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Cambridge, MA: MIT Press.
Seidenberg, M., and J. L. McClelland. (1989). A distributed developmental model of word recognition and naming. Psychological Review 96: 523–568.
Sharkey, N. E. (1990). A connectionist model of text comprehension. In D. A. Balota and G. B. Flores d’Arcais, Eds., Comprehension Processes in Reading. Hillsdale, NJ: Erlbaum, pp. 487–514.

Computational Theory of Mind

The computational theory of mind (CTM) holds that the mind is a digital computer: a discrete-state device that stores symbolic representations and manipulates them according to syntactic rules; that thoughts are mental representations—more specifically, symbolic representations in a LANGUAGE OF THOUGHT; and that mental processes are causal sequences driven by the syntactic, but not the semantic, properties of the symbols. Putnam (1975) was perhaps the first to articulate CTM, but it has found many proponents, the most influential being Fodor (1975, 1981, 1987, 1990, 1993) and Pylyshyn (1980, 1984).

CTM’s proponents view the theory as an extension of the much older idea that thought is MENTAL REPRESENTATION—an extension that shows us how a commitment to mental states can be compatible with a causal account of mental processes and with a commitment to materialism and the generality of physics. Older breeds of representationalism were unable to explain how mental processes could be semantically coherent—how thoughts could follow one another in a fashion appropriate to their meanings, while also being bona fide causal processes that did not depend on an inner homunculus who understood the meanings of the representations. Using formalization and digital computers, however, we can explain how this occurs. Formalization shows us how to link semantics to syntax. For any formalizable symbol system, it is possible to develop a set of formal derivation rules, based wholly on syntactic properties, that license all and only the inferences permissible on semantic grounds. Computers show us how to link syntax to causation. For any finite formal system, it is possible to construct a digital computer that automates the derivations of that system. Thus, together, formalization and computation show us how to link semantics to causation in a material system like a digital computer: design a set of syntactic rules that “track” the semantic properties of the symbols (i.e., formalize the system), and then implement those rules in a computer. Because digital computers are purely physical systems, this shows us that it is possible for a purely physical system to carry out symbolic inferences that respect the semantics of the symbols without recourse to a homunculus or to any other nonphysical agency. Syntactic properties are the causal determinants of reasoning, syntax tracks semantics, and syntactic properties can be implemented in a physical system.

CTM has been touted both for its connections to successful empirical research in cognitive science and for its promise in resolving philosophical problems. The main argument in favor of the language of thought hypothesis and CTM has been the “only game in town” argument: cognitive theories of language, learning, and other psychological phenomena are the only viable theories we possess, and these theories presuppose an inner representational system. Therefore we have a prima facie commitment to the existence of such a representational system (Fodor 1975). Some have claimed that CTM also explains the INTENTIONALITY of mental states and that it reconciles mentalism with materialism. The meanings and intentionality of mental states are “inherited from” the meanings and intentionality of the “mentalese” symbols (Fodor 1981). And because symbols, the ultimate bearers of semantic properties and intentionality, can both have meaning and be physical objects, there is not even a prima facie conflict between a commitment to semantics and intentionality and a commitment to materialism. Finally, CTM has been held to explain the generative and creative powers of thought that result from the COMPOSITIONALITY of the language of thought. Chomskian linguistics shows us how an infinite number of possible sentences can be generated out of a finite number of atomic lexical units, syntactic structures, and transformation rules. If the basis of thought is a symbolic language, these same resources can be applied directly to explain the compositionality of thought.

Although CTM gained a great deal of currency in the late 1970s and 1980s, it has since been criticized on a number of fronts. First, with philosophers’ rediscovery in the late 1980s of alternative approaches to psychological modeling, represented in NEURAL NETWORKS and dynamic adaptive systems, the empirical premise of the “only game in town” argument has been brought into question. Indeed, the main thrust of philosophical debate about neural networks and connectionism has been over whether their models of psychological phenomena are viable alternatives to rule-and-representation models.

Second, writers such as Dreyfus (1972, 1992) and Winograd and Flores (1986) have claimed that much human thought and behavior cannot be reduced to explicit rules, and hence cannot be formalized or reduced to a computer program. Thus, even if CTM does say something significant about the parts of human cognition that can be formalized, there are large portions of human mental life about which it can say nothing. Dreyfus and others have attempted to argue that this includes all expert knowledge and such simple skills as knowing how to drive a car or order in a restaurant.

A third line of criticism has been directed at CTM’s use of symbolic meaning to explain the semantics of thought, on the grounds that symbolic meaning is derivative from the intentionality of thought, either causally (Searle 1980; Haugeland 1978; Sayre 1986) or conceptually (Horst 1996). Thus the attempt to explain intentionality by appeal to symbols is circular and regressive. Searle (1990) and Horst (1996) have taken this line of argument even further, claiming that the “representations” in computers are not even symbolic or syntactic in their own right, but possess these properties by virtue of the intentions and conventions of computer users: a digital machine not connected to our interpretive practices has a “syntax” only in a metaphorical sense of that word. Horst’s version of these criticisms also yields an argument against the claim to reconcile mentalism with materialism: what digital computers show us how to do is to link convention-laden symbolic meaning with CAUSATION by way of convention-laden syntax, not to link the sense of “meaning” attributed to mental states with causation.

A fourth line of criticism has come from advocates of externalist theories of meaning. For many years, advocates of CTM tended also to be advocates of a “methodological solipsism” (Fodor 1980) or INDIVIDUALISM who held that the typing of mental states needed to be insensitive to features outside of the cognizer because the computational processes that determined thought have access only to mental representations. At the same time, CTM required that the typing of mental states reflect their semantic properties. These two commitments together seemed to be incompatible with externalist theories of content, which hold that the meanings of many terms are at least partially determined by factors that lie outside of the cognizer, such as its physical (Putnam 1975) and linguistic (Burge 1979, 1986) environment. This was used by some externalists (e.g., Baker 1987) as an argument against computationalism, and was used at least at one time by Fodor (1980) as a reason to reject externalism. Nevertheless, at least some computationalists, including Fodor (1993), have now embraced strategies for reconciling computational theories of mental processes with externalist theories of meaning for mental representations.

See also CHINESE ROOM ARGUMENT; COMPUTATION AND THE BRAIN; CONNECTIONISM, PHILOSOPHICAL ISSUES; FUNCTIONALISM; NARROW CONTENT; RULES AND REPRESENTATIONS

—Steven Horst

References

Baker, L. R. (1987). Saving Belief: A Critique of Physicalism. Princeton: Princeton University Press.
Burge, T. (1979). Individualism and the mental. In P. French, T. Uehling, and H. Wettstein, Eds., Studies in Epistemology, Midwest Studies in Philosophy, vol. 4. Minneapolis: University of Minnesota Press.
Burge, T. (1986). Individualism and psychology. Philosophical Review 95(1): 3–45.
Dreyfus, H. (1972). What Computers Can’t Do. New York: Harper and Row.
Dreyfus, H. (1992). What Computers Still Can’t Do. Cambridge, MA: MIT Press.
Fodor, J. (1975). The Language of Thought. New York: Crowell.
Fodor, J. (1980). Methodological solipsism considered as a research strategy in cognitive science. Behavioral and Brain Sciences 3: 63–73.
Fodor, J. (1981). Representations. Cambridge, MA: MIT Press.
Fodor, J. (1987). Psychosemantics. Cambridge, MA: MIT Press.
Fodor, J. (1990). A Theory of Content and Other Essays. Cambridge, MA: MIT Press.
Fodor, J. (1993). The Elm and the Expert. Cambridge, MA: MIT Press.
Haugeland, J. (1978). The nature and plausibility of cognitivism. Behavioral and Brain Sciences 2: 215–226.
Haugeland, J., Ed. (1981). Mind Design. Cambridge, MA: MIT Press.
Horst, S. (1996). Symbols, Computation and Intentionality: A Critique of the Computational Theory of Mind. Berkeley and Los Angeles: University of California Press.
Putnam, H. (1975). The Meaning of “Meaning.” In K. Gunderson, Ed., Language, Mind and Knowledge. Minnesota Studies in the Philosophy of Science, vol. 7. Minneapolis: University of Minnesota Press.
Pylyshyn, Z. (1980). Computation and cognition: Issues in the foundation of cognitive science. Behavioral and Brain Sciences 3: 111–132.
Pylyshyn, Z. (1984). Computation and Cognition: Toward a Foundation for Cognitive Science. Cambridge, MA: MIT Press.
Sayre, K. (1986). Intentionality and information processing: An alternative model for cognitive science. Behavioral and Brain Sciences 9(1): 121–138.
Searle, J. (1980). Minds, brains and programs. Behavioral and Brain Sciences 3: 417–424.
Searle, J. (1984). Minds, Brains and Science. Cambridge, MA: Harvard University Press.
Searle, J. (1990). Presidential address. Proceedings of the American Philosophical Association.
Searle, J. (1992). The Rediscovery of the Mind. Cambridge, MA: MIT Press.
Winograd, T., and F. Flores. (1986). Understanding Computers and Cognition. Norwood, NJ: Ablex.

Further Readings

Cummins, R. (1989). Meaning and Mental Representation. Cambridge, MA: MIT Press.
Garfield, J. (1988). Belief in Psychology: A Study in the Ontology of Mind. Cambridge, MA: MIT Press.
Newell, A., and H. Simon. (1975). Computer science as empirical inquiry. (1975 Turing Lecture.) Reprinted in J. Haugeland, Ed., Mind Design. Cambridge, MA: MIT Press, 1981, pp. 35–66.
Putnam, H. (1960). Minds and machines. In S. Hook, Ed., Dimensions of Mind. New York: New York University Press, pp. 138–164.
Putnam, H. (1961). Brains and behavior. Reprinted in Ned Block, Ed., Readings in Philosophy of Psychology. Cambridge, MA: Harvard University Press, 1980, pp. 24–36.
Putnam, H. (1967). The nature of mental states. In W. H. Capitan and D. D. Merrill, Eds., Art, Mind and Religion. Pittsburgh: University of Pittsburgh Press. Reprinted in Ned Block, Ed., Readings in Philosophy of Psychology. Cambridge, MA: Harvard University Press, 1980, pp. 223–231.
Rumelhart, D. E., J. McClelland, and the PDP Research Group. (1986). Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Cambridge, MA: MIT Press.
Sayre, K. (1987). Cognitive science and the problem of semantic content. Synthèse 70: 247–269.
Smolensky, P. (1988). The proper treatment of Connectionism. Behavioral and Brain Sciences 11(1): 1–74.

Computational Vision

The analysis of a visual image yields a rich understanding of what is in the world, where objects are located, and how they are changing with time, allowing a biological or machine system to recognize and manipulate objects and to interact physically with its environment. The computational approach to the study of vision explores the information-processing mechanisms needed to extract this important information. The integration of a computational perspective with experimental studies of biological vision systems from psychology and neuroscience can ultimately yield a more complete functional understanding of the neural mechanisms underlying visual processing.

Vision begins with a large array of measurements of the light reflected from object surfaces onto the eye. Analysis then proceeds in multiple stages, with each producing increasingly useful representations of information in the scene. Computational studies suggest three primary representational stages. Early representations may capture information such as the location, contrast, and sharpness of significant intensity changes or edges in the image. Such changes correspond to physical features such as object boundaries, texture contours, and markings on object surfaces, shadow boundaries, and highlights. In the case of a dynamically changing scene, the early representations may also describe the direction and speed of movement of image intensity changes. Intermediate representations describe information about the three-dimensional (3-D) shape of object surfaces from the perspective of the viewer, such as the orientation of small surface regions or the distance to surface points from the eye. Such representations may also describe the motion of surface features in three dimensions. Visual processing may then proceed to higher-level representations of objects that describe their 3-D shape, form, and orientation relative to a coordinate frame based on the objects or on a fixed location in the world. Tasks such as object recognition, object manipulation, and navigation may operate from the intermediate or higher-level representations of the 3-D layout of objects in the world. (See also MACHINE VISION for a discussion of representations for visual processing.)

Models for computing the early representations of intensity edges typically begin by filtering the image with filters that smooth and differentiate the image intensities. Smoothing at multiple spatial scales allows the simultaneous representation of the gross structure of image contours, while preserving the fine detail of surface markings and TEXTURE. The differentiation operation transforms the image into a representation that facilitates the localization of edge contours and computation of properties such as their sharpness and contrast. Significant intensity changes may correspond to maxima, or peaks, in the first derivative, or to zero-crossings in the second derivative, of the image intensities. Subsequent image analysis may operate on a representation of image contours. Alternative models suggest that later processes operate directly on the result of the filtering stage.

Several sources of information are used to compute the 3-D shape of object surfaces. Binocular stereo uses the relative location of corresponding features in the images seen by the left and right eyes to infer the distance to object surfaces. Abrupt changes in motion between adjacent image regions indicate object boundaries, while smooth variations in the direction and speed of motion within image regions can be used to recover surface shape. Other cues include systematic variations in the geometric structure of image textures, such as changes in the orientation, size, or density of texture elements; image shading, which refers to smooth variations of intensity that occur as surfaces bend toward or away from a light source; and perspective, which refers to the distortion of object contours that results from the perspective projection of the 3-D scene onto the two-dimensional (2-D) image. (See STRUCTURE FROM VISUAL INFORMATION SOURCES and STEREO AND MOTION PERCEPTION for further discussion of visual cues to structure and form.)

The computation of 3-D structure cannot proceed unambiguously from the 2-D image alone. Models also incorporate physical constraints that capture the typical behavior of objects in the world. For the early and intermediate stages of processing, these constraints are as general as possible. Existing models use constraints based on the following typical behaviors: object surfaces are coherent and typically vary smoothly and continuously from one image location to the next; objects usually move rigidly, at least within small image regions; illumination usually shines from above the observer; changes in the reflectance properties of a surface (such as its color) usually occur abruptly while illumination may vary slowly across the image. Models also incorporate the known physics of how the image is formed from the perspective projection of light reflected from surfaces onto the eyes. Computational studies of vision identify appropriate physical constraints and show how they can be built into a specific algorithm for computing the image representations.

Among cues for recovering 3-D structure from 2-D images, the two most extensively studied by computational and biological researchers are binocular stereo and motion. For both stereo and motion measurement, the most challenging computational problem is the correspondence problem. Given a representation of features in the left and right images, or two images displaced in time, a matching process must identify pairs of features in the two images that are projections of the same physical structure in space. Many models attempt to match edge features in the two images. Some models, such as an early model of human stereo vision proposed by MARR and Poggio (Marr 1982), simultaneously match image edge representations at multiple spatial scales. The correspondence of features at a coarse scale can provide a rough 3-D layout of a scene that can guide the correspondence of features at finer scales. Information such as the orientation or contrast of edge features can help identify pairs of similar features likely to correspond to one another. Stereo and motion models also typically use physical constraints such as uniqueness (i.e., each feature in one image has a unique corresponding feature in the other) and continuity or smoothness (i.e., nearby features in the image lie at similar depths or have a similar direction and speed of motion). Many models incorporate some form of optimization: a solution is found that best satisfies a complex set of criteria based on all of the physical constraints taken together. In the case of motion processing, the analysis of the movement of features in the changing 2-D image is followed by a process that infers the 3-D structure of the moving features. Most computational models of this inference use the rigidity constraint: they attempt to find a rigidly moving 3-D structure consistent with the computed 2-D image motion. (For specific models of stereo and motion processing, see Faugeras 1993; Hildreth and Ullman 1989; Kasturi and Jain 1991; Landy and Movshon 1991; Marr 1982; Martin and Aggarwal 1988; and Wandell 1995.)

Much attention has been devoted to the higher-level problem of object recognition, which requires that a representation derived from a viewed object in the image be matched with internal representations of a similar object stored in memory. Most computational models consider the recognition of objects on the basis of their 2-D or 3-D shape. Recognition is difficult because a given 3-D object can have many appearances in the 2-D image. Most recognition models can be classified into three main approaches. The first assumes that objects have certain invariant properties that are common to all of their views. Recognition typically proceeds in this case by first computing a set of simple geometric properties of a viewed object from image information, and then selecting an object model that offers the closest fit to the set of observed property values. The second approach focuses on the decomposition of objects into primitive, salient parts. In this case, models first find primitive parts in an image, and then identify objects on the basis of the detected parts and their spatial arrangement. The best-known model of this type was proposed by Biederman (1985; see Ullman 1996). The third major approach to object recognition uses a process that explicitly compensates for the transformation between a viewed object and its stored model. One example of this approach, proposed by Ullman (1996), first computes the geometric transformations that best explain the mapping between a viewed object and each object model in a database. A second stage then recognizes the object by finding which combination of object model and transformation best matches the viewed object. (Some specific models of recognition are described in Faugeras 1993 and Ullman 1996.)

See also OBJECT RECOGNITION, ANIMAL STUDIES; OBJECT RECOGNITION, HUMAN NEUROPSYCHOLOGY; SURFACE PERCEPTION; VISUAL OBJECT RECOGNITION, AI; VISION AND LEARNING

—Ellen Hildreth

References

Biederman, I. (1985). Human image understanding: Recent research and a theory. Computer Vision, Graphics, and Image Processing 32: 29–73.
Faugeras, O. (1993). Three-Dimensional Computer Vision: A Geometric Viewpoint. Cambridge, MA: MIT Press.
Haralick, R. M., and L. G. Shapiro. (1992). Computer and Robot Vision. 2 vols. Reading, MA: Addison-Wesley.
Hildreth, E. C., and S. Ullman. (1989). The computational study of vision. In M. Posner, Ed., Foundations of Cognitive Science. Cambridge, MA: MIT Press, pp. 581–630.
Horn, B. K. P. (1989). Shape from Shading. Cambridge, MA: MIT Press.
Kasturi, R., and R. C. Jain, Eds. (1991). Computer Vision: Principles. Los Alamitos, CA: IEEE Computer Society Press.
Landy, M. S., and J. A. Movshon, Eds. (1991). Computational Models of Visual Processing. Cambridge, MA: MIT Press.
Marr, D. (1982). Vision. San Francisco: Freeman.
Martin, W. N., and J. K. Aggarwal, Eds. (1988). Motion Understanding: Robot and Human Vision. Boston: Kluwer.
Ullman, S. (1996). High-level Vision: Object Recognition and Visual Cognition. Cambridge, MA: MIT Press.
Wandell, B. A. (1995). Foundations of Vision. Sunderland, MA: Sinauer.

Further Readings

Aloimonos, J., and D. Shulman. (1989). Integration of Visual Modules. Boston: Academic Press.
Blake, A., and A. Yuille, Eds. (1992). Active Vision. Cambridge, MA: MIT Press.
Blake, A., and A. Zisserman. (1987). Visual Reconstruction. Cambridge, MA: MIT Press.
Brooks, R. A. (1991). Intelligence without representation. Artificial Intelligence Journal 47: 139–160.
Fischler, M. A., and O. Firschein, Eds. (1987). Readings in Computer Vision: Issues, Problems, Principles, and Paradigms. Los Altos, CA: Kaufman.
Grimson, W. E. L. (1990). Object Recognition by Computer: The Role of Geometric Constraints. Cambridge, MA: MIT Press.
Horn, B. K. P. (1986). Robot Vision. Cambridge, MA: MIT Press.
Kanade, T., Ed. (1987). Three-Dimensional Machine Vision. Boston: Kluwer.
Koenderink, J. J. (1990). Solid Shape. Cambridge, MA: MIT Press.
Levine, M. D. (1985). Vision in Man and Machine. New York: McGraw-Hill.
Lowe, D. (1985). Perceptual Organization and Visual Recognition. Cambridge, MA: MIT Press.
Malik, J. (1995). Perception. In S. J. Russell and P. Norvig, Eds., Artificial Intelligence: A Modern Approach. Englewood Cliffs, NJ: Prentice Hall, pp. 724–756.
Mayhew, J. E. W., and J. P. Frisby. (1991). 3-D Model Recognition from Stereoscopic Cues. Cambridge, MA: MIT Press.
Mitiche, A. (1994). Computational Analysis of Visual Motion. New York: Plenum Press.
Mundy, J. L., and A. Zisserman, Eds. (1992). Geometric Invariance in Computer Vision. Cambridge, MA: MIT Press.
Ullman, S. (1984). Visual routines. Cognition 18: 97–159.

Computer-Human Interaction

See HUMAN-COMPUTER INTERACTION

Computing in Single Neurons

Over the past few decades, NEURAL NETWORKS have provided the dominant framework for understanding how the brain implements the computations necessary for its survival. At the heart of these networks are very dynamic and complex processing nodes, individual neurons. A typical NEURON in the CEREBRAL CORTEX receives input from a few thousand fellow neurons and, in turn, passes on messages to a few thousand other neurons. One hundred thousand such cells are packed into a cubic millimeter of cortical tissue, which amounts to 4 kilometers of axonal wiring, 500 meters of dendrites, and close to one billion synapses (Braitenberg and Schüz 1991).

Synapses, the specialized connections between two neurons, come in two basic flavors, excitatory and inhibitory. An excitatory synapse will reduce the electrical potential across the membrane of its target cell (that is, it will depolarize the cell), while an inhibitory synapse will hyperpolarize the cell. If the membrane potential at the cell body exceeds a particular threshold value, the neuron generates a short, millisecond-long pulse, called an “action potential” or “spike” (figure 1). Otherwise, it remains silent.

scalar weight, ranging from positive to negative depending on whether the synapse was excitatory or inhibitory. The contributions from all synapses, multiplied by their synaptic weights, add linearly at the cell body. If this sum exceeds a threshold, a spike is generated. McCulloch and Pitts argued that, with the addition of memory, a sufficiently large number of these logical “neurons,” wired together in an appropriate manner, can compute anything that can be computed on any digital computer.

LEARNING entered this picture in the form of HEBB’s (1949) rule, postulating that the synapse between neuron A and neuron B increases its “weight” if activity in A occurs at the same time as activity in B. Half a century later, we have solid evidence that such changes do take place in a well-studied phenomenon termed LONG-TERM POTENTIATION (LTP). Here, the synaptic weight increases for days or even weeks. It can be induced by simultaneous activity at the pre- and postsynaptic termini, in agreement with Hebb’s rule (Nicoll and Malenka 1995). Of more recent vintage is the discovery of a complementary process, a decrease in synaptic weight called “long-term depression” (Stevens 1996).

Over the last few years, it has become abundantly clear that dendrites do much more than simply convey synaptic inputs to the cell body for linear summation. Dendrites have traditionally been treated as passive cables, surrounded by a membrane that can be modeled by a conductance in parallel with a capacitance (Segev, Rinzel, and Shepherd 1995). When synaptic input is applied, such an arrangement acts as a low-pass filter, removing the high frequencies but performing no other significant information processing. Dendrites with such a passive membrane would not really disturb our view of neurons as linear threshold units.

As long ago as the 1950s, Hodgkin and Huxley (1952) showed how the transient changes in such active voltage-dependent membrane conductances generate and shape the action potential. But it was assumed that they are limited to the axon and the adjacent cell body. We now know that many dendrites of pyramidal cells are endowed with a relatively homogeneous distribution of sodium conductances as
The amount of syn- aptic input determines how fast the cell generates spikes, and Figure 1. Action potentials recorded in a single neuron in the visual these are in turn conveyed to the next target cells through the cortex, shown as a plot of membrane potential against time. (a) A output axon. Information processing in an average human visual stimulus causes a sustained barrage of synaptic input, which cortex then relies on the proper interconnection of about 4 × triggers three cycles of depolarization. The first cycle does not 1010 such neurons in a network of stupendous size. reach the threshold for generating a spike, but the second and third In 1943, MCCULLOCH and PITTS showed that this view is do. Each spike lasts for about 1 msec. (Data provided by B. Ahmed, at least plausible. They described each synapse by a single N. Berman, and K. Martin.) Computing in Single Neurons 175 well as a diversity of calcium membrane conductances the behavior of a monkey in a visual discrimination task can (Johnston et al. 1996). be statistically predicted by counting spikes in a single neu- What is the function of these active conductances? One ron in the visual cortex (Newsome, Britten, and Movshon likely explanation (supported by computer models) is that 1989). Robustness of this encoding is further ensured by calcium and potassium membrane conductances in the dis- averaging the response over a large number of similar cells tant dendrites can selectively linearize and amplify this input. (a process known as population coding). Voltage-dependent conductances can also subserve a specific Recent years have witnessed a resurgence in informa- nonlinear operation, multiplication, one of the most common tion-theoretic approaches to the nervous system (Rieke et al. operations carried out in the nervous system (Koch and Pog- 1996). We know that individual neurons, such as motion- gio 1992). 
If the dendritic tree contains sodium or calcium selective cells in the fly or single auditory inputs in the bull- conductances, or if the synapses use a particular type of frog, can encode between 1 and 3 bits of sensory informa- receptor (the so-called NMDA receptor), the inputs can inter- tion per output spike, amounting to rates of up to 300 bits act synergistically, with the strongest response occurring per second. This information is encoded using changes in when inputs from different neurons are located close to each the instantaneous interspike interval between a handful of other on a patch of dendritic membrane. Simulations (Mel spikes. Such a temporal encoding mechanism is within 10 to 1994) show that the firing rate of such a neuron is propor- 40 percent of the theoretical maximum allowed by the spike tional to the product, rather than the sum, of its inputs. train variability. This implies that individual spikes can Ramòn y CAJAL postulated the law of “dynamic polariza- carry significant amounts of information, at odds with the tion,” stipulating that dendrites and cell bodies are the idea that neurons are unreliable and can only signal in the receptive areas for the synaptic input, and that the resulting aggregate. At these rates, the optic nerve would convey output pulses are transmitted unidirectionally along the between one and ten million bits per second. (This com- axon to its targets. From work on brain slices, however, it pares to a ten-speed CD-ROM drive, which transfers infor- seems that this is by no means the whole story. Single action mation at 1.5 million bits per second.) 
potentials can propagate not only forward from their initia- Timing precision of spiking across populations of simul- tion site along the axon, but also backward into the dendritic taneously firing neurons is believed to be a key element in tree, a phenomenon known as antidromic spike invasion neuronal strategies for encoding perceptual information in (Stuart and Sakmann 1994). It remains unclear whether den- the sensory pathways (Abeles 1990; Singer and Gray 1995). drites themselves can initiate action potentials. If spikes can Yet if information is indeed embodied in a temporal code, be generated locally under physiological conditions, they how, if at all, is it decoded by the target neurons? Do neu- could implement powerful logical operations far away from rons act as coincidence detectors, able to detect the arrival the cell body (Softky 1994). time of incoming spikes at a millisecond or better resolu- What of the role of time in neuronal processing? There tion? Or do they integrate more than a hundred or so rela- are two main aspects to this issue: (1) the relationship tively small inputs over many tens of milliseconds until the between the timing of an event in the external world and threshold for spike initiation is reached (Softky 1995; see the timing of the representation of that event at the single- figure 1)? neuron level; (2) the accuracy and importance of the rela- Current thinking about computation has the brain as a tive timing of spikes between two or more neurons. hybrid computer. Individual nerve cells convert the incom- Regarding the first aspect, some animals can discrimi- ing streams of binary pulses into analog, spatially distributed nate intervals of the order of a microsecond (for instance, to variables: the postsynaptic membrane potential and calcium localize sounds), implying that the timing of sensory stimuli distribution throughout the dendritic tree. 
This transforma- must be represented with similar precision in the brain, and tion involves highly dynamic synapses that adapt to their is probably based on the average timing of spikes in a popu- input. Information is then processed in the analog domain, lation of cells. It is also possible to measure the precision using a number of linear and nonlinear operations (multipli- with which individual cells track the timing of external cation, saturation, amplification, thresholding) implemented events. For instance, certain cells in the monkey VISUAL CORTEX are preferentially stimulated by moving stimuli, and these cells can modulate their firing rate with a preci- sion of less than 10 msec (Bair and Koch 1996). The second aspect of the timing issue is the extent to which the exact temporal arrangements of spikes—both within a single neuron and across several neurons—matters for information processing. It is usually assumed that, to cope with the apparent lack of reliability of single cells, the brain makes use of a “firing rate” code. Only the average number of spikes within some suitable time window, say a Figure 2. Variability in neuronal responses (each line in the trace fraction of a second, matters. The detailed pattern of spikes corresponds to a spike of the type shown in figure 1). If the same (figure 2) is thought by many to be largely irrelevant, a stimulus is presented twice in succession, it induces the same hypothesis supported by the existence of a quantitative rela- average firing rate on both trials (about 50 Hz), although the exact tionship between the firing rates of single cortical neurons timing of individual spikes shows random variation. (Data and psychophysical judgments made by monkeys. That is, provided by W. Newsome and K. Britten.) 176 Concepts in the dendritic cable structure and augmented by voltage- Softky, W. R. (1995). Simple codes versus efficient codes. Curr. Opin. Neurobiol 5: 239–247. dependent membrane and synaptic conductances. 
The result Stevens, C. F. (1996). Strengths and weaknesses in memory. is converted into asynchronous binary pulses and conveyed Nature 381: 471–472. to the following neurons. The functional resolution of these Stuart, G. J., and B. Sakmann. (1994). Active propagation of pulses is in the millisecond range, with temporal synchrony somatic action potentials into neocortical pyramidal cell den- across neurons likely to contribute to coding. Reliability drites. Nature 367: 69–72. could be achieved by pooling the responses of a small num- ber (20–200) of neurons. Further Readings And what of MEMORY? It is everywhere (but cannot be randomly accessed). It resides in the concentration of free Arbib, M., Ed. (1995). The Handbook of Brain Theory and Neural calcium in dendrites and cell body; in the presynaptic termi- Networks. Cambridge, MA: MIT Press. nal; in the density and exact voltage dependency of the vari- Hopfield, J. J. (1995). Pattern-recognition computation using action potential timing for stimulus representation. Nature 376: 33–36 ous ionic conductances; in the density and configuration of Koch, C. (1998) Biophysics of Computation: Information Process- specific proteins in the postsynaptic terminals; and, ulti- ing in Single Neurons. New York: Oxford University Press. mately, in the gene in the cell’s nucleus for lifetime memo- Koch, C., and I. Segev, Eds. (1998). Methods in Neuronal Model- ries. ing: From Ions to Networks. Second edition. Cambridge, MA: See also BINDING BY NEURAL SYNCHRONY; COMPUTA- MIT Press. TION; COMPUTATION AND THE BRAIN; COMPUTATIONAL Shepherd, G., Ed. (1998). The Synaptic Organization of the Brain. NEUROSCIENCE; CORTICAL LOCALIZATION, HISTORY OF; Fourth edition. New York: Oxford University Press. SINGLE-NEURON RECORDING Supplementary information can be found at http://www.klab. caltech.edu —Christof Koch Concepts References Abeles, M. (1990). 
Corticonics: Neural Circuits of the Cerebral The elements from which propositional thought is con- Cortex. Cambridge University Press. structed, thus providing a means of understanding the Bair, W. and C. Koch. (1996). Temporal precision of spike trains in world, concepts are used to interpret our current experience extrastriate cortex of the behaving monkey. Neural Computa- by classifying it as being of a particular kind, and hence tion 8: 1185–1202. relating it to prior knowledge. The concept of “concept” is Braitenberg, V., and A. Schüz. (1991). Anatomy of the Cortex. Ber- central to many of the cognitive sciences. In cognitive psy- lin: Springer. Hebb, D. O. (1949). The Organization of Behavior: A Neuropsy- chology, conceptual or semantic encoding effects occur in a chological Theory. New York: Wiley. wide range of phenomena in perception, ATTENTION, lan- Hodgkin, A. L., and A. F. Huxley. (1952). A quantitative descrip- guage comprehension, and MEMORY. Concepts are also fun- tion of membrane current and its application to conduction and damental to reasoning in both machine systems and people. excitation in nerve. J. Physiol. 117: 500–544. In AI, concepts are the symbolic elements from which Johnston, D., J. Magee, C. Colbert, and B. Christie. (1996). Active KNOWLEDGE REPRESENTATION systems are built in order to properties of neuronal dendrites. Annu. Rev. Neurosci 19: 165– provide machine-based expertise. Concepts are also often 186. assumed to form the basis for the MEANING of nouns, verbs Koch, C., and T. Poggio. (1992). Multiplying with synapses and neu- and adjectives (see COGNITIVE LINGUISTICS and SEMAN- rons. In T. McKenna, J. Davis, and S. F. Zornetzer, Eds., Single TICS). In behaviorist psychology, a concept is the propensity Neuron Computation. Boston: Academic Press, pp. 315–345. McCulloch, W., and W. Pitts. (1943). A logical calculus of the of an organism to respond differentially to a class of stimuli ideas immanent in nervous activity. 
Bulletin of Mathematical (for example a pigeon may peck a red key for food, ignoring Biophysics 5: 115–133. other colors). In cultural anthropology, concepts play a cen- Mel, B. W. (1994). Information processing in dendritic trees. Neu- tral role in constituting the individuality of each social ral Computation 6: 1031–1085. group. In comparing philosophy and psychology, it is neces- Newsome, W. T., K. H. Britten, and J. A. Movshon (1989). Nature sary to distinguish philosophical concepts understood as 341: 52–54. abstractions, independent of individual minds, and psycho- Nicoll, R. A., and R. C. Malenka. (1995). Contrasting properties of logical concepts understood as component parts of MENTAL two forms of longterm potentiation in the hippocampus. Nature REPRESENTATIONS of the world (see INDIVIDUALISM). 377: 115–118. Philosophy distinguishes NARROW CONTENT, which is Rieke, F., D. Warland, R. R. D. van Steveninck, and W. Bialek. (1996). Spikes: Exploring the Neural Code. Cambridge, MA: the meaning of a concept in an individual’s mental represen- MIT Press. tation of the world, from broad content, in which the mean- Segev, I., J. Rinzel, and G. Shepherd. (1995). The Theoretical ing of a concept is also partly determined by factors in the Foundation of Dendritic Function: Selected Papers of Wilfrid external world. There has been much debate on the question Rall with Commentaries. Cambridge, MA: MIT Press. of how to individuate the contents of different concepts, and Singer, W., and C. M. Gray. (1995). Visual feature integration and whether this is possible purely in terms of narrow content the temporal correlation hypothesis. Annu. Rev. Neurosci 18: (Fodor 1983; Kripke 1972), and how concepts as purely 555–586. internal symbols in the mind relate to classes of entities in Softky, W. R. (1994). Sub-millisecond coincidence detection in the external world. active dendritic trees. Neuroscience 58: 15–41. 
Concepts are considered to play an “intensional” and an “extensional” role (FREGE 1952). There are different technical ways to approach this distinction. One philosophical definition is that the extension is the set of all objects in the “actual” world which fall under the concept, whereas the intension is the set of objects that fall under the concept in “all possible worlds.” In cognitive science a less strict notion of intension has been operationalized as the set of propositional truths associated with a proper understanding of the concept, for example that chairs are for sitting on. It resembles a dictionary definition, in that each concept is defined by its relation to others. Intensions permit inferences to be drawn, as in “This is a chair, therefore it can be sat upon,” although, as the example illustrates, these inferences may be fallible. The extension of a concept is the class of objects, actions or situations in the actual external world which the concept represents and to which the concept term therefore refers (Frege’s “reference”). Frege argued that intension determines extension; thus the extension is the class of things in the world for which the intension is a true description. This notion of concepts leads to a research program for the analysis of relevant concepts (such as “moral” or “lie”) in which proposed intensional analyses of concepts are tested against intuitions of the extension of the concept, either real or hypothetical. Fodor (1994) has advanced arguments against this program. To avoid the circularity found in dictionaries, the intension of a concept must be expressed in terms of more basic concepts (the “symbol grounding problem” in cognitive science). The problems involved in grounding concepts have led Fodor to propose a strongly innatist account of concept acquisition, according to which all simple concepts form unanalyzable units, inherited as part of the structure of the brain. Others have explored ways to ground concepts in more basic perceptual symbolic elements (Barsalou 1993).

In the psychology of concepts, there are three main research traditions. First, the “cognitive developmental” tradition, pioneered by PIAGET (1967), seeks to describe the ages and stages in the growing conceptual understanding of children. Concepts are schemas. Through self-directed action and experience the assimilation of novel experiences or situations to a schema leads to corresponding accommodation of the schema to the experience, and hence to CONCEPTUAL CHANGE and development. Piaget’s theory of adult intelligence has been widely criticized for overestimating the cognitive capacities of most adults. His claims about the lack of conceptual understanding in young children have also been challenged in the literature on conceptual development (Carey 1985; Keil 1989). Research in this tradition has also had a major influence on theories of adult concepts developed within the lexical semantics tradition.

The second research tradition derives from behaviorist psychology, for which concepts involve the ability to classify the world into categories (see also CATEGORIZATION and MACHINE LEARNING). Animal discrimination learning paradigms have been used to explore how people learn and represent new concepts. A typical experiment involves a controlled stimulus set, usually composed of arbitrary and meaningless elements, such as line segments, geometric symbols, or letters, which has to be classified into two or more classes. The stimuli in the set are created by manipulating values on a number of stimulus dimensions (for example, shape or color). A particular value on a particular dimension constitutes a stimulus feature. The distribution of stimuli across the classes to be learned constitutes the structure of the concept. Training in these experiments typically involves using trial-and-error learning with feedback. In a subsequent transfer or generalization phase, novel stimuli are presented for classification without feedback, to test what has been learned. Three types of model have been explored in this paradigm. “Rule-based” learning models propose that participants try to form hypotheses consistent with the feedback in the learning trials (see for example Bruner, Goodnow, and Austin 1956). “Prototype” learning models propose that participants form representations of the average or prototypical stimulus for each class, and classify a new stimulus by judging how similar it is to each prototype. “Exemplar” models propose that participants store individual exemplars and their classification in memory, and base the classification on the relative average similarity of a stimulus to the stored exemplars in each class, with a generally assumed exponential decay of similarity as distance along stimulus dimensions increases (Nosofsky 1988). Exemplar models typically provide the best fits to experimental data, although rules and prototypes may also be used when the experimental conditions are favorable to their formation. NEURAL NETWORK models of category learning capture the properties of both prototype and exemplar models because they abstract away from individual exemplar representations, but at the same time are sensitive to patterns of co-occurrence of particular stimulus features.

The study of categorization learning in the behaviorist tradition has generated powerful models of fundamental learning processes with an increasing range of application, although the connection to other traditions in the psychology of concepts (for example, cognitive development or lexical semantics) is still quite weak. As in much behaviorist-inspired experimental research, the desire to have full control over the stimulus structure has led to the use of stimulus domains with low meaningfulness and hence poor ECOLOGICAL VALIDITY.

The third research tradition derives from the application of psychological methods to lexical semantics, the representation of word meaning, where concepts are studied through their expression in commonly used words. Within the Fregean branch of this tradition, interest has focused on how the intensions of concepts are related to their extensions. Tasks have been devised to examine each of these two aspects of people’s everyday concepts. Intensions are typically studied through feature-listing tasks, where people are asked to list relevant aspects or attributes of a concept which might be involved in categorization, and then to judge their importance to the definition of the concept. Extensions are studied by asking people either to generate or to categorize lists of category members. The use of superordinate concepts (for example, birds or tools) allows instances to be named with single words. Extensions may also be studied through the classification of hypothetical or counterfactual examples, or through using pictured objects.

Five broad classes of model have been proposed within the lexical semantics tradition. The “classical” model assumes that concepts are clearly defined by a conjunction of singly necessary and jointly sufficient attributes (Armstrong, Gleitman, and Gleitman 1983; Osherson and Smith 1981). The first problem for this model is that the attributes people list as true or relevant to a concept’s definition frequently include nonnecessary information that is not true of all category members (such as that birds can fly), and often fail to provide the basis of a necessary and sufficient classical definition. Second, there are category instances which show varying degrees of disagreement about their classification both between individuals and for the same individuals on different occasions (McCloskey and Glucksberg 1978). Third, clear category members differ in how “typical” they are judged to be of the category (Rosch 1975). The classical view was therefore extended by proposing two kinds of attribute in concept representations: defining features, which form the core definition of the class, and characteristic features, which are true of typical category members only and which may form the basis of a recognition procedure for quick categorization. Keil and Batterman (1984) reported a development with age from the use of characteristic to defining features. Nevertheless, the extended classical model is still incompatible with the lack of clearly expressible definitions for most everyday concept terms.

In the second or “prototype” model, concepts are represented by a prototype with all the most common attributes of the category, which includes all instances sufficiently similar to this prototype (Rosch and Mervis 1975). The typicality of an instance in a category depends on the number of attributes which an instance shares with other category members. Prototype representations lead naturally to non-defining attributes and to the possibility of unstable categorization at the category borderline. Such effects have been demonstrated in a range of conceptual domains. A corollary of the prototype view is that the use of everyday concepts may show nonlogical effects such as intransitivity of categorization hierarchies, and nonintersective conjunctions (Hampton 1982, 1988). Associated with prototype theory is the theory of basic levels in concept hierarchies. Rosch, Simpson, and Miller (1976) proposed that the SIMILARITY structure of the world is such that we readily form a basic level of categorization (typically, that level corresponding to high-frequency nouns such as chair, apple, or car) and presented evidence that both adults and children find thinking to be easier at this level of generality (as opposed to superordinate levels such as furniture or fruit, or subordinate levels such as armchair or McIntosh). This intuitive notion has, however, proved hard to formalize in a rigorous way, and the evidence for basic levels outside the well-studied biological and artifact domains remains weak. Attempts to model the combination of prototype concept classes with FUZZY LOGIC (Zadeh 1965) have also proved to be ill founded (Osherson and Smith 1981), although they have led to the development of more general research in conceptual combination (Hampton 1988).

In the third or “exemplar” model, which is only weakly represented in the lexical semantic research tradition, lexical concepts are based not on a prototype but on a number of different exemplar representations. For example, small metal spoons and large wooden spoons are considered more typical than small wooden spoons and large metal spoons (Medin and Shoben 1988). This fact could be evidence for representation through stored exemplars, although it could also be explained by a disjunctive prototype representation. Formally, explicit exemplar models are generally underpowered for representing lexical concepts, having no means to represent intensional information for stimulus domains that do not have a simple dimensional structure. As a result, they have no way to derive logical entailments based on conceptual meaning (for example, that all robins are birds).

The fourth model is the “theory-based” model (Murphy and Medin 1985), which has strong connections with the COGNITIVE DEVELOPMENT tradition. Concepts are embedded in theoretical understanding of the world. While a prototype representation of the concept bird would consist of a list of unconnected attributes, the theory-based representation would also represent theoretical knowledge about the relation of each attribute to others in a complex network of causal and explanatory links, represented in a structured frame or schema. Birds have wings in order to fly, which allows them to nest in trees, which they do to escape predation, and so forth. According to this view, objects are categorized in the class which best explains the pattern of attributes they possess (Rips 1989).

The fifth and final model, psychological ESSENTIALISM (Medin and Ortony 1989), is a development of the classical and theory-based models, and attempts to align psychological models with the philosophical intuitions of Putnam and others. The model argues for a classical “core” definition of concepts, but one which may frequently contain an empty “place holder.” People believe that there is a real definition of what constitutes a bird (an essence of the category), but they do not know what it is. They are therefore forced to use available information to categorize the world, but remain willing to yield to more expert opinion. Psychological essentialism captures Putnam’s intuition (1975) that people defer to experts when it comes to classifying biological or other technical kinds (for example, gold). However, it has not been shown that the model applies well to concepts beyond the range of biological and scientific terms (Kalish 1995) or even to people’s use of natural kind terms such as water (Malt 1994).

The proliferation of different models for concept representation reflects the diversity of research traditions, the many different kinds of concepts we possess, and the different uses we make of them.

See also BEHAVIORISM; CATEGORIZATION; INTENTIONALITY; NATIVISM; NATURAL KINDS

James A. Hampton

References

Armstrong, S. L., L. R. Gleitman, and H. Gleitman. (1983). What some concepts might not be. Cognition 13: 263–308.
Barsalou, L. W. (1993). Structure, flexibility and linguistic vagary in concepts: Manifestations of a compositional system of perceptual symbols. In A. C. Collins, S. E. Gathercole, and M. A. Conway, Eds., Theories of Memory. Hillsdale, NJ: Erlbaum.
Bruner, J. S., J. J. Goodnow, and G. A. Austin. (1956). A Study of Thinking. New York: Wiley.
Carey, S. (1985). Conceptual Change in Childhood. Cambridge, MA: MIT Press.
Fodor, J. A. (1983). The Modularity of Mind. Cambridge, MA: MIT Press.
Fodor, J. A. (1994). Concepts: A pot-boiler. Cognition 50: 95–113.
Frege, G. (1952). On sense and reference. In P. Geach and M. Black, Eds., Translations from the Philosophical Writings of Gottlob Frege. Oxford: Blackwell.
Hampton, J. A. (1982). A demonstration of intransitivity in natural categories. Cognition 12: 151–164.
Hampton, J. A. (1988). Overextension of conjunctive concepts: Evidence for a unitary model of concept typicality and class inclusion. Journal of Experimental Psychology: Learning, Memory and Cognition 14: 12–32.
Kalish, C. W. (1995). Essentialism and graded membership in animal and artifact categories. Memory and Cognition 23: 335–353.
Keil, F. C. (1989). Concepts, Kinds and Cognitive Development. Cambridge, MA: MIT Press.
Keil, F. C., and N. Batterman. (1984). A characteristic-to-defining shift in the development of word meaning. Journal of Verbal Learning and Verbal Behavior 23: 221–236.
Kripke, S. (1972). Naming and necessity. In D. Davidson and G. Harman, Eds., Semantics of Natural Language. Dordrecht: Reidel.
Malt, B. C. (1994). Water is not H2O. Cognitive Psychology 27: 41–70.
McCloskey, M., and S. Glucksberg. (1978). Natural categories: Well-defined or fuzzy sets? Memory and Cognition 6: 462–472.
Medin, D. L., and A. Ortony. (1989). Psychological essentialism. In S. Vosniadou and A. Ortony, Eds., Similarity and Analogical Reasoning. Cambridge: Cambridge University Press, pp. 179–195.
Medin, D. L., and E. J. Shoben. (1988). Context and structure in conceptual combination. Cognitive Psychology 20: 158–190.
Murphy, G. L., and D. L. Medin. (1985). The role of theories in conceptual coherence. Psychological Review 92: 289–316.
Nosofsky, R. M. (1988). Exemplar-based accounts of relations between classification, recognition and typicality. Journal of Experimental Psychology: Learning, Memory and Cognition 14: 700–708.
Osherson, D. N., and E. E. Smith. (1981). On the adequacy of prototype theory as a theory of concepts. Cognition 11: 35–58.
Piaget, J. (1967). Piaget’s theory. In J. Mussen, Ed., Carmichael’s Manual of Child Psychology, vol. 1. New York: Basic Books.
Putnam, H. (1975). The meaning of “meaning.” In Mind, Language, and Reality, vol. 2, Philosophical Papers. Cambridge: Cambridge University Press.
Rips, L. J. (1989). Similarity, typicality and categorization. In S. Vosniadou and A. Ortony, Eds., Similarity and Analogical Reasoning. Cambridge: Cambridge University Press, pp. 21–59.
Rosch, E. (1975). Cognitive representations of semantic categories. Journal of Experimental Psychology: General 104: 192–232.
Rosch, E., and C. B. Mervis. (1975). Family resemblances: Studies in the internal structure of categories. Cognitive Psychology 7: 573–605.
Rosch, E., C. Simpson, and R. S. Miller. (1976). Structural bases of typicality effects. Journal of Experimental Psychology: Human Perception and Performance 2: 491–502.
Zadeh, L. (1965). Fuzzy sets. Information and Control 8: 338–353.

Further Readings

Hampton, J. A. (1997). Psychological representation of concepts. In M. A. Conway and S. E. Gathercole, Eds., Cognitive Models of Memory. Hove, England: Psychology Press, pp. 81–110.
Lakoff, G. (1987). Women, Fire and Dangerous Things. Chicago: University of Chicago Press.
Millikan, R. (1984). Language, Thought, and Other Biological Categories. Cambridge, MA: MIT Press.
Neisser, U., Ed. (1993). Concepts and Conceptual Development: Ecological and Intellectual Bases of Categories. Cambridge: Cambridge University Press.
Rey, G. (1983). Concepts and stereotypes. Cognition 15: 237–262.
Rips, L. J. (1995). The current status of research on concept combination. Mind and Language 10: 72–104.
Rosch, E., and B. B. Lloyd, Eds. (1978). Cognition and Categorization. Hillsdale, NJ: Erlbaum.
Schwanenflugel, P., Ed. (1991). The Psychology of Word Meanings. Hillsdale, NJ: Erlbaum.
Smith, E. E., and D. L. Medin. (1981). Categories and Concepts. Cambridge, MA: Harvard University Press.
van Mechelen, I., J. A. Hampton, R. S. Michalski, and P. Theuns, Eds. (1993). Categories and Concepts: Theoretical Views and Inductive Data Analysis. London: Academic Press.
Ward, T. B., S. M. Smith, and J. Vaid, Eds. (1997). Conceptual Structures and Processes: Emergence Discovery and Change. Washington, DC: American Psychological Association.

Conceptual Change

Discussion of conceptual change is commonplace throughout cognitive science and is very much a part of understanding what CONCEPTS themselves are. There are examples in the history and philosophy of science (Kuhn 1970, 1977), in the study of SCIENTIFIC THINKING AND ITS DEVELOPMENT, in discussions of COGNITIVE DEVELOPMENT at least as far back as PIAGET (1930) and VYGOTSKY (1934), in linguistic analysis both of language change over history and of LANGUAGE ACQUISITION, and in computer science and artificial intelligence (AI) (Ram, Nersessian, and Keil 1997). But no one sense of conceptual change prevails, making it difficult to define conceptual change in uncontroversial terms. We can consider four types of conceptual change (see also Keil 1998) as being arrayed along a continuum from the simple accretion of bits of knowledge to complete reorganizations of large conceptual structures, with a fifth type that can involve little or no restructuring of concepts but radical changes in how they are used. Common to all accounts is the idea that either conceptual structure itself or the way that structure is used changes over time. The most discussed account focuses on structural change seen as a dramatic and qualitative restructuring of whole systems of concepts (type 4). All five types are critical to consider, however, because very often the phenomena under discussion have not been studied in sufficient detail to say which type best explains the change.

1. Feature or property changes and value changes on dimensions. With increasing knowledge, different clusters of features may come to be weighted more heavily in a concept, perhaps because they occur more frequently in a set of experiences. A young child might weight shape somewhat more heavily in her concept of a bath towel than texture, while an older child might do the opposite.

information, to more rulelike organizations of the same features (Sloman 1996). In other cases, there have been claims of changes from prelogical to quasi-logical computations
In its over features (Inhelder and Piaget 1958), or changes from simplest form, such a developmental change may not con- integral to separable operations on features and dimensions nect to any other relations or beliefs, such as why texture is (Kemler and Smith 1978); or changes from feature fre- now more important. An older child might disagree with a quency tabulations to feature correlation tabulations. younger one on identifying some marginal feature of bath Although most models tend to propose changes in com- towels, and while we might thereby attribute this differ- putations that apply across all areas of cognition, such tran- ence to conceptual change, we might not see the concepts sitions can also occur in circumscribed domains of thought as really being very different. even as there are no global changes in computational ability Changes in feature weightings and dimensional value (Chi 1992). Second, these models do not require that con- shifts are ubiquitous in cognitive science studies of con- cepts be interrelated in a larger structure. They are neutral in cepts. They are seen at all ages ranging from studies of that respect and thus allow each concept to change on its infant categorization to adult novice-to-expert shifts (see own. In practice, this is highly implausible and may in the INFANT COGNITION and EXPERTISE). Any time that some bit end render such models inadequate because they fail to of information is incrementally added to a knowledge base make stronger claims about links among concepts. and results in a different feature weighting, such a change There are also cases where there is no absolute change in occurs. When such changes have no other obvious conse- feature or computational types, but rather a strong change in quences for how knowledge in a domain is represented, they the ratio of types. 
Thus a younger child may have true con- constitute the most minimal sense of conceptual change, ceptual or functional features but may have ten times as and for many who contrast “learning” with true conceptual many perceptual ones in her concepts, whereas an older change, not a real case at all (Carey 1991). child may have the opposite ratio. Similarly, a younger child 2. Shifting use of different sorts of properties and rela- may perform logical computations on feature sets, but may tions. Conceptual change could occur because of changes in do so much more rarely and may more frequently resort to the kinds of feature used in representations. Infants and simpler probabilistic tabulations. This variant is important young children have been said to use perceptual and not con- because it offers a very different characterization of the ceptual features to represent classes of things, or perceptual younger child in terms of basic competencies. Younger chil- and not functional ones, or concrete and not abstract ones dren are not incapable of representing certain feature types (e.g., Werner and Kaplan 1963). More recently, young chil- or engaging in certain computations; rather, they do so dren are said to use one-place predicates and not higher- much less often, perhaps as a function of being much more order relational ones (Gentner and Toupin 1988), or to rely inexperienced in so many domains (Keil 1989). heavily on shape-based features early on in some contexts 4. Theoretic changes, where theories spawn others and (Smith, Jones, and Landau 1996). Similar arguments have thereby create new sets of concepts. 
The most dramatic been made about novice to expert shifts in adults (Chi, Fel- kinds of conceptual change, and those occupying most dis- tovich, and Glaser 1981) and even about the evolution of cussions in cognitive science at large, are those that view concepts from those in “primitive” cultures to those in more concepts as embedded in larger explanatory structures, usu- “advanced” ones (cf. Horton 1967; see also LURIA). ally known as “theories,” and whose changes honor DOMAIN Several forms of conceptual change can be captured by SPECIFICITY. Sweeping structural changes are said to occur shifts in what feature types are used in concepts. Despite a among whole sets of related concepts in a domain. For wide range of proposals in this area, however, it is striking example, a change in one concept in biology will naturally how many have always been controversial, especially in lead to simultaneous changes in other biological concepts claims of cross-cultural differences (Cole and Means 1981). because they as a cluster tend to complement each other There is no consensus on changes in the sorts of properties, symbiotically. Within this type, three kinds of change are relations, or both available at different points in develop- normally described: (a) birth of new theories and concepts ment, expertise, or historical change, nor on the very real through the death of older ones (Gopnik and Wellman possibility of no true changes in the availability of property 1994); (b) gradual evolution of new theories and concepts types. out of old ones in a manner that eventually leaves no traces Part of the problem is the need for better theories of of the earlier ones (Wiser and Carey 1983); and (c) birth of property types. It is difficult to make claims about percep- new theories and attendant concepts in a manner that leaves tual to conceptual shifts, or perceptual to functional shifts, if the old ones intact (Carey 1985). 
the contrast between perceptual and conceptual features is One key issue in choosing among these kinds of theoretic murky. Claims of changes in feature types therefore need to change is the extent to which concepts of one type are attend closely to philosophical analyses of properties and incommensurable or contradictory with those of another relations, which in turn need to attend more to the empirical type (Kuhn 1970, 1982). Kuhn suggested that conceptual facts. changes in domains could lead to “paradigm shifts” in 3. Changes in computations performed on features. which concepts in a prior system of beliefs might not even Conceptual change can also arise from new kinds of compu- be understandable in terms of the new set of beliefs, just as tations performed on a constant set of features, such as from concepts in that newer system might not be understandable tabulations of features based on frequency and correlational in terms of the older one. The ideas of paradigm shifts and Conceptual Change 181 ensuing incommensurability have been highly influential in of Mind: Essays on Biology and Cognition. Jean Piaget Sympo- sium Series. Hillsdale, NJ: Erlbaum, pp. 257–291. many areas of cognitive science, most notably in the study Chi, M. (1992). Conceptual change within and across ontologi- of conceptual change in childhood (Carey 1985). A related cal categories: Examples from learning and discovery in sci- issue asks how contradictions and anomalies in an older the- ence. Minnesota Studies in the Philosophy of Science 15: ory precipitate change (Chinn and Brewer 1993; Rusnock 129–186. and Thagard 1995). Chi, M., P. J. Feltovich, and R. Glaser. (1981). Categorization and Although most discussion of theoretic conceptual change representation of physics problems by experts and novices. has focused on these three kinds of restructuring, unambigu- Cognitive Science 5: 121–152. ous empirical evidence for these systemic restructurings as Chinn, C. A., and W. F. Brewer. (1993). 
The role of anomalous data opposed to the other four types of conceptual change (1, 2, in knowledge acquisition: A theoretical framework and implica- 3, 5) is often difficult to come by. For example, when a child tions for science. Instruction Rev. Educ. Res. 63(1): 1–49. Cole, M., and B. Means. (1981). Comparative Studies of How Peo- undergoes a dramatic developmental shift in how she thinks ple Think. Cambridge, MA: Harvard University Press. about the actions of levers, although that change might Gentner, D., and C. Toupin. (1988). Systematicity and surface sim- reflect a restructuring of an interconnected set of concepts ilarity in the development of analogy. Cognitive Science 10: in a belief system about physical mechanics, it might also 277–300. reflect a change in the kinds of features that are most Gopnik, A., and H. M. Wellman. (1994). The theory theory. In L. emphasized in mechanical systems, or how the child per- A. Hirschfeld and S. A. Gelman, Eds., Mapping the Mind: forms computations on correlations that she notices among Domain Specificity in Cognition and Culture. Cambridge: elements in mechanical systems. Cambridge University Press, pp. 257–293. 5. Shifting relevances. Children and adults often come to Gutheil, G., A. Vera, and F. C. Keil. (1998). Houseflies don't dramatic new insights not because of an underlying concep- “think”: Patterns of induction and biological beliefs in develop- ment. Cognition, 66: 33–49. tual revolution or birth of a new way of thinking, but rather Horton, R. (1967). African traditional thought and Western sci- because they realize the relevance or preferred status of an ence. Africa 37: 50–71, 159–187. already present explanatory system to a new set of phenom- Inhelder, B., and J. Piaget. (1958). The Growth of Logical Thinking ena. Because the realization can be sudden and the exten- from Childhood to Adolescence. New York: Basic Books. sion to new phenomena quite sweeping, it can have all the Keil, F. C. (1989). 
Concepts, Kinds and Cognitive Development. hallmarks of profound conceptual change. It is, however, Cambridge, MA: MIT Press. markedly different from traditional restructuring notions. Keil, F. C. (1998). Cognitive science and the origins of thought and Children, for example, can often have several distinct theo- knowledge. In R. M. Lerner, Ed., Theoretical Models of Human ries available to them throughout an extensive developmen- Development. vol. 1 of Handbook of Child Psychology, 5th ed. tal period but might differ dramatically from adults in where New York: Wiley. Kemler, D. G., and L. B. Smith. (1978). Is there a developmental they think those theories are most relevant (e.g., Gutheil, trend from integrality to separability in perception? Journal of Vera, and Keil 1998). Children might not differ across ages Experimental Child Psychology 26: 498–507. in their possession of the theories but rather in their applica- Kuhn, T. S. (1970). The Structure of Scientific Revolutions. Chi- tion of them. These kinds of relevance shifts, combined with cago: University of Chicago Press. theory elaboration in each domain, may be far more com- Kuhn, T. S. (1977). A function for thought experiments. In T. mon than cases of new theories arising de novo out of old Kuhn, Ed., The Essential Tension: Selected Studies in Scien- ones. tific Tradition and Change. Chicago: University of Chicago An increasing appreciation of these different types of Press. conceptual change is greatly fostered by a cognitive science Kuhn, T. S. (1982). Commensurability, comparability, and commu- perspective on knowledge; for as questions cross the disci- nicability. PSA 2: 669–688. East Lansing: Philosophy of Sci- ence Association. plines, they become treated in different ways and different Piaget, J. (1930). The Child's Conception of Physical Causality. kinds of conceptual change stand out as most prominent. In London: Routledge and Keegan Paul. 
addition, these types of conceptual change need not be Ram, A., N. J. Nersessian, and F. C. Keil. (1997). Conceptual mutually exclusive. For example, changes in the kinds of change: Guest editors’ introduction. Journal of the Learning features that are emphasized and in the kinds of computa- Sciences 6(1): 1–2. tions performed on those features can occur on a domain- Rusnock, P., and P. Thagard. (1995). Strategies for conceptual specific basis and might result in a set of concepts having change: Ratio and proportion in classical Greek mathematics. different structural relations among each other. Studies in the History and Philosophy of Science 26(1): 107– See also COGNITIVE MODELING, CONNECTIONIST; EDUCA- 131. Sloman, S. A. (1996). The empirical case for two systems of rea- TION; EXPLANATION; INDUCTION; MENTAL MODELS; THE- soning. Psychological Bulletin 119(1): 3–22. ORY OF MIND Smith, L. B., S. S. Jones, and B. Landau. (1996). Naming in young —Frank Keil children: A dumb attentional mechanism? Cognition 60: 143– 171. Werner, H., and B. Kaplan. (1963). Symbol Formation: An References Organismic-Developmental Approach to Language and the Expression of Thought. New York: Wiley. Carey, S. (1985). Conceptual Change in Childhood. Cambridge, Wiser, M., and S. Carey. (1983). When heat and temperature were MA: MIT Press. one. In D. Gentner and A. Stevens, Eds., Mental Models. Hills- Carey, S. (1991). Knowledge acquisition: Enrichment or concep- dale, NJ: Erlbaum. tual change? In S. Carey and R. Gelman, Eds., The Epigenesis 182 Conceptual Role Semantics Conditioning Further Readings Arntzenius, F. (1995). A heuristic for conceptual change. Philoso- phy of Science 62(3): 357–369. When Ivan Pavlov observed that hungry dogs salivated pro- Bartsch, R. (1996) The relationship between connectionist models fusely not only at the taste or sight of food, but also at the and a dynamic data-oriented theory of concept formation. 
Syn- sight or sound of the laboratory attendant who regularly fed thèse 108(3): 421–454. them, he described this salivation as a “psychical reflex” Carey, S., and E. Spelke. (1984). Domain-specific knowledge and and later as a “conditional reflex.” Salivation was an inborn, conceptual change. In L. A. Hirschfeld and S. A. Gelman, Eds., reflexive response, unconditionally elicited by food in the Mapping the Mind: Domain Specificity in Cognition and Cul- mouth, but which could be elicited by other stimuli condi- ture. New York: Cambridge University Press, pp. 169–200. Case, R. (1996). Modeling the process of conceptual change in a tionally on their having signaled the delivery of food. The continuously evolving hierarchical system. Monographs of the term conditional was translated as “conditioned,” whence Society for Research in Child Development 61(1–2): 283–295. by back-formation the verb “to condition,” which has been Dunbar, K. (1997). How scientists think: On-line creativity and used ever since. conceptual change in science. In T. B. Ward, S. M. Smith, and In Pavlov’s experimental studies of conditioning (1927), J. Vaid, Eds., Creative Thought: An Investigation of Conceptual the unconditional stimulus (US), food or dilute acid injected Structures and Processes. Washington, DC: American Psycho- into the dog’s mouth, was delivered immediately after the logical Association, pp. 461–493. presentation of the conditional stimulus (CS), a bell, metro- Gentner, D., S. Brem, R. W. Ferguson, et al. (1997). Analogical nome, or flashing light, regardless of the animal’s behavior. reasoning and conceptual change: A case study of Johannes The US served to strengthen or reinforce the conditional Kepler. Journal of the Learning Sciences 6(1): 3–40. Greeno, J. G. (1998). The situativity of knowing, learning, and reflex of salivating to the CS, which would extinguish if the research. American Psychologist 53(1): 5–26. US was no longer presented. 
Hence the US is often referred Hatano, G. (1994). Conceptual Change: Japanese Perspectives: to as a “reinforcer.” Pavlovian or classical conditioning is Introduction. Human Development 37(4): 189–197. contrasted with instrumental or operant conditioning, where Kitcher, P. (1988). The child as parent of the scientist. Mind and the delivery of the reinforcer is dependent on the animal Language 3: 217–227. performing a particular response or action. This was first Lee, O., and C. W. Anderson. (1993). Task engagement and con- studied in the laboratory by Thorndike (1911), at much the ceptual change in middle school science classrooms. American same time as, but quite independently of, Pavlov’s experi- Educational Research Journal 30(3): 585–610. ments. Thorndike talked of “trial-and-error learning,” but Nersessian, N. J. (1989). Conceptual change in science and in sci- the “conditioning” terminology was popularized by Skinner ence education. Synthèse 80(1): 163–183. Nersessian, N. J. (1992). How do scientists think? Capturing the (1938), who devised the first successful fully automated dynamics of conceptual change in science. Minnesota Studies apparatus for studying instrumental conditioning. in the Philosophy of Science 15: 3–44. In Pavlovian conditioning, the delivery of the reinforcer Nersessian, N. J. (1996). Child’s play. Philosophy of Science 63: is contingent on the occurrence of a stimulus (the CS), 542–546. whereas in instrumental conditioning, it is contingent on Smith, C., S. Carey, and M. Wiser. (1985). On differentiation: A the occurrence of a designated response. This operational case study of the development of the concepts of size, weight, distinction was first clearly articulated by Skinner, but and density. Cognition 21: 177–237. Miller and Konorski (1928) in Poland and Grindley (1932) Stinner, A., and H. Williams. (1993). Conceptual change, history, in England had already argued, on experimental and theo- and science. Stories Interchange 24: 87–103. 
retical grounds, for the importance of this distinction. Thagard, P. (1990). Concepts and conceptual change. Synthèse 82: 255–274. According to the simplest, and still widely accepted, inter- Vosniadou, S., and W. F. Brewer. (1987). Theories of knowledge pretation of Pavlovian conditioning, the US serves to elicit restructuring in development. Review of Educational Research a response (e.g., salivation), and pairing a CS with this US 57: 51–67. results in the formation of an association between the two, Vosniadou, S., and W. F. Brewer. (1992). Mental models of the such that the presentation of the CS can activate a represen- earth: A study of conceptual change in childhood. Cognitive tation of the US, which then elicits the same (or a related) Psychol. 24(4): 535–585. response. This account cannot explain the occurrence of Vygotsky, L. S. (1934/1986). Thought and Language. Cambridge, instrumental conditioning. If the delivery of a food rein- MA: MIT Press. forcer is contingent on the execution of a particular Wiser, M. (1988). The differentiation of heat and temperature: His- response, this may well lead to the formation of an associa- tory of science and novice-expert shift. In S. Strauss, Ed., Ontogeny, Phylogeny, and Historical Development. Norwood, tion between response and reinforcer. The Pavlovian princi- NJ: Ablex, pp. 28–48. ple can then predict that the dog performing the required Zietsman, A., and J. Clement. (1997). The role of extreme case response will salivate when doing so (a prediction that has reasoning in instruction for conceptual change. Journal of the been confirmed), but what needs to be explained is why the Learning Sciences 6(1): 61–89. dog learns to perform the response in the first place. 
Another way of stating the distinction between Pavlovian and instrumental conditioning is to note that instrumentally Conceptual Role Semantics conditioned responses are being modified by their conse- quences, much as Thorndike’s law of effect, or Skinner’s talk of “controlling contingencies of reinforcement,” See FUNCTIONAL ROLE SEMANTICS Conditioning 183 implied. The hungry rat that presses a lever to obtain food, association between a CS and a US is the associability of will desist from pressing the lever if punished for doing so. the CS—which can itself change as a consequence of expe- But Pavlovian conditioned responses are not modified by rience. For example, in the phenomenon of latent inhibi- their consequences; they are simply elicited by a CS associ- tion, a novel CS will enter into association with a US ated with a US, as experiments employing omission sched- rapidly, but a familiar one will condition only slowly. Inhib- ules demonstrate. If, in a Pavlovian experiment, the delivery itory conditioning, when a CS signals the absence of an of food to a hungry pigeon is signaled by the illumination of otherwise predicted US, is not the symmetrical opposite of a small light some distance away from the food hopper, the excitatory conditioning, when the CS signals the occur- pigeon will soon learn to approach and peck at this light, rence of an otherwise unexpected US. Even the rather sim- even though this pattern of behavior takes it farther away ple stimuli used in most conditioning experiments are, at from the food, to the point of reducing the amount of food it least sometimes, represented as configurations of patterns obtains. Indeed, it will continue to approach and peck the of elements rather than as a simple sum of their elements light on a high proportion of trials even if the experimenter (Pearce 1994). 
This last point has indeed been incorporated arranges that any such response actually cancels the deliv- into many connectionist networks because a simple, ele- ery of food on that trial. The light, as a CS, has been associ- mentary representation of stimuli makes the solution of ated with food, as a US, and comes to elicit the same pattern many discriminations impossible. A familiar example is the of behavior as food, approach and pecking, regardless of its XOR (exclusive or) problem: if each of two stimuli, A and consequences (Mackintosh 1983). B, signaled the delivery of a US when presented alone, but Most research in COMPARATIVE PSYCHOLOGY accepts their combination, AB, predicted the absence of the US, a that the conditioning process is of wide generality, common simple elementary system would respond more vigorously at least to most vertebrates, and allows them to learn about to the AB compound than to A or B alone, and thus fail to the important contingencies in their environment—what learn the discrimination. The solution must be to represent events predict danger, what signs reliably indicate the avail- the compound as something more than, or different from, ability of food, how to take effective action to avoid preda- the sum of its components. But apart from this, not all con- tors or capture prey; in short, to learn about the causal nectionist models have acknowledged the modifications to structure of their world. But why should cognitive scientists error-correcting associative systems that conditioning theo- pay attention to conditioning? One plausible answer is that rists have been willing to entertain to supplement the sim- conditioning experiments provide the best way to study sim- ple Rescorla-Wagner model. Conversely, some of the ple associative LEARNING, and associative learning is what phenomena once thought to contradict, or lie well outside NEURAL NETWORKS implement. 
Conditioning experiments the scope of, standard conditioning theory, such as evidence have unique advantages for the study of associative learn- of so-called constraints on learning (Seligman and Hager ing: experiments on eyelid conditioning in rabbits, condi- 1972), turn out on closer experimental and theoretical anal- tioned suppression in rats, or autoshaping in pigeons reveal ysis to require little more than minor parametric changes to the operation of simple associative processes untrammeled the theory (Mackintosh 1983). Conditioning theory and by other, cognitive operations that people bring to bear conditioning experiments may still have some important when asked to solve problems. And through such prepara- lessons to teach. tions researchers can directly study the rules governing the See also BEHAVIORISM; CONDITIONING AND THE BRAIN; formation of single associations between elementary events. PSYCHOLOGICAL LAWS As many commentators have noted, there is a striking simi- —Nicholas J. Mackintosh larity between the Rescorla-Wagner (1972) model of Pav- lovian conditioning and the Widrow-Hoff or delta rule References frequently used to determine changes in connection weights in a parallel distributed processing (PDP) network (Sutton Grindley, G. C. (1932). The formation of a simple habit in guinea and Barto 1981). The phenomenon of “blocking” in Pavlov- pigs. British Journal of Psychology 23: 127–147. ian conditioning provides a direct illustration of the opera- Mackintosh, N. J. (1983). Conditioning and Associative Learning. tion of this rule: if a given reinforcer is already well Oxford: Oxford University Press. predicted by CS1, further conditioning trials on which CS2 Miller, S., and J. Konorski. (1928). Sur une forme particulière des is added to CS1 and the two are followed by the reinforcer reflexes conditionnels. C. R. Sèance. Soc. Biol. 99: 1155– results in little or no conditioning to CS2. The Rescorla- 1157. 
Wagner model explains this by noting that the strength of an Pavlov, I. P. (1927). Conditioned Reflexes. Oxford: Oxford Univer- sity Press. association between a CS and reinforcer will change only Pearce, J. M. (1994). Similarity and discrimination: A selective when there is a discrepancy between the reinforcer that review and a connectionist model. Psychological Review 10: actually occurs and the one that was expected to occur. 587–607. According to the delta rule, connections between elements Rescorla, R. A., and A. R. Wagner. (1972). A theory of Pavlovian in a network are changed only insofar as is necessary to conditioning: Variations in the effectiveness of reinforcement bring them into line with external inputs to those elements. and nonreinforcement. In A. H. Black and W. F. Proskay, Eds., But conditioning theorists, not least Rescorla and Wag- Classical Conditioning, vol. 2, Current Research and Theory. ner themselves, have long known that the Rescorla-Wagner New York: Appleton-Century-Crofts, pp. 54–99. model is incomplete in several important respects. A sec- Seligman, M. E. P., and J. L. Hager, Eds. (1972). Biological ond determinant of the rate of change in the strength of an Boundaries of Learning. New York: Appleton-Century-Crofts. 184 Conditioning and the Brain or arousal, followed by slower learning of discrete, adaptive Skinner, B. F. (1938). The Behavior of Organisms. New York: Appleton-Century-Crofts. behavioral responses (Rescorla and Solomon 1967). As the Sutton, R. S., and A. G. Barto. (1981). Toward a modern theory of latter learning develops, fear subsides. We now think that at adaptive networks: Expectation and prediction. Psychological least in mammals a third process of “declarative” memory Review 88: 135–170. for the events and their relations also typically develops (cf. Thorndike, E. L. (1911). Animal Intelligence: Experimental Stud- EPISODIC VS. SEMANTIC MEMORY). ies. New York: Macmillan. 
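
The error-correcting account of blocking and the XOR problem described in the Conditioning entry can be made concrete with a small simulation. What follows is a minimal sketch under stated assumptions, not the implementation used by any of the cited authors. It uses the standard form of the Rescorla-Wagner update, in which every cue present on a trial changes by a learning rate times the difference between the reinforcer obtained and the summed strength of the cues present. The single shared learning rate of 0.2, the asymptote of 1.0, the trial counts, and the function names (`train`, `respond`, `blocking_demo`, `negative_patterning_demo`) are all illustrative choices. The extra "AB" cue in the second demonstration is one simple way of representing a compound as more than the sum of its parts; it is not Pearce's (1994) configural model.

```python
"""Rescorla-Wagner / delta-rule sketch. All parameter values are illustrative."""


def train(weights, trials, rate=0.2):
    """Update associative strengths in place, one trial at a time.

    Each trial is (cues, us): a tuple of the cue names present on that trial
    and the reinforcer magnitude (1.0 = US delivered, 0.0 = US omitted).
    """
    for cues, us in trials:
        expected = sum(weights[c] for c in cues)  # summed strength of cues present
        error = us - expected                     # obtained minus expected reinforcer
        for c in cues:
            weights[c] += rate * error            # only cues present on the trial change
    return weights


def respond(weights, cues):
    """Predicted response strength to a set of cues presented together."""
    return sum(weights[c] for c in cues)


def blocking_demo(n=50):
    """Return the strength of CS2 in a blocking group and in a control group."""
    # Blocking group: CS1 alone is reinforced first, then the CS1+CS2 compound.
    blocked = {"CS1": 0.0, "CS2": 0.0}
    train(blocked, [(("CS1",), 1.0)] * n)
    train(blocked, [(("CS1", "CS2"), 1.0)] * n)
    # Control group: compound trials only, with no pretraining of CS1.
    control = {"CS1": 0.0, "CS2": 0.0}
    train(control, [(("CS1", "CS2"), 1.0)] * n)
    return blocked["CS2"], control["CS2"]


def negative_patterning_demo(blocks=300):
    """XOR-style discrimination: A+ and B+ are reinforced, the AB compound is not."""
    # Elemental representation: the compound is just A plus B.
    elemental = {"A": 0.0, "B": 0.0}
    train(elemental, [(("A",), 1.0), (("B",), 1.0), (("A", "B"), 0.0)] * blocks)
    elem = {"A": respond(elemental, ("A",)),
            "AB": respond(elemental, ("A", "B"))}
    # Assumed fix: an extra cue "AB" that is present only on compound trials.
    configural = {"A": 0.0, "B": 0.0, "AB": 0.0}
    train(configural,
          [(("A",), 1.0), (("B",), 1.0), (("A", "B", "AB"), 0.0)] * blocks)
    conf = {"A": respond(configural, ("A",)),
            "AB": respond(configural, ("A", "B", "AB"))}
    return elem, conf


if __name__ == "__main__":
    print("CS2 strength (blocked, control):", blocking_demo())
    print("Negative patterning (elemental, with compound cue):",
          negative_patterning_demo())
```

In the blocking group the prediction error is already close to zero when CS2 is introduced, so CS2 gains almost no strength, whereas in the control group the two cues share the available strength. In the second demonstration the purely elemental learner ends up responding more strongly to the compound than to A alone, which is the failure described in the entry, while the learner with the added compound cue can respond to A and B and withhold responding to AB.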
Conditioning and the Brain

How the brain codes, stores, and retrieves memories is among the most important and baffling questions in science. The uniqueness of each human being is due largely to the MEMORY store, the biological residue of memory from a lifetime of experience. The cellular basis of this ability to learn can be traced to simpler organisms. In the past generation, it has become clear that various forms and aspects of LEARNING and memory involve particular systems, networks, and circuits in the brain, and it now appears possible we will identify these circuits, localize the sites of memory storage, and ultimately analyze the cellular and molecular mechanisms of memory.

All aspects of learning share a common thrust. As Rescorla (1988) has stressed, basic associative learning is the way organisms, including humans, learn about causal relationships in the world. It results from exposure to relations among events in the world. For both modern Pavlovian and cognitive views of learning and memory, the individual learns a representation of the causal structure of the world and adjusts this representation through experience to bring it in tune with the real causal structure of the world, striving to reduce any discrepancies or errors between its internal representation and external reality.

Most has been learned about the simplest forms of learning: nonassociative processes of habituation and sensitization, and basic associative learning and memory. Here we focus on CONDITIONING in the mammalian brain. We emphasize classical or Pavlovian conditioning because far more is known about brain substrates of this form of learning than about more complex instrumental learning. Pavlovian conditioning involves pairing a “neutral” stimulus, for example, a sound or light conditioned stimulus (CS), with an unconditioned stimulus (US) that elicits a response, the unconditioned response (UR). As a result of repeated pairings, with the CS onset preceding the US onset by some brief period of time, the CS comes to elicit a conditioned response (CR). Conditioning may be the way organisms, including humans, first learn about the causal structure of the world. Contemporary views of Pavlovian conditioning emphasize the predictive relations between the CS and the US, consistent with cognitive views of learning and memory. The key factor is the contingencies among events in the organism’s environment.

When animals, including humans, are faced with an aversive or threatening situation, at least two complementary processes of learning occur. Learned fear or arousal develops very rapidly, often in one trial. Subsequently, the organism learns to make the most adaptive behavioral motor responses to deal with the situation.

Learned fear develops rapidly, often in one trial, and involves changes in autonomic responses (heart rate, blood pressure, pupillary dilation) and nonspecific skeletal responses (freezing, startle). The afferent limb of the conditioned fear circuit involves projections from sensory relay nuclei via thalamic projections to the AMYGDALA. Although lesions of the appropriate regions of the amygdala can abolish all signs of learned fear, lesions of the efferent targets of the amygdala can have selective effects: for example, lateral hypothalamic lesions abolish cardiovascular signs of learned fear but not behavioral signs (e.g., freezing), whereas lesions of the periaqueductal gray abolish learned freezing but not the autonomic signs of learned fear (see, for example, LeDoux et al. 1988). This double dissociation of conditioned responses stresses the key role of the amygdala in learned fear, as do studies involving recording of neuronal activity and electrical stimulation (Davis 1992). The amygdala is critically involved in unlearned fear responses as well. The structures most involved in generating the appropriate responses in basic associative learning and memory seem also to be the most likely sites of memory storage (see below).

Higher brain structures also become critically engaged in learned fear under certain circumstances. Thus when an organism experiences strong shock in a particular environment, reexperiencing that environment elicits learned fear. This context-dependent learned fear involves both the amygdala and the HIPPOCAMPUS for a time-limited period after the experience, a temporal property characteristic of more cognitive aspects of declarative memory (Kim and Fanselow 1992).

A vast amount of research has been done using Pavlovian conditioning of the eye blink response in humans and other mammals (Gormezano, Kehoe, and Marshall-Goodell 1983). The eye blink response exhibits all the basic laws and properties of Pavlovian conditioning equally in humans and other mammals. The basic procedure is to present a neutral CS such as a tone or a light followed a quarter of a second or so later by a puff of air to the eye or a periorbital (around-the-eye) shock (US), the two stimuli terminating together. This is termed the delay procedure. If a period of no stimuli intervenes between CS offset and US onset, it is termed the trace procedure, which is much more difficult to learn than the delay procedure. Initially, there is no response to the CS and a reflex eye blink to the US. After a number of such trials, the eyelid begins to close in response to the CS before the US occurs, and in a well-trained subject, the eyelid closure CR becomes very precisely timed so that the eyelid is maximally closed about the time that the air puff or shock US onset occurs. This very adaptive timing of the eye blink CR develops over the range of CS-US onset intervals where learning occurs, about 100 milliseconds to 1 second. Thus
These observations led the conditioned eye blink response is a very precisely timed to theories of “two-process” learning: an initial learned fear elementary learned motor skill. The same is true of other Conditioning and the Brain 185 These results constitute an extraordinary confirmation of discrete behavioral responses learned to deal with aversive the much earlier theories of the cerebellum as a neuronal stimuli (e.g., the forelimb or hindlimb flexion response, learning system, first advanced in the classic papers of MARR head turn, etc.). Two brain systems become massively engaged in eye (1969) and Albus (1971) and elaborated by Eccles (1977) blink conditioning, hippocampus and CEREBELLUM (Thomp- and Ito (1984). These theories proposed that mossy-parallel son and Kim 1996). If the US is sufficiently aversive, learned fibers conveyed information about stimuli and movement fear also occurs, involving the amygdala, as noted above. contexts (CSs here) and the climbing fibers conveyed infor- Neuronal unit activity in the hippocampus increases in paired mation about specific movement errors and aversive events (tone CS–corneal air puff US) training trials very rapidly, (USs here) and they converged (e.g., on Purkinje neurons in shifts forward in time as learning develops, and forms a pre- cerebellar cortex and interpositus nucleus neurons). dictive “temporal model” of the learned behavioral response, The cerebellar system essential for a basic form of learn- both within and over the training trials. The growth of this ing and memory constitutes the clearest example to date of hippocampal neuronal unit response is, under normal condi- localizing memory traces to particular sites in the brain (i.e., tions, an invariable and strongly predictive concomitant of in the cerebellum). subsequent behavioral learning (Berger, Berry, and Thomp- See also BEHAVIORISM; EMOTION AND THE ANIMAL son 1986). BRAIN Interestingly, in the basic delay procedure, hippocampal —Richard F. 
Thompson lesions do not impair the eye blink CR, although if the more difficult trace procedure is used, the hippocampal References lesions massively impair learning of the CR and, in trained Albus, J. S. (1971). A theory of cerebellar function. Mathematical animals, impair memory in a time-limited manner. These Bioscience 10: 25–61. results are strikingly consistent with the literature con- Berger, T. W., S. D. Berry, and R. F. Thompson. (1986). Role of the cerned with declarative memory deficit following damage hippocampus in classical conditioning of aversive and appeti- to the hippocampal system in humans and monkeys, as is tive behaviors. In R. L. Isaacson and K. H. Pribram, Eds., The the hippocampus-dependent contextual fear discussed Hippocampus. New York: Plenum Press, pp. 203–239. above. So even in “simple” learning tasks like eye blink and Davis, M. (1992). The role of the amygdala in fear and anxiety. fear conditioning, hippocampus-dependent “declarative” Annual Review of Neuroscience 15: 353–375. memory processes develop. Eccles, J. C. (1977). An instruction-selection theory of learning in The cerebellum has long been a favored structure for the cerebellar cortex. Brain Research 127: 327–352. modeling a neuronal learning system, in part because of the Gormezano, I., E. J. Kehoe, and B. S. Marshall-Goodell. (1983). Twenty years of classical conditioning research with the rabbit. extraordinary architecture of the cerebellar cortex, where In J. M. Sprague and A. N. Epstein, Eds., Progress in Physio- each Purkinje neuron receives 100,000+ excitatory syn- logical Psychology. New York: Academic Press, pp. 197–275. apses from mossy-parallel fibers but only one climbing Ito, M. (1984). The Cerebellum and Neural Control. New York: fiber from the inferior olive (see below). The reflex eye Raven. blink response pathways activated by the US (corneal air Kim, J. J., and M. S. Fanselow. (1992). 
Modality-specific retro- puff or periorbital shock) involve direct and indirect relays grade amnesia of fear. Science 256: 675–677. through the brain stem from the sensory (trigeminal) Kim, J. J., and R. F. Thompson. (1997). Cerebellar circuits and nucleus to the relevant motor nuclei (largely the seventh and synaptic mechanisms involved in classical eyeblink condition- accessory sixth). The CS (e.g., tone) pathway projects to the ing. Trends in Neurosciences 20: 177–181. forebrain and also, via mossy fibers, to the cerebellum. The Krupa, D. J., J. K. Thompson, and R. F. Thompson. (1993). Local- ization of a memory trace in the mammalian brain. Science US (e.g., corneal air puff) pathway projects from the 260: 989–991. trigeminal nuclei to the forebrain and also, via the inferior Lavond, D. G., J. J. Kim, and R. F. Thompson. (1993). Mammalian olive, as climbing fibers to the cerebellum. These two pro- brain substrates of aversive classical conditioning. Annual jection systems converge on localized regions of the cere- Review of Psychology 44: 317–342. bellum, where the memory traces appear to be formed. The LeDoux, J. E., J. Iwata, P. Cicchetti, and D. J. Reis. (1988). Differ- CR pathway projects from the cerebellar cortex and nuclei ent projections of the central amygdaloid nucleus mediate auto- (interpositus nucleus) via the red nucleus to the motor nomic and behavioral correlates of conditioned fear. Journal of nuclei generating the eye blink response. (The cerebellum Neuroscience 8: 2517–2529. does not participate in the reflex eye blink response.) A Marr, D. (1969). A theory of cerebellar cortex. Journal of Physiol- wide range of evidence, including electrophysiological ogy (London) 202: 437–470. McGaugh, J. L. (1989). Involvement of hormonal and neuromodu- recording, lesions, electrical stimulation, and reversible latory systems in the regulation of memory storage. 
Annual inactivation during training, has demonstrated conclusively Review of Neuroscience 12: 255–287. that the cerebellum is necessary for this form of learning Rescorla, R. A. (1988). Behavioral studies of Pavlovian condition- (both delay and trace) and that the cerebellum and its asso- ing. Annual Review of Neuroscience 11: 329–352. ciated circuitry form the essential (necessary and sufficient) Rescorla, R. A., and R. L. Solomon. (1967). Two-process learning circuitry for this learning. Moreover, the evidence strongly theory: Relationships between Pavlovian conditioning and suggests that the essential memory traces are formed and instrumental learning. Psychological Review 74: 151–182. stored in the localized regions in the cerebellum (see Thompson, R. F., and J. J. Kim. (1996). Memory systems in the Thompson and Krupa 1994; Lavond, Kim, and Thompson brain and localization of a memory. Proceedings of the 1993; Yeo 1991). National Academy of Sciences 93: 13438–13444. 186 Connectionism, Philosophical Issues writers believe this framework will provide a new paradigm Thompson, R. F., and D. J. Krupa. (1994). Organization of mem- ory traces in the mammalian brain. Annual Review of Neuro- for understanding the nature of COGNITIVE ARCHITECTURE science 17: 519–549. and give rise to psychological explanations that depart dra- Yeo, C. H. (1991). Cerebellum and classical conditioning of motor matically from past accounts (van Gelder 1991; Horgan response. Annals of the New York Academy of Sciences 627: and Tienson 1996; see also COMPUTATIONAL NEURO- 292–304. SCIENCE). Zola-Morgan, S., and L. R. Squire. (1993). Neuroanatomy of GOFAI cognitive models rely heavily on explicit, syntac- memory. Annual Review of Neuroscience 16: 547–563. tically structured symbols to store and process information. 
By contrast, connectionist networks employ a very different Connectionism, Philosophical Issues type of representation, whereby information is encoded throughout the nodes and connections of the entire network. Since its inception, artificial intelligence (AI) research has These distributed representations (cf. DISTRIBUTED VS. had a growing influence on the philosophy of mind. Conse- LOCAL REPRESENTATION) lack the languagelike, syntactic quently, the recent development of a radically different style structure of traditional GOFAI symbols. Moreover, their of cognitive modeling—commonly known as “connection- content and representational function is often revealed only ism” (see COGNITIVE MODELING, CONNECTIONIST)—has through mathematical analysis of the activity patterns of the brought with it a number of important philosophical issues system’s internal units. and concerns. Because connectionism is such a dramatic The philosophical implications of this new account of departure from more traditional accounts of cognition, it has representation are far-reaching. Some writers, unhappy with forced philosophers to reconsider several assumptions based the quasi-linguistic character of GOFAI symbols, have on earlier theories. Most of these cluster around three cen- embraced the connectionist picture to support nonsentential tral themes: (1) the nature of psychological explanation, (2) theories of representation, including prototype accounts of forms of mental representation, and (3) nativist and empiri- CONCEPTS (Churchland 1989). Others have suggested paral- cist accounts of learning. lels between the connectionist representations and the bio- Before the introduction of connectionism in the mid- logically motivated theories of INFORMATIONAL SEMANTICS 1980s, the dominant paradigm in cognitive modeling was explored by writers such as Fred Dretske (1988). 
Many the COMPUTATIONAL THEORY OF MIND, sometimes referred believe the internal units of connectionist networks provide to by philosophers as “GOFAI” (for “good old-fashioned a promising new way to understand MENTAL REPRESENTA- artificial intelligence”; see also COGNITIVE MODELING, TION because of their similarity to real neural systems and SYMBOLIC). GOFAI accounts treat the mind as a complex their sensitivity to environmental stimuli (Bechtel 1989). organization of interacting subsystems, each performing a On the other hand, some philosophers have argued that specific cognitive function and processing information the connectionist account of representation is seriously through the manipulation of discrete, quasi-linguistic sym- flawed. Jerry Fodor and Zenon Pylyshyn have claimed that bols whose interactions are governed by explicitly encoded the ability to represent some states of affairs (e.g., “John rules. Psychological explanation is treated as a form of loves Mary”) is closely linked to the ability to represent FUNCTIONAL DECOMPOSITION, where sophisticated cogni- other states of affairs (e.g., “Mary loves John”; Fodor and tive capacities are broken down and explained through the Pylyshyn 1988). They argue that this feature of cognition, coordinated activity of individual components. The capaci- called “systematicity,” must be explained by any plausible ties of the individual components are further explained theory of mind. They insist that because connectionist rep- through a description of their internal symbolic operations resentations do not have constituent parts, connectionist (Cummins 1983; see also RULES AND REPRESENTATIONS models cannot explain systematicity. In response to this and ALGORITHM). challenge, several connectionists have argued that it is pos- Connectionism suggests a very different outlook on the sible for connectionist representations to produce systematic nature of psychological theory. 
Connectionist networks cognition in subtle ways without merely implementing a model cognition through the spreading activation of numer- symbolic system (Smolensky 1991; Clark 1991). ous simple units. The processing is highly distributed Connectionist accounts of representations have also throughout the entire system, and there are no task-specific influenced philosophical debate concerning the status of modules, discrete symbols, or explicit rules that govern the PROPOSITIONAL ATTITUDES. ELIMINATIVE MATERIALISM operations (Rumelhart, McClelland, and PDP Research holds that our commonsense conception of the mind is so Group 1986; McClelland, Rumelhart, and PDP Research flawed that there is reason to be skeptical about the exist- Group 1986; Smolensky 1988). This has forced researchers ence of states such as beliefs and desires. Some writers have to abandon the functional decomposition approach and suggested that the style of information encoding in networks search for new ways to understand the structure of psycho- is so radically different from what is assumed by common logical explanation. In one popular alternative, DYNAMIC sense that connectionist models actually give credence to APPROACHES TO COGNITION, cognitive activity is under- eliminativism (Churchland 1986; Ramsey, Stich, and Garon stood as a series of mathematical state transitions plotted 1990). Others have gone a step further and argued that the along different possible trajectories. Mental operations are internal elements of networks should not be viewed as rep- described through equations that capture the behavior of resentations at all (Brooks 1991; Ramsey 1997). In res- the whole system, rather than focusing on the logical or ponse, several writers have insisted that commonsense psy- syntactic transformations within specific subsystems. 
Some chology and connectionism are quite compatible, once the Connectionism, Philosophical Issues 187 former is properly construed; moreover, because our com- Bechtel, W., and A. Abrahamsen. (1993). Connectionism and the future of folk psychology. In S. Christensen and D. Turner, monsense notion of belief is not committed to any specific Eds., Folk Psychology and the Philosophy of Mind. Hillsdale, sort of cognitive architecture, it has nothing to fear from the NJ: Erlbaum, pp. 340–367. success of connectionism (Dennett 1991; Bechtel and Abra- Brooks, R. (1991). Intelligence without representation. Artificial hamsen 1993; see also FOLK PSYCHOLOGY). Intelligence 47: 139–159. Research in cognitive science has had an important influ- Chomsky, N. (1975). Reflections on Language. New York: Pan- ence on the traditional debate between nativists, who claim theon. that we are born with innate knowledge, and empiricists, Churchland, P. M. (1986). Some reductive strategies in cognitive who claim that knowledge is derived from experience (see neurobiology. Mind 95(379): 279–309. also NATIVISM and RATIONALISM VS. EMPRIRICISM). Nativ- Churchland, P. M. (1989). A Neurocomputational Perspective. ism has enjoyed popularity in cognitive science because it Cambridge, MA: MIT Press. Clark, A. (1991). Systematicity, structured representations and has proven difficult to explain how cognitive capacities are cognitive architecture: A reply to Fodor and Pylyshyn. In T. acquired without assuming some form of preexisting knowl- Horgan and J. Tienson, Eds., Connectionism and the Philoso- edge within the system. Yet one of the most striking features phy of Mind. Dordrecht: Kluwer, pp. 198–218. of connectionist networks is their ability to attain capacities Cummins, R. (1983). The Nature of Psychological Explanation. with very little help from antecedent knowledge. By relying Cambridge, MA: MIT Press. on environmental stimuli and powerful learning algorithms, Dennett, D. (1991). 
Two contrasts: Folk craft versus folk science, networks often appear to program themselves. This has led and belief vs. opinion. In J. Greenwood, Ed., The Future of Folk many to claim that connectionism offers a powerful new Psychology. New York: Cambridge University Press, pp. 135– approach to learning—one that will resurrect empiricist 148. accounts of the mind. Dretske, F. (1988). Explaining Behavior. Cambridge, MA: MIT Press. A mainspring of nativism in cognitive science has been Elman, J., E. Bates, M. Johnson, A. Karmiloff-Smith, D. Parisi, Chomsky’s POVERTY OF THE STIMULUS ARGUMENT for the and K. Plunkett. (1996). Rethinking Innateness: A Connec- INNATENESS OF LANGUAGE (1975). Chomsky has argued tionist Perspective on Development. Cambridge, MA: MIT that LANGUAGE ACQUISITION is impossible without a rich Press. store of innate linguistic knowledge. Although several CON- Fodor, J. (1981). The present status of the innateness controversy. NECTIONIST APPROACHES TO LANGUAGE have been devel- In Representations. Cambridge, MA: MIT Press, pp. 257–316. oped to demonstrate how areas of linguistic competence— Fodor, J., and Z. Pylyshyn. (1988). Connectionism and cognitive such as knowing regular and irregular past tense forms of architecture: A critical analysis. Cognition 28: 3–71. verbs—can be obtained without preexisting linguistic rules Horgan, T., and J. Tienson. (1996). Connectionism and the Philos- (Rumelhart, McClelland, and PDP Research Group 1986; ophy of Psychology. Cambridge, MA: MIT Press. McClelland, J., D. Rumelhart, and PDP Research Group. (1986). Elman et al. 1996), the success of these models in establish- Parallel Distributed Processing: Explorations in the Micro- ing a nonnativist theory of linguistic competence has been structure of Cognition. Vol. 2, Psychological and Biological heavily debated (Pinker and Prince 1988). One critical issue Models. Cambridge, MA: MIT Press. concerns the degree of DOMAIN SPECIFICITY employed in Muntakata, Y., J. L. 
McClelland, M. H. Johnson, and R. S. Siegler. the learning strategies and initial configuration of the net- (1997). Rethinking infant knowledge: Toward an adaptive pro- works (Ramsey and Stich 1990). cess account of success and failure in object permanence tasks. A second motivation for nativism stems from the “classi- Psychological Review 104(4): 686–713. cal” account of concept acquisition, which assumes that Pinker, S., and A. Prince. (1988). On language and connectionism: learning occurs when new complex concepts are constructed Analysis of a parallel distributed processing model of language from more primitive concepts (Fodor 1981), and which sug- acquisition. Cognition 28: 73–193. Ramsey, W. (1997). Do connectionist representations earn their gests there must first exist a prior store of basic concepts explanatory keep? Mind and Language 12(1): 34–66. that, by hypothesis, are unlearned. However, connectionism Ramsey, W., and S. Stich (1990). Connectionism and three levels appears to offer a different model of concept acquisition. of nativism. Synthèse 82: 177–205. Networks seem to develop new classifications and abstrac- Ramsey, W., S. Stich, and J. Garon (1990). Connectionism, elimi- tions that emerge without the recombination of preexisting nativism and the future of folk psychology. Philosophical Per- representations. In other words, there is reason to think con- spectives 4: 499–533. nectionist learning gives rise to new primitive concepts that Rumelhart, D., J. McClelland, and PDP Research Group. (1986). are developed entirely in response to the system’s training Parallel Distributed Processing: Explorations in the Micro- input (Munakata et al. 1997). To many, this captures the structure of Cognition. Vol. 1, Foundations. Cambridge, MA: essence of empiricist learning and signals a new direction for MIT Press. Smolensky, P. (1988). On the proper treatment of connectionism. 
understanding CONCEPTUAL CHANGE (Churchland 1989; Behavioral and Brain Sciences 11(1): 1–74. Elman et al. 1996). Smolensky, P. (1991). The constituent structure of mental states: —William Ramsey A reply to Fodor and Pylyshyn. In T. Horgan and J. Tienson, Eds., Connectionism and the Philosophy of Mind. Dordrecht: Kluwer, pp. 281–308. References van Gelder, T. (1991). Connectionism and dynamical explanation. Bechtel, W. (1989). Connectionism and intentionality. In Proceed- In Proceedings of the Thirteenth Annual Conference of the ings of the Eleventh Annual Meetings of the Cognitive Science Cognitive Science Society. Hillsdale, NJ: Erlbaum, pp. 499– Society. Hillsdale, NJ: Erlbaum, pp. 553–600. 503. 188 Connectionist Approaches to Language phonological, syntactic, and semantic properties of nested Further Readings phrases or their designations). In contrast, connectionist Aizawa, K. (1994). Representation without rules, connectionism computation is based in the continuous mathematics of NEU- and the syntactic argument. Synthèse 101: 465–492. RAL NETWORKS: the theory of numerical vectors and tensors Bechtel, W. (1991). Connectionism and the philosophy of mind: (e.g., of activation values), matrices (e.g., of connection An overview. In T. Horgan and J. Tienson, Eds., Connectionism weights), differential equations (e.g., for the dynamics of and the Philosophy of Mind. Dordrecht: Kluwer, pp. 30–59. spreading activation or learning), probability and statistics Bechtel, W., and A. Abrahamsen. (1991). Connectionism and the (e.g., for analysis of inductive and statistical inference). Mind: An Introduction to Parallel Processing in Networks. How can linguistic phenomena traditionally analyzed with Oxford: Blackwell. discrete symbolic computation be analyzed with continuous Bechtel, W., and R. Richardson. (1993). Discovering Complexity. Princeton, NJ: Princeton University. connectionist computation? Two quite different strategies Chalmers, D. (1993). 
Connectionism and compositionality: Why have been pursued for facing this challenge. Fodor and Pylyshyn were wrong. Philosophical Psychology 6: The dominant, model-centered strategy proceeds as fol- 305–319. lows (see COGNITIVE MODELING, CONNECTIONIST; see also Churchland, P. S., and T. Sejnowski. (1992). The Computational STATISTICAL TECHNIQUES IN NATURAL LANGUAGE PROCESS- Brain. Cambridge, MA: MIT Press. ING): specific data illustrating some interesting linguistic Clark, A. (1989). Microcognition. Cambridge MA: MIT Press. phenomena are identified; certain general connectionist prin- Clark, A. (1993). Associative Engines: Connectionism, Concepts ciples are hypothesized to account for these data; a concrete and Representational Change. Cambridge MA: MIT Press. instantiation of these principles in a particular connectionist Forster, M. R., and E. Saidel. (1994). Connectionism and the fate network—the model—is selected; computer simulation is of folk psychology: A reply to Ramsey, Stich and Garon. Philo- sophical Psychology 7: 437–452. used to test the adequacy of the model in accounting for the Garson, J. (1994). Cognition without classical architecture. Syn- data; and, if the network employs learning, the network con- thèse 100: 291–306. figuration resulting from learning is analyzed to discern the Hanson, S., and D. Burr. (1990). What connectionist models learn: nature of the account that has been learned. Learning and representation in connectionist networks. Behav- For instance, a historically pivotal model (Rumelhart and ioral and Brain Sciences 13: 471–518. McClelland 1986) addressed the data on children’s overgen- Haugeland, J. (1978). The nature and plausibility of cognitivism. eralization of the regular past tense inflection of irregular Behavioral and Brain Sciences 2: 215–260. verbs; connectionist induction from the statistical prepon- Horgan, T., and J. Tienson, Eds. (1991). 
Connectionism and the derance of regular inflection was hypothesized to account Philosophy of Mind. Dordrecht: Kluwer. for these data; a network incorporating a particular repre- Lloyd, D. (1989). Simple Minds. Cambridge, MA: MIT Press. Nadel, L., L. Cooper, P. Culicover, and R. M. Harnish, Eds. (1989). sentation of phonological strings and a simple learning rule Neural Connections, Mental Computation. Cambridge, MA: was proposed; simulations of this model documented con- MIT Press. siderable but not complete success at learning to inflect Macdonald, C., and G. Macdonald, Eds. (1995). Connectionism: irregular, regular, and novel stems; and limited post hoc Debates on Psychological Explanation. Oxford: Blackwell. analysis was performed of the structure acquired by the net- McLaughlin, B. (1993). The connectionism/classicism battle to work which was responsible for its performance. win souls. Philosophical Studies 71: 163–190. The second, principle-centered, strategy approaches lan- McLaughlin, B., and T. Warfield. (1994). The allure of connection- guage by directly deploying general connectionist princi- ism re-examined. Synthèse 101: 365–400. ples, without the intervention of a particular network model. Port, F., and T. van Gelder, Eds. (1995). Mind as Motion: Explora- Selected connectionist principles are used to directly derive tions in the Dynamics of Cognition. Cambridge, MA: MIT Press. a novel and general linguistic formalism, and this formalism Ramsey, W., S. Stich, and D. Rumelhart, Eds. (1991). Philosophy is then used directly for the analysis of particular linguistic and Connectionist Theory. Hillsdale, NJ: Erlbaum. phenomena. An example is the “harmonic grammar” for- Tomberlin, J., Ed. (1995). Philosophical Perspectives. Vol. 9, AI, malism (Legendre, Miyata, and Smolensky 1990), in which Connectionism and the Philosophy of Mind. Atascadero, CA: a grammar is a set of violable or “soft” constraints on the Ridgeview. 
Connectionist Approaches to Language

In research on theoretical and COMPUTATIONAL LINGUISTICS and NATURAL LANGUAGE PROCESSING, the dominant formal approaches to language have traditionally been theories of RULES AND REPRESENTATIONS. These theories assume an underlying symbolic COGNITIVE ARCHITECTURE based in discrete mathematics, the theory of algorithms for manipulating symbolic data structures such as strings (e.g., of phonemes; see PHONOLOGY), trees (e.g., of nested syntactic phrases; see SYNTAX), graphs (e.g., of conceptual structures deployed in SEMANTICS), and feature structures (e.g., of

well-formedness of linguistic structures, each with a numerical strength: the grammatical structures are those that simultaneously best satisfy the constraints. As discussed below, this formalism is a consequence of general mathematical principles that can be shown to govern the abstract, high-level properties of the representation and processing of information in certain classes of connectionist systems.

These two connectionist approaches to language are complementary. Although the principle-centered approach is independent from many of the details needed to define a concrete connectionist model, it can exploit only relatively basic connectionist principles. With the exception of the simplest cases, the general emergent cognitive properties of the dynamics of a large number of interacting low-level connectionist variables are not yet characterizable by mathematical analysis—detailed computer simulation of concrete networks is required.

We now consider several connectionist computational principles and their potential linguistic implications. These principles divide into those pertaining to the learning, the processing, and the representational components of connectionist theory.

Connectionist Inductive Learning Principles

These provide one class of solution to the problem of how the large numbers of interactions among independent connectionist units can be orchestrated so that their emergent effect is the computation of an interesting linguistic function. Such functions include those relating a verb stem with its past tense (MORPHOLOGY); orthographic with phonological representations of a word (READING and VISUAL WORD RECOGNITION); and a string of words with a representation of its meaning.

Many learning principles have been used to investigate what types of linguistic structure can be induced from examples. SUPERVISED LEARNING techniques learn to compute a given input or output function by adapting the weights of the network during experience with training examples so as to minimize a measure of the overall output error (e.g., each training example might be a pair consisting of a verb stem and its past tense form). UNSUPERVISED LEARNING methods extract regularities from training data without explicit information about the regularities to be learned, for example, a network trained to predict the next letter in an unsegmented stream of text extracts aspects of the distributional structure arising from the repetition of a fixed set of words, enabling the trained network to segment the stream (Elman 1990).

A trained network capable of computing, to some degree, a linguistically relevant function has acquired a certain degree of internal structure, manifest in the behavior of the learned network (e.g., its pattern of generalization to novel inputs), or more directly discernible under analysis of the learned connection weights. The final network structure is jointly the product of the linguistic structure of the training examples and the a priori structure explicitly and implicitly provided the model via the selection of architectural parameters. Linguistically relevant a priori structure includes what is implicit in the representation of inputs and outputs, the pattern of connectivity of the network, and the performance measure that is optimized during learning.

Trained networks have acquired many types of linguistically relevant structure, including nonmonotonic or “U-shaped” development (Rumelhart and McClelland 1986); categorical perception; developmental spurts (Elman et al. 1996); functional modularity (behavioral dissociations in intact or internally damaged networks; Plaut and Shallice 1994); localization of different functions to different spatial portions of the network (Jacobs, Jordan, and Barto 1991); finite-state, machinelike structure corresponding to a learned grammar (Touretzky 1991). Before a consensus can be reached on the implications of learned structure for POVERTY OF THE STIMULUS ARGUMENTS and the INNATENESS OF LANGUAGE, researchers will have to demonstrate incontrovertibly that models lacking grammatical knowledge in their a priori structure can acquire such knowledge (Elman et al. 1996; Pinker and Mehler 1988; Seidenberg 1997). In addition to this model-based research, recent formal work in COMPUTATIONAL LEARNING THEORY based in mathematical statistics has made considerable progress in the area of inductive learning, including connectionist methods, formally relating the justifiability of induction to general a priori limits on the learner’s hypothesis space, and quantitatively relating the number of adjustable parameters in a network architecture to the number of training examples needed for good generalization (with high probability) to novel examples (Smolensky, Mozer, and Rumelhart 1996).

Connectionist Processing Principles

The potential linguistic implications of connectionist principles go well beyond learning and the RATIONALISM VS. EMPIRICISM debate. The processing component of connectionist theory includes several relevant principles. For example, in place of serial stages of processing, a connectionist principle that might be dubbed “parallel modularity” hypothesizes that informationally distinct modules (e.g., phonological, orthographic, syntactic, and semantic knowledge) are separate subnetworks operating in parallel with each other, under continuous exchange of information through interface subnetworks (e.g., Plaut and Shallice 1994).

Another processing principle concerns the transformations of activity patterns from one layer of units to the next: In the processing of an input, the influence exerted by a previously stored item is proportional to both the frequency of presentation of the stored item and its “similarity” to the input, where “similarity” of activity patterns is measured by a training-set–dependent metric (see PATTERN RECOGNITION AND FEEDFORWARD NETWORKS). While such frequency- and similarity-sensitive processing is readily termed associative, it must be recognized that “similarity” is defined relative to the internal activation pattern encoding of the entire set of items. This encoding may itself be sensitive to the contextual or structural role of an item (Smolensky 1990); it may be sensitive to certain complex combinations of features of its content, and insensitive altogether to other content features. For example, a representation may encode the syntactic role and category of a word as well as its phonological and semantic content, and the relevant “similarity” metric may be strongly sensitive to the syntactic information, while being completely insensitive to the phonological and semantic information.

A class of RECURRENT NETWORKS with feedback connections is subject to the following principle: The network’s activation state space contains a finite set of attractor states, each surrounded by a “basin of attraction”; any input pattern lying in a given basin will eventually produce the corresponding attractor state as its output (see DYNAMIC APPROACHES TO COGNITION). This principle relates a continuous space of possible input patterns and a continuous processing mechanism to a discrete set of outputs, providing the basis for many connectionist accounts of categorical perception, categorical retrieval of lexical items from memory, and categorization processes generally. For example, the pronunciation model of Plaut et al. (1996) acquires a combinatorially structured set of output attractors encoding phonological strings including monosyllabic English words, and an input encoding a letter string yields an output activation pattern that is an attractor for a corresponding pronunciation.

A related principle governing processing in a class of recurrent networks characterizes the output of the network as an optimal activation pattern: among those patterns containing the given input pattern, the output is the pattern that maximizes a numerical well-formedness measure, harmony, or that minimizes “energy” (see also CONSTRAINT SATISFACTION). This principle has been used in combination with the following one to derive a general grammar formalism, harmonic grammar, described above as an illustration of principle-centered research. Harmonic grammar is a precursor to OPTIMALITY THEORY (Prince and Smolensky 1993), which adds further strong restrictions on what constitutes a possible human grammar. These include the universality of grammatical constraints, and the requirement that the strengths of the constraints be such as to entail “strict domination”: the cost of violating one constraint can never be exceeded by any amount of violation of weaker constraints.

Connectionist Representational Principles

Research on the representational component of connectionist theory has focused on statistically based analyses of internal representations learned by networks trained on linguistic data, and on techniques for representing, in numerical activation patterns, information structured by linear precedence, attribute/value, and dominance relations (e.g., Smolensky 1990; see BINDING PROBLEM). While this research shows how complex linguistic representations may be realized, processed, and learned in connectionist networks, contributions to the theory of linguistic representation remain largely a future prospect.

See also COGNITIVE MODELING, CONNECTIONIST; DISTRIBUTED VS. LOCAL REPRESENTATION; NATIVISM

—Paul Smolensky

References

Elman, J. L. (1990). Finding structure in time. Cognitive Science 14: 179–211.
Elman, J., E. Bates, M. H. Johnson, A. Karmiloff-Smith, D. Parisi, and K. Plunkett. (1996). Rethinking Innateness: A Connectionist Perspective on Development. Cambridge, MA: MIT Press.
Jacobs, R. A., M. I. Jordan, and A. G. Barto. (1991). Task decomposition through competition in a modular connectionist architecture: The what and where vision tasks. Cognitive Science 15: 219–250.
Legendre, G., Y. Miyata, and P. Smolensky. (1990). Harmonic grammar: A formal multi-level connectionist theory of linguistic well-formedness: Theoretical foundations. In Proceedings of the Twelfth Annual Conference of the Cognitive Science Society. Cambridge, MA, pp. 388–395.
Pinker, S., and J. Mehler. (1988). Connections and Symbols. Cambridge, MA: MIT Press.
Plaut, D., and T. Shallice. (1994). Connectionist Modelling in Cognitive Neuropsychology: A Case Study. Hillsdale, NJ: Erlbaum.
Plaut, D. C., J. L. McClelland, M. S. Seidenberg, and K. Patterson. (1996). Understanding normal and impaired word reading: Computational principles in quasi-regular domains. Psychological Review 103: 56–115.
Prince, A., and P. Smolensky. (1993). Optimality Theory: Constraint Interaction in Generative Grammar. RuCCS Technical Report 2, Rutgers Center for Cognitive Science, Rutgers University, Piscataway, NJ, and Department of Computer Science, University of Colorado at Boulder.
Rumelhart, D., and J. L. McClelland. (1986). On learning the past tenses of English verbs. In J. L. McClelland, D. E. Rumelhart, and the PDP Research Group, Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Vol. 2, Psychological and Biological Models. Cambridge, MA: MIT Press, pp. 216–271.
Seidenberg, M. (1997). Language acquisition and use: Learning and applying probabilistic constraints. Science 275: 1599–1603.
Smolensky, P. (1990). Tensor product variable binding and the representation of symbolic structures in connectionist networks. Artificial Intelligence 46: 159–216.
Smolensky, P., M. C. Mozer, and D. E. Rumelhart, Eds. (1996). Mathematical Perspectives on Neural Networks. Mahwah, NJ: Erlbaum.
Touretzky, D. S., Ed. (1991). Machine Learning 7(2/3). Special issue on connectionist approaches to language learning.

Further Readings

Elman, J. L. (1993). Learning and development in neural networks: The importance of starting small. Cognition 48: 71–99.
Goldsmith, J., Ed. (1993). The Last Phonological Rule: Reflections on Constraints and Derivations. Chicago: University of Chicago Press.
Hare, M., and J. L. Elman. (1994). Learning and morphological change. Cognition 49.
Hinton, G. E. (1991). Connectionist Symbol Processing. Cambridge, MA: MIT Press.
Miikkulainen, R. (1993). Subsymbolic Natural Language Processing: An Integrated Model of Scripts, Lexicon, and Memory. Cambridge, MA: MIT Press.
Plunkett, K., and V. Marchman. (1993). From rote learning to system building: Acquiring verb morphology in children and connectionist nets. Cognition 48: 21–69.
Sharkey, N., Ed. (1992). Connectionist Natural Language Processing. Dordrecht: Kluwer.
Wheeler, D. W., and D. S. Touretzky. (1993). A connectionist implementation of cognitive phonology. In J. Goldsmith, Ed., The Last Phonological Rule: Reflections on Constraints and Derivations. Chicago: University of Chicago Press, pp. 146–172.

Consciousness

Conscious mental states include sensations, such as the pleasure of relaxing in a hot bath or the discomfort of a hangover, perceptual experiences, such as the visual experience of a computer screen about half a meter in front of me, and occurrent thoughts, such as the sudden thought about how a problem can be solved. Consciousness is thus a pervasive feature of our mental lives, but it is also a perplexing one. This perplexity—the sense that there is something mysterious about consciousness despite our familiarity with sensation, perception, and thought—arises principally from the question of how consciousness can be the product of physical processes in our brains.

Ullin Place (1956) introduced a precursor of central state materialism for conscious states such as sensations. But the idea that types of conscious experience are to be identified with types of brain processes raises an important question, which can be made vivid by using Thomas Nagel’s (1974) idea of WHAT-IT’S-LIKE to be in a certain state—and, more generally, the idea of there being something that it is like to be a certain creature or system. The question is, why should there be something that it is like for certain processes to be occurring in our brains? Nagel’s famous example of what it is like to be a bat illustrates that our grasp of facts about the subjective character of experiences depends very much on our particular perceptual systems. Our grasp on physical or neurophysiological theories, in contrast, is not so dependent. Thus it may appear that subjective facts are not to be identified with the facts that are spelled out in those scientific theories. This Nagelian argument about the elusiveness of QUALIA is importantly similar to Frank Jackson’s (1982, 1986) “knowledge argument” and similar responses have been offered to both (Churchland 1985, 1988; and for a reply, Braddon-Mitchell and Jackson 1996).

Ned Block’s (1978) “absent qualia argument” is different from the arguments of Nagel and Jackson because it is specifically directed against FUNCTIONALISM: the idea that mental states are individuated by the causal roles they play in the total mental economy, rather than by the particular neurophysiological ways these roles are realized. The problem for functionalism is that we can imagine a system (e.g., Block’s homunculi-headed system) in which there is nothing that it is like to be that system, even though there are, within the system, devices that play the various functional roles associated with sensations, perceptions, and thoughts. This argument is not intended for use against a physicalist who (in the style of Place and subsequent central state materialists) simply identifies conscious mental states with brain processes (pain with C-fibers firing, for example). The examples used in the absent qualia argument may, however, be used to support the claim that it is even logically possible there could be a physical duplicate of a normal human being that nevertheless lacked qualia (a “zombie”; Chalmers 1996).

It is a disputed question whether arguments like Nagel’s can establish an ontological conclusion that consciousness involves something nonphysical (see MIND-BODY PROBLEM). But even if they cannot, there still appears to be a problem about consciousness; namely, it is a mystery why there should be something that it is like to undergo certain physical processes. This is what Joseph Levine (1983) has called the EXPLANATORY GAP. Jackson and Block both join Nagel in seeing a puzzle at this point, and Colin McGinn (1989) has argued that understanding how physical processes give rise to consciousness is cognitively beyond us (for a critical appraisal of McGinn’s argument, see Flanagan 1992).

One possible strategy for demystifying the notion of consciousness is to claim that consciousness is a matter of thought about mental states. This is the “higher-order thought theory of consciousness” favored by David Rosenthal (1986). In this theory, consciousness, considered as a property of mental states, is analyzed in terms of consciousness of mental states, while consciousness of something is analyzed in terms of having a thought about that thing. Thus for a mental state to be a conscious mental state is for the subject of that state to have a thought about it. If the higher-order thought theory were to be correct, then the occurrence of consciousness in the physical world would not be any more mysterious than the occurrence of mental states, which are not in themselves conscious states, or the occurrence of thoughts about mental states.

However, there are some quite serious problems for the higher-order thought theory. One is that the theory seems to face a kind of dilemma. If the notion of thought employed is a demanding one, then there could be something that it is like for a creature to be in certain states even though the creature did not have (perhaps, even, could not have) any thoughts about those states. In that case, higher-order thought is not necessary for consciousness. But if the notion of thought that is employed is a thin and undemanding one, then higher-order thought is not sufficient for consciousness. Suppose, for example, that thought is said to require no more than having discriminative capacities. Then it seems clear that a creature, or other system, could be in a certain type of mental state, and could have a capacity to detect whether it was in a state of that type, even though there was nothing that it was like to be that creature or system.

More generally, work toward the demystification of consciousness has a negative and a positive aspect. The negative aspect consists in seeking to reveal unclarities and paradoxes in the notion of the subjective character of experience (e.g., Dennett 1988, 1991). The positive aspect consists in offering putative explanations of one or another property of conscious experience in neural terms. Paul Churchland (1988, 148) clearly illustrates how to explain certain structural features of our experiences of color (for example, that an experience of orange is more like an experience of red than it is like an experience of blue). The explanation appeals to the system of neural coding for colors that involves triples of activation values corresponding to the illumination reaching three families of cones, and to structural properties of the three-dimensional space in which they are plotted (see COLOR VISION). But while this is a satisfying explanation of those structural features of color experiences, it seems to leave us without any account of why it is like anything at all to see red. Why there are any experiential correlates of the neural codes is left as a brute unexplained fact. The demystifier of consciousness may then reply that this appearance of residual mystery is illusory, and that it is a product either of fallacies and confusions that surround the notion of the subjective character of experience or else of an illegitimately high standard imposed on explanation.

The notion of consciousness associated with the idea of the subjective character of experience, and which generates the “hard problem” of consciousness (Chalmers 1996), is sometimes called “phenomenal consciousness.” There are several other notions for which the term consciousness is sometimes used (Allport 1988), including being awake, voluntary action, ATTENTION, monitoring of internal states, reportability, INTROSPECTION, and SELF-KNOWLEDGE. The distinctions among these notions are important, especially for the assessment of cognitive psychological and neuroscientific theories of consciousness (see CONSCIOUSNESS, NEUROBIOLOGY OF).

One particularly useful contrast is between phenomenal consciousness and “access consciousness” (Block 1995, 231): “A state is access-conscious if, in virtue of one’s having the state, a representation of its content is (1) inferentially promiscuous, that is, poised to be used as a premise in reasoning, (2) poised for rational control of action, and (3) poised for rational control of speech. . . . [Access consciousness is] a cluster concept, in which (3)—roughly, reportability—is the element of the cluster with the smallest weight, though (3) is often the best practical guide to [access consciousness].” The two notions appear to be independent in the sense that it is possible to have phenomenal (P) consciousness without access (A) consciousness, and vice versa. An example of P-consciousness without A-consciousness would be a situation in which there is an audible noise to which we pay no attention because we are engrossed in conversation. As an example of A-consciousness without P-consciousness, Block (1995, 233) suggests an imaginary phenomenon of “superblindsight.” In ordinary cases of BLINDSIGHT, patients are able to guess correctly whether there is, for example, an O or an X in the blind region of their visual field, even though they are unable to see either an O or an X there. The state that represents an O or an X is neither a P-conscious nor an A-conscious state. In superblindsight, there is still no P-consciousness, but now the patient is imagined to be able to make free use in reasoning of the information that there is an O, or that there is an X.

While the notion of phenomenal consciousness applies most naturally to sensations and perceptual experiences, the notion of access consciousness applies very clearly to thoughts. It is not obvious whether we should extend the notion of phenomenal consciousness to include thoughts as well as sensory experiences. But the idea of an important connection between consciousness and thought is an engaging one. Sometimes, for example, it seems hard to accept that there could be a fully satisfying reconstruction of thinking in the terms favored by the physical sciences. This intuition is similar to, and perhaps derives from, the intuition that consciousness somehow defies scientific explanation.

The question whether there is an important connection between consciousness and thought divides into two: Does consciousness require thought? Does thought require consciousness? The intuitive answer to the first question is that access consciousness evidently does require thought, but that phenomenal consciousness does not. (The appeal of this intuitive answer is the source of some objections to the higher-order thought theory of consciousness.) The answer to the second question as it concerns access consciousness is that there is scarcely any distance at all between the notion of thought and the notion of access consciousness. But when we focus on phenomenal consciousness, the answer to the second question is less clear.

John Searle (1990, 586) argues for the connection principle: “The ascription of an unconscious intentional phenomenon to a system implies that the phenomenon is in principle accessible to consciousness.” This is to say that, while we can allow for unconscious intentional states, such as unconscious thoughts, these have to be seen as secondary, and as standing in a close relation to conscious intentional states. Searle’s argument is naturally interpreted as being directed toward the conclusion that central cases of thinking are at least akin to phenomenally conscious states.

Even if one does not accept Searle’s argument for the connection principle, there is a plausible argument for a weaker version of his conclusion. The INTENTIONALITY of human thought involves modes of presentation of objects and properties (see SENSE AND REFERENCE); demonstrative modes of presentation afforded by perceptual experience of objects and their properties constitute particularly clear examples. For example, we think of an object as “that [perceptually presented] cat” or of a property as “that color.” Suppose now that it could be argued that some theoretical primacy attaches to these “perceptual demonstrative” modes of presentation (Perry 1979). It might be argued, for example, that in order to be able to think about objects at all, a subject needs to be able to think about objects under perceptual demonstrative modes of presentation. Such an argument would establish a deep connection between intentionality and consciousness.

Finally, there is another way phenomenal consciousness might enter the theory of thought. It might be because a thinker’s thoughts are phenomenally conscious states, that they also have the more dispositional properties (such as reportability) mentioned in the definition of access consciousness. This phenomenal consciousness property might also figure in the explanation of a thinker’s being able to engage in critical reasoning—evaluating and assessing reasons and reasoning as such (Burge 1996). It is far from clear, however, whether this idea can be worked out in a satisfactory way. Would the idea require a sensational phenomenology for thinking? If it does require that, then it might be natural to suggest that phenomenally conscious thoughts are clothed in the phonological or orthographic forms of natural language sentences (Carruthers 1996).

—Martin Davies

References

Allport, A. (1988). What concept of consciousness? In A. J. Marcel and E. Bisiach, Eds., Consciousness in Contemporary Science. Oxford: Oxford University Press, pp. 159–182.
Block, N. (1978). Troubles with functionalism. In C. Wade Savage, Ed., Minnesota Studies in the Philosophy of Science, vol. 9. Minneapolis: University of Minnesota Press, pp. 261–325.
Block, N. (1995). On a confusion about a function of consciousness. Behavioral and Brain Sciences 18: 227–287.
Braddon-Mitchell, D., and F. Jackson (1996). Philosophy of Mind and Cognition. Oxford: Blackwell.
Burge, T. (1996). Our entitlement to self-knowledge. Proceedings of the Aristotelian Society 96: 91–116.
Carruthers, P. (1996). Language, Thought and Consciousness. Cambridge: Cambridge University Press.
Chalmers, D. (1996). The Conscious Mind: In Search of a Fundamental Theory. New York: Oxford University Press.
Churchland, P. M. (1985). Reduction, qualia and the direct introspection of brain states. Journal of Philosophy 82: 8–28.
Churchland, P. M. (1988). Matter and Consciousness. Rev. ed. Cambridge, MA: MIT Press.
Dennett, D. C. (1988). Quining qualia. In A. J. Marcel and E. Bisiach, Eds., Consciousness in Contemporary Science. Oxford: Oxford University Press, pp. 42–77.
Dennett, D. C. (1991). Consciousness Explained. Boston: Little, Brown.
Flanagan, O. (1992). Consciousness Reconsidered. Cambridge, MA: MIT Press.
Jackson, F. (1982). Epiphenomenal qualia. Philosophical Quarterly 32: 127–136.
Jackson, F. (1986). What Mary didn’t know. Journal of Philosophy 83: 291–295.
Levine, J. (1983). Materialism and qualia: The explanatory gap. Pacific Philosophical Quarterly 64: 354–361.
McGinn, C. (1989). Can we solve the mind-body problem? Mind 98: 349–366.
Nagel, T. (1974). What is it like to be a bat? Philosophical Review 83: 435–450.
Perry, J. (1979). The problem of the essential indexical. Noûs 13: 3–21.
Place, U. T. (1956). Is consciousness a brain process? British Journal of Psychology 47: 44–50.
Rosenthal, D. M. (1986). Two concepts of consciousness. Philosophical Studies 94: 329–359.
Searle, J. R. (1990). Consciousness, explanatory inversion, and cognitive science. Behavioral and Brain Sciences 13: 585–596.

Further Readings

Block, N. (1998). How to find the neural correlate of consciousness. In A. O’Hear, Ed., Contemporary Issues in the Philosophy of Mind. Royal Institute of Philosophy Supplement 43. Cambridge: Cambridge University Press, pp. 23–34.
Block, N., O. Flanagan, and G. Güzeldere, Eds. (1997). The Nature of Consciousness: Philosophical Debates. Cambridge, MA: MIT Press.
Crick, F., and C. Koch. (1995). Are we aware of neural activity in primary visual cortex? Nature 375: 121–123.
Davies, M., and G. W. Humphreys, Eds. (1993). Consciousness: Psychological and Philosophical Essays. Oxford: Blackwell.
Dennett, D. C., and M. Kinsbourne. (1992). Time and the observer: The where and when of consciousness in the brain. Behavioral and Brain Sciences 15: 183–247.
Metzinger, T., Ed. (1995). Conscious Experience. Paderborn: Schöningh.
Nelkin, N. (1996). Consciousness and the Origins of Thought. Cambridge: Cambridge University Press.
Peacocke, C. (1998). Conscious attitudes, attention and self-knowledge. In C. Wright, B. C. Smith, and C. Macdonald, Eds., Knowing Our Own Minds. Oxford: Oxford University Press, pp. 63–98.
Rolls, E. T. (1997). Consciousness in neural networks? Neural Networks 10: 1227–1240.
Schacter, D. L. (1989). On the relation between memory and consciousness: Dissociable interactions and conscious experience. In H. Roediger and F. Craik, Eds., Varieties of Memory and Consciousness: Essays in Honor of Endel Tulving. Hillsdale, NJ: Erlbaum.
Shear, J., Ed. (1997). Explaining Consciousness: The Hard Problem. Cambridge, MA: MIT Press.
Tye, M. (1995). Ten Problems of Consciousness: A Representational Theory of the Phenomenal Mind. Cambridge, MA: MIT Press.

Consciousness, Neurobiology of

After a hiatus of fifty years or more, the physical origins of CONSCIOUSNESS are being once again vigorously debated, in hundreds of books and monographs published in the last decade. What sparse facts can we ascertain about the neurobiological basis of consciousness, and what can we reasonably assume at this point in time?

By and large, neuroscientists have made a number of working assumptions that need to be justified more fully. In particular,

1. There is something to be explained, that is, the subjective content associated with a conscious sensation (what philosophers refer to as QUALIA; see also WHAT-IT’S-LIKE) does exist and has its physical basis in the brain.

2. Consciousness is one of the principal properties of the human brain, a highly evolved system; it must therefore have a useful function to perform. Crick and Koch (1995) assume that the function of visual consciousness is to produce the best current interpretation of the visual scene—in the light of past experiences—and to make it available for a sufficient time to the parts of the brain that contemplate, plan, and execute voluntary motor outputs (including language). This needs to be contrasted with the on-line systems that bypass consciousness but can generate stereotyped behaviors (see below).

3. At least some animal species (i.e., non-human primates such as the macaque monkey) are assumed to possess some aspects of consciousness. Consciousness associated with sensory events is likely to be very similar in humans and monkeys for several reasons. First, trained monkeys behave as humans do under controlled conditions for most sensory tasks (e.g., visual motion discrimination; see MOTION, PERCEPTION OF; Wandell 1995). Second, the gross neuroanatomy of humans and nonhuman primates is the same, once the difference in size has been accounted for. Finally, MAGNETIC RESONANCE IMAGING in humans is confirming the existence of a functional organization very similar to that discovered by single-cell electrophysiology in the monkey (Tootell et al. 1996). As a corollary, it follows that language is not necessary for consciousness to occur (although it greatly enriches human consciousness). In the following, we will mainly concentrate on sensory consciousness, and, in particular, on visual consciousness, because it is experimentally the most accessible and the best understood.

Cognitive and clinical research demonstrates that much complex information processing can occur without involving consciousness, both in normals as well as in patients. Examples of this include BLINDSIGHT (Weiskrantz 1997), priming, and the implicit recognition of complex sequences (Velmans 1991; Berns, Cohen, and Mintun 1997). Milner and Goodale (1995) have made a masterful case for the existence of so-called on-line visual systems that bypass consciousness, and that serve to mediate relative stereotype visual-motor behaviors, such as eye and arm movements as well as posture adjustments, in a very rapid manner. On-line systems work in egocentric coordinate systems and lack both certain types of perceptual ILLUSIONS (e.g. size illusion) and direct access to WORKING MEMORY. Milner and Goodale (1995; see also Rossetti forthcoming) hypothesize that on-line systems are associated with the dorsal stream of visual information in the CEREBRAL CORTEX, originating in the primary VISUAL CORTEX (V1) and terminating in the posterior parietal cortex (see VISUAL PROCESSING STREAMS). This contrasts well with the function of consciousness alluded to above, namely, to synthesize information from many different sources and use it to plan behavioral patterns over time.

What is the neuronal correlate of consciousness? Most popular has been the belief that consciousness arises as an emergent property of a very large collection of interacting neurons (Popper and Eccles 1981; Libet 1995). An alternative hypothesis is that there are special sets of “consciousness” neurons distributed throughout cortex (and associated systems, such as the THALAMUS and the BASAL GANGLIA) that represent the ultimate neuronal correlate of consciousness (NCC), in the sense that activity of an appropriate subset of them is both necessary and sufficient to give rise to an appropriate conscious experience or percept (Crick and Koch 1995). NCC neurons would, most likely, be characterized by a unique combination of molecular, biophysical, pharmacological, and anatomical traits. It is also possible, of course, that all cortical neurons may be capable of participating in the representation of one percept or another, at one time or another, though not necessarily doing so for all percepts. The secret of consciousness would then consist of all cortical neurons representing that particular percept at that moment (see BINDING BY NEURAL SYNCHRONY).

Where could such NCC neurons be found? Based on clinical evidence that small lesions of the intralaminar nuclei of the thalamus (ILN) cause loss of consciousness and coma and that ILN neurons project widely and reciprocally into the cerebral cortex, ILN neurons have been proposed as the site where consciousness is generated (Bogen 1995; Purpura and Schiff 1997). It is more likely, however, that ILN neurons provide an enabling or arousal signal without which no significant cortical processing can occur. The great specificity associated with the content of our consciousness at any point in time can only be mediated by neurons in the cerebral cortex, its associated specific thalamic nuclei, and the basal ganglia. It is here, among the neurons whose very specific response properties have been extensively characterized by SINGLE-NEURON RECORDING, that we have to look for the NCC.

What, if anything, can we infer about the location of these neurons? In the case of visual consciousness, Crick and Koch (1995) surmised that these neurons must have access to visual information and project to the planning stages of the brain, that is, to premotor and frontal areas (Fuster 1997). Because in the macaque monkey, no neurons in primary visual cortex project to any area anterior to the central sulcus, Crick and Koch (1995) proposed that neurons in V1 do not directly give rise to consciousness (although V1 is necessary for most forms of vision, just as the retina is). Current electrophysiological, psychophysical, and imaging evidence (He, Cavanagh, and Intriligator 1996; Engel, Zhang, and Wandell 1997) supports the hypothesis that the NCC is not to be found among V1 neurons.

A promising experimental approach to locate the NCC has been the use of bistable percepts, that is, pairs of percepts, alternating in time, that arise from a constant visual stimulus as in a Necker cube (Crick and Koch 1992). In one such case, a small image, say of a horizontal grating, is presented to the left eye and another image, say of a vertical grating, is presented to the corresponding location in the right eye. In spite of the constant retinal stimulus, observers “see” the horizontal grating alternate every few seconds with the vertical one, a phenomenon known as “binocular rivalry” (Blake 1989). The brain does not allow for the simultaneous perception of both images.

It is possible, though difficult, to train a macaque monkey to report whether it is currently seeing the left or the right image. The distribution of the switching times and the way in which changing the contrast in one eye affects these times leaves little doubt that monkeys and humans experience the same basic phenomenon (Myerson, Miezin, and Allman 1981). In a series of elegant experiments, Logothetis and colleagues (Logothetis and Schall 1989; Leopold and Logothetis 1996; Sheinberg and Logothetis 1997) recorded from a variety of monkey cortical areas during this task. In early visual cortex, only a small fraction of cells modulated their response as a function of the percept of the monkey, while 20 to 30 percent of neurons in MT and V4 cells did. The majority of cells increased their firing rate in response to one or the other retinal stimulus with no regard to what the animal perceived at the time. In contrast, in a high-level cortical area, such as the inferior temporal cortex (IT), almost all neurons responded only to the perceptual dominant stimulus (in other words, a “face” cell only fired when the animal indicated by its performance that it saw the face and not the sunburst pattern in the other eye). This makes it likely that the NCC is located among—or beyond—IT neurons.

Finding the NCC would only be the first, albeit critical, step in understanding consciousness. We also need to know where these cells project to, their postsynaptic action, and what happens to them in various diseases known to affect consciousness, such as schizophrenia or AUTISM, and so on.

And, of course, a final theory of consciousness would have to explain the central mystery—why a physical system with a particular architecture gives rise to feelings and qualia. (Chalmers 1996).

See also ATTENTION; ATTENTION IN THE ANIMAL BRAIN; ATTENTION AND THE HUMAN BRAIN; SENSATIONS

—Christof Koch and Francis Crick

References

Berns, G. S., J. D. Cohen, and M. A. Mintun. (1997). Brain regions responsive to novelty in the absence of awareness. Science 276: 1272–1275.
Blake, R. (1989). A neural theory of binocular rivalry. Psychol. Rev. 96: 145–167.
Bogen, J. E. (1995). On the neurophysiology of consciousness: 1. An overview. Consciousness and Cognition 4: 52–62.
Chalmers, D. (1996). The Conscious Mind: In Search of a Fundamental Theory. Oxford: Oxford University Press.
Crick, F., and C. Koch. (1992). The problem of consciousness. Scientific American 267(3): 153–159.
Crick, F., and C. Koch. (1995). Are we aware of neural activity in primary visual cortex? Nature 375: 121–123.
Engel, S., X. Zhang, and B. Wandell (1997). Colour tuning in human visual cortex measured with functional magnetic resonance imaging. Nature 388: 68–71.

and to find an optimal solution relative to a given cost function. A well-known example of a constraint satisfaction problem is k-colorability, where the task is to color, if possi-

Fuster, J. M. (1997).
The Prefrontal Cortex: Anatomy, Physiology, ble, a given graph with k colors only, such that any two adja- and Neuropsychology of the Frontal Lobe. 3rd ed. Philadelphia: cent nodes have different colors. A constraint satisfaction Lippincott-Raven. formulation of this problem associates the nodes of the He, S., P. Cavanagh, and J. Intriligator. (1996). Attentional resolu- graph with variables, the possible colors are their domains, tion and the locus of visual awareness. Nature 383: 334–337. and the inequality constraints between adjacent nodes are Leopold, D. A., and N. K. Logothetis. (1996). Activity changes in the constraints of the problem. Each constraint of a CSP early visual cortex reflect monkeys’ percepts during binocular may be expressed as a relation, defined on some subset of rivalry. Nature 379: 549–553. variables, denoting legal combinations of their values. Con- Libet, B. (1995). Neurophysiology of Consciousness: Selected Papers and New Essays. Boston: Birkhäuser. straints can also be described by mathematical expressions Logothetis, N., and J. Schall. (1989). Neuronal correlates of sub- or by computable procedures. Another typical constraint jective visual perception. Science 245: 761–763. satisfaction problem is SATisfiability, the task of finding the Milner, D., and M. Goodale. (1995). The Visual Brain in Action. truth assignment to propositional variables such that a given Oxford: Oxford University Press. set of clauses is satisfied. For example, given the two Myerson, J., F. Miezin, and J. Allman. (1981). Binocular rivalry in clauses (A ∨ B ∨ ¬ C), (¬ A ∨ D), the assignment of false to macaque monkeys and humans: A comparative study in percep- A, true to B, false to C, and false to D, is a satisfying truth tion. Behav. Anal. Lett. 1: 149–156. value assignment. Popper, K. R., and J. C. Eccles. (1981). The Self and Its Brain. Ber- The structure of a constraint network is depicted by a lin: Springer. 
constraint graph whose nodes represent the variables and in Purpura, K. P., and N. D. Schiff (1997). The thalamic intralaminar nuclei: a role in visual awareness. Neuroscientist 3: 8–15. which any two nodes are connected if the corresponding Rossetti, Y. (Forthcoming). Implicit perception in action: Short- variables participate in the same constraint. In the k-col- lived motor representations of space evidenced by brain- orability formulation, the graph to be colored is the con- damaged and healthy subjects. In P. G. Grossenbacher, Ed., straint graph. In our SAT example the constraint graph has A Finding Consciousness in the Brain. Philadelphia: Benjamins. connected to D, and A, B, and C are connected to each other. Sheinberg, D. L., and N. K. Logothetis. (1997). The role of tempo- Constraint networks have proven successful in modeling ral cortical areas in perceptual organization. Proc. Natl. Acad. mundane cognitive tasks such as vision, language compre- Sci. U.S.A.196 94: 3408–3413. hension, default reasoning, and abduction, as well as in Tootell, R. B. H., A. M. Dale, M. I. Sereno, and R. Malach. (1996). applications such as scheduling, design, diagnosis, and tem- New images from human visual cortex. Trends Neurosci. 19: poral and spatial reasoning. In general, constraint satisfac- 481–489. Velmans, M. (1991). Is human information processing conscious? tion tasks are computationally intractable (“NP-hard”; see Behavioral Brain Sci. 14: 651–726. COMPUTATIONAL COMPLEXITY). Wandell, B. A. (1995). Foundations of Vision. Sunderland, MA: ALGORITHMS for processing constraints can be classified Sinauer. into two interacting categories: (1) search and (2) consis- Weiskrantz, L. (1997). Consciousness Lost and Found. Oxford: tency inference. Search algorithms traverse the space of par- Oxford University Press. tial instantiations, while consistency inference algorithms reason through equivalent problems. 
Search algorithms are Further Readings either systematic and complete or stochastic and incom- Crick, F., and C. Koch. (1998). Consciousness and neuroscience. plete. Likewise, consistency inference algorithms have Cerebral Cortex 8: 97–107. either complete solutions (e.g., variable-elimination algo- Jackendoff, R. (1987). Consciousness and the Computational rithms) or incomplete solutions (i.e., local consistency algo- Mind. Cambridge, MA: MIT Press. rithms). Zeki, S. (1993). Vision of the Brain. Oxford: Blackwell. Local consistency algorithms, also called “consistency- enforcing” or “constraint propagation” algorithms (Mon- Consensus Theory tanari 1974; Mackworth 1977; Freuder 1982), are polyno- mial algorithms that transform a given constraint network into an equivalent, yet more explicit network by deducing See CULTURAL CONSENSUS THEORY new constraints to be added onto the network. Intuitively, a consistency-enforcing algorithm will make any partial solu- tion of a small subnetwork extensible to some surrounding Constraint Satisfaction network. For example, the most basic consistency algorithm, called an “arc consistency” algorithm, ensures that any legal A constraint satisfaction problem (CSP) is defined over a value in the domain of a single variable has a legal match in constraint network, which consists of a finite set of vari- the domain of any other selected variable. A “path consis- ables, each associated with a domain of values, and a set of tency” algorithm ensures that any consistent solution to a constraints. A solution is an assignment of a value to each two-variable subnetwork is extensible to any third variable, variable from its domain such that all the constraints are sat- and, in general, i-consistency algorithms guarantee that any isfied. Typical constraint satisfaction problems are to deter- locally consistent instantiation of i – 1 variables is extensible mine whether a solution exists, to find one or all solutions, to any ith variable. 
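The formulation and the arc consistency idea described above can be made concrete with a small Python sketch. It models k-colorability as a constraint network, enforces arc consistency with a queue-based propagation loop in the spirit of Mackworth (1977), runs a plain depth-first systematic search, and checks the SAT example from the text. All function names, the example graph, and the clause encoding are hypothetical choices made for illustration, and the revise step is written only for inequality constraints; it is a sketch under those assumptions rather than a reference implementation.

```python
from collections import deque

def make_coloring_csp(edges, k):
    """Variables are nodes, domains are the k colors, and each edge
    carries a 'different color' constraint (hypothetical helper)."""
    nodes = sorted({n for edge in edges for n in edge})
    domains = {n: set(range(k)) for n in nodes}
    neighbors = {n: set() for n in nodes}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    return domains, neighbors

def revise(domains, x, y):
    """Drop values of x that have no legal match (a different color) in y."""
    unsupported = {vx for vx in domains[x]
                   if not any(vx != vy for vy in domains[y])}
    domains[x] -= unsupported
    return bool(unsupported)

def arc_consistency(domains, neighbors):
    """Propagate until every arc is consistent; False if a domain empties."""
    queue = deque((x, y) for x in domains for y in neighbors[x])
    while queue:
        x, y = queue.popleft()
        if revise(domains, x, y):
            if not domains[x]:
                return False
            # The domain of x shrank, so arcs pointing at x must be rechecked.
            queue.extend((z, x) for z in neighbors[x] if z != y)
    return True

def search(domains, neighbors, assignment=None):
    """Depth-first systematic search over partial instantiations."""
    assignment = {} if assignment is None else assignment
    if len(assignment) == len(domains):
        return assignment
    var = next(v for v in domains if v not in assignment)
    for value in sorted(domains[var]):
        if all(assignment.get(n) != value for n in neighbors[var]):
            result = search(domains, neighbors, {**assignment, var: value})
            if result is not None:
                return result
    return None  # dead end: no value of var fits the partial solution

def satisfies(clauses, assignment):
    """A clause is a list of (variable, required truth value) literals."""
    return all(any(assignment[v] == sign for v, sign in clause)
               for clause in clauses)

# Example graph (an assumption): a triangle a-b-c with a pendant node d.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
domains, neighbors = make_coloring_csp(edges, k=3)
domains["a"] = {0}                     # fix one color to watch propagation
arc_consistency(domains, neighbors)    # prunes color 0 from b and c
solution = search(domains, neighbors)

# The two clauses from the text: (A or B or not C), (not A or D).
clauses = [[("A", True), ("B", True), ("C", False)],
           [("A", False), ("D", True)]]
truth = {"A": False, "B": True, "C": False, "D": False}
```

With two colors the same triangle is arc consistent yet has no solution, which illustrates why local consistency is an incomplete inference method and is typically combined with search.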
Enforcing i-consistency is time and space exponential in i. Algorithms for i-consistency frequently decide inconsistency.

A network is globally consistent if it is i-consistent for every i, which means a solution can be assembled by assigning values using any variable ordering without encountering any dead end, namely, in a “backtrack-free” manner. However, it is enough to possess directional global consistency relative to a given ordering only. Indeed, an adaptive consistency (variable elimination) algorithm enforces global consistency in a given order only, such that every solution can be extracted, with no dead ends along this ordering. Another related algorithm, called a “tree-clustering” algorithm, compiles the given constraint problem into an equivalent tree of subproblems (Dechter and Pearl 1989) whose respective solutions can be efficiently combined into a complete solution. Adaptive consistency and tree-clustering algorithms are time and space exponential in a parameter of the constraint graph called an “induced-width” or “tree-width” (Arnborg and Proskourowski 1989; Dechter and Pearl 1987).

When a problem is computationally hard for the adaptive consistency algorithm, it can be solved by bounding the amount of consistency enforcing (e.g., arc or path consistency), and by augmenting the algorithm with a search component. Generally speaking, search will benefit from network representations that have a high level of consistency. However, because the complexity of enforcing i-consistency is exponential in i, there is a trade-off between the effort spent on consistency inference and that spent on search. Theoretical and empirical studies of this trade-off, prior to or during search, aim at identifying a problem-dependent cost-effective balance (Haralick and Elliot 1980; Prosser 1993; Sabin and Freuder 1994; Dechter and Rish 1994).

The most common algorithm for performing systematic search is the backtracking algorithm, which traverses the space of partial solutions in a depth-first manner. At each step, the algorithm extends a partial solution by assigning a value to one more variable. When a variable is encountered such that none of its values are consistent with the partial solution (a situation referred to as a “dead end”), backtracking takes place. The algorithm is time exponential, but requires only linear space.

Improvements of the backtracking algorithm have focused on the two phases of the algorithm: moving forward (look-ahead schemes) and backtracking (look-back schemes; Dechter 1990). When moving forward, to extend a partial solution, some computation (e.g., arc consistency) is carried out to decide which variable and value to choose next. For variable orderings, variables that maximally constrain the rest of the search space are preferred. For value selection, however, the least constraining value is preferred, in order to maximize future options for instantiations (Haralick and Elliot 1980; Dechter and Pearl 1987; Purdom 1983; Sabin and Freuder 1994).

Look-back schemes are invoked when the algorithm encounters a dead end. These schemes perform two functions: (1) they decide how far to backtrack, by analyzing the reasons for the dead end, a process often referred to as “backjumping” (Gaschnig 1979); (2) they record the reasons for the dead end in the form of new constraints so that the same conflicts will not arise again, a process known as “constraint learning” and “no-good recording” (Stallman and Sussman 1977; Dechter 1990).

Stochastic local search strategies have been recently reintroduced into the satisfiability and constraint satisfaction literature under the umbrella name GSAT (“greedy SATisfiability”; see GREEDY LOCAL SEARCH). These methods move in a hill-climbing manner in the space of complete instantiations to all the variables (Minton et al. 1990). The algorithm improves its current instantiation by “flipping” a value of a variable that maximizes the number of constraints satisfied. Such search algorithms are incomplete, may get stuck in a local maximum, and cannot prove inconsistency. Nevertheless, when equipped with some heuristics for randomizing the search (walksat) or for revising the guiding criterion function (constraint reweighting), they prove successful in solving large and hard problems that are frequently hard for backtracking search algorithms (Selman, Levesque, and Mitchell 1992).

Structure-driven algorithms cut across both search and consistency inference algorithms. These techniques emerged from an attempt to topologically characterize constraint problems that are tractable. Tractable classes were generally recognized by realizing that enforcing low-level consistency (in polynomial time) guarantees global consistency for some problems. The basic network structure that supports tractability is a tree (Mackworth and Freuder 1985). In particular, enforcing arc consistency on a tree-structured network ensures global consistency along some ordering. Most other graph-based techniques can be viewed as transforming a given network into a metatree. Adaptive consistency, tree clustering, and constraint learning are all time and space exponentially bounded by the tree width of the constraint graph; the cycle cutset scheme combines search and inference and is exponentially bounded by the constraint graph’s cycle-cutset; the biconnected component method is bounded by the size of the constraint graph’s largest biconnected component (Freuder 1982); and backjumping is exponentially bounded by the depth of the graph’s depth-first search tree. The last three methods require only polynomial space.

Tractable classes were also identified by the properties of the constraints themselves. Such tractable classes exploit notions such as tight domains and tight constraints (van Beek and Dechter 1997), row-convex constraints (van Beek and Dechter 1995), implicational and max-ordered constraints (Kirousis 1993; Jeavons, Cohen, and Gyssens 1997), as well as causal networks. A connection between tractability and algebraic closure was recently discovered (Cohen, Jeavons, and Gyssens 1995).

Finally, special classes of tractable constraints associated with TEMPORAL REASONING have received much attention in the last decade. These include subsets of qualitative interval algebra (Golumbic and Shamir 1993) expressing relationships such as “time interval A overlaps or precedes time interval B,” as well as quantitative binary linear inequalities over the real numbers of the form X – Y ≤ a (Dechter, Meiri, and Pearl 1990).

Theoretical evaluation of constraint satisfaction algorithms is accomplished primarily by worst-case analysis (i.e., determining a function of the problem’s size that sets the upper bound to the algorithm’s performance over all problems of that size), or by dominance relationships (Kondrak and van Beek 1997). However, because worst-case analysis by its nature is too pessimistic and often does not reflect actual performance, empirical evaluation is necessary. Normally, a proposed algorithm is evaluated empirically on a set of randomly generated instances taken from the relatively “hard” “phase transition” region (Selman, Levesque, and Mitchell 1992). Other benchmarks based on real-life applications such as scheduling are also used. Currently, dynamic variable ordering and value selection heuristics that use various forms of constraint inference, backjumping, and constraint learning have been shown to be very effective for various problem classes (Prosser 1993; Frost and Dechter 1994; Sabin and Freuder 1994).

See also HEURISTIC SEARCH

—Rina Dechter

References

Arnborg, S., and A. Proskourowski. (1989). Linear time algorithms for NP-hard problems restricted to partial k-trees. Discrete and Applied Mathematics 23: 11–24.
Dechter, R. (1990). Enhancement schemes for constraint processing: Backjumping, learning, and cutset decomposition. Artificial Intelligence 41: 273–312.
Dechter, R., and J. Pearl. (1987). Network-based heuristics for constraint satisfaction problems. Artificial Intelligence 34: 1–38.
Dechter, R., and J. Pearl. (1989). Tree clustering for constraint networks. Artificial Intelligence 38(3): 353–366.
Dechter, R., I. Meiri, and J. Pearl. (1990). Temporal constraint networks. Artificial Intelligence 49: 61–95.
Dechter, R., and I. Rish. (1994). Directional resolution: The Davis-Putnam procedure, revisited. In Principles of Knowledge Representation and Reasoning, pp. 134–145.
Freuder, E. C. (1982). A sufficient condition for backtrack-free search. Journal of the ACM 29(1): 24–32.
Frost, D., and R. Dechter. (1994). In search of best search: An empirical evaluation. In AAAI-94: Proceedings of the Twelfth National Conference on Artificial Intelligence. Seattle, WA, pp. 301–306.
Gaschnig, J. (1979). Performance Measurement and Analysis of Search Algorithms. Pittsburgh: Carnegie Mellon University.
Golumbic, M. C., and R. Shamir. (1993). Complexity and algorithms for reasoning about time: A graph-theoretic approach. Journal of the ACM 40: 1108–1133.
Haralick, M., and G. L. Elliot. (1980). Increasing tree-search efficiency for constraint satisfaction problems. Artificial Intelligence 14: 263–313.
Jeavons, P., D. Cohen, and M. Gyssens. (1997). Closure properties of constraints. Journal of the ACM 44: 527–548.
Kirousis, L. M. (1993). Fast parallel constraint satisfaction. Artificial Intelligence 64: 147–160.
Kondrak, G., and P. van Beek. (1997). A theoretical valuation of selected algorithms. Artificial Intelligence 89: 365–387.
Mackworth, A. K. (1977). Consistency in networks of relations. Artificial Intelligence 8(1): 99–118.
Mackworth, A. K., and E. C. Freuder. (1985). The complexity of some polynomial network consistency algorithms for constraint satisfaction problems. Artificial Intelligence 25.
Minton, S., M. D. Johnson, A. B. Phillips, and P. Laird. (1990). Solving large-scale constraint satisfaction and scheduling problems using heuristic repair methods. In National Conference on Artificial Intelligence, Anaheim, CA, pp. 17–24.
Montanari, U. (1974). Networks of constraints: fundamental properties and applications to picture processing. Information Sciences 7(66): 95–132.
Prosser, P. (1993). Hybrid algorithms for constraint satisfaction problems. Computational Intelligence 9(3): 268–299.
Purdom, P. W. (1983). Search rearrangement backtracking and polynomial average time. Artificial Intelligence 21: 117–133.
Sabin, D., and E. C. Freuder. (1994). Contradicting conventional wisdom in constraint satisfaction. In ECAI-94, Amsterdam, pp. 125–129.
Selman, B., H. Levesque, and D. Mitchell. (1992). A new method for solving hard satisfiability problems. In Proceedings of the Tenth National Conference on Artificial Intelligence, 339–347.
Stallman, M., and G. J. Sussman. (1977). Forward reasoning and dependency-directed backtracking in a system for computer-aided circuit analysis. Artificial Intelligence 2: 135–196.
van Beek, P., and R. Dechter. (1995). On the minimality and decomposability of row-convex constraint networks. Journal of the ACM 42: 543–561.
van Beek, P., and R. Dechter. (1997). Constraint tightness and looseness versus local and global consistency. Journal of the ACM 44(4): 549–566.

Further Readings

Baker, A. B. (1995). Intelligent Backtracking on Constraint Satisfaction Problems: Experimental and Theoretical Results. Ph.D. diss., University of Oregon.
Bayardo, R., and D. Mirankar. (1996). A complexity analysis of space-bound learning algorithms for the constraint satisfaction problem. AAAI-96: Proceedings of the Thirteenth National Conference on Artificial Intelligence, pp. 298–304.
Bistarelli, S., U. Montanari, and F. Rossi. (Forthcoming). Semiring-based constraint satisfaction and optimization. Journal of the Association of Computing Machinery.
Cohen, D. A., M. C. Cooper, and P. G. Jeavons. (1994). Characterizing tractable constraints. Artificial Intelligence 65: 347–361.
Dechter, R., and D. Frost. (1997). Backtracking algorithms for constraint satisfaction problems: A survey. UCI Tech Report.
Dechter, R. (1992). Constraint networks. In Encyclopedia of Artificial Intelligence. 2nd ed. New York: Wiley, pp. 276–285.
Dechter, R., and I. Meiri. (1994). Experimental evaluation of preprocessing algorithms for constraint satisfaction problems. Artificial Intelligence 68: 211–241.
Dechter, R., and P. van Beek. (1997). Local and global relational consistency. Theoretical Computer Science 173(1): 283–308.
Frost, D. (1997). Algorithms and Heuristics for Constraint Satisfaction Problems. Ph.D. diss., University of California, Irvine.
Ginsberg, M. L. (1993). Dynamic backtracking. Journal of Artificial Intelligence Research 1: 25–46.
Kumar, V. (1992). Algorithms for constraint satisfaction problems: A survey. AI magazine 13(1): 32–44.
Mackworth, A. K. (1992). Constraint Satisfaction. In Encyclopedia of Artificial Intelligence. 2nd ed. New York: Wiley, pp. 285–293.
Schwalb, E., and R. Dechter. (1997). Processing disjunctions in temporal constraint networks. Artificial Intelligence 93(1–2): 29–61.
Tsang, E. (1993). Foundation of Constraint Satisfaction. Academic Press.

Context and Point of View

The content of most linguistic expressions in one way or other depends on the context of utterance; in fact, in logical semantics (literal) meaning is analyzed as a function assigning contents (or intensions) to contexts. The most prominent way for the context to determine content is by way of INDEXICALS, that is, by expressions whose sole function is to contribute a component of the situation in which an utterance is made to the content expressed by that utterance. Indexicals can be either lexical items, such as the English personal pronoun I, which always refers to the speaker, or grammatical forms, such as the first person verbal suffix in Latin, which has the same function. Indexicals are special cases of deictic expressions whose reference depends on context, a case in point being the possessive pronoun my, which describes something as belonging to the speaker as determined by the utterance situation (Zimmermann 1995).

All languages seem to contain deictic expressions and to make ample use of them. Traditionally, deictics are classified according to the aspect or feature of the utterance context that determines their reference. Three major kinds of deixis are usually distinguished: (1) person deixis, where the context provides one or more participants of the conversation (speaker, addressee), or a group to which they belong; (2) spatial deixis, where the context provides a location or a direction, especially as a reference point for spatial orientation on which other deictics depend (an “origo,” in Bühler’s 1934 sense); and (3) temporal deixis, with the context contributing a specific time, which may be the time of utterance, the time of a reported event, or the like. Among the clear cases of deixis in English are (1) the first and second person pronouns, I, you, we; (2) the local adverbs here and there; and (3) the temporal adverbs now and yesterday. Other examples of deixis, including demonstratives such as this whose referents depend on an accompanying gesture plus the speaker’s referential intentions (Kaplan 1989a, 1989b), do not clearly fall under 1–3. DISCOURSE anaphors, such as aforementioned, or third person pronouns, such as he, her, receive their interpretation by reference to their linguistic context and are thus sometimes also considered as deictic. As has become clear by work in DYNAMIC SEMANTICS, however, such anaphoric elements quite regularly undergo variable binding processes, as in “Every man who owns a donkey beats it,” resulting in a quantified, rather than context-dependent reading. The same has been observed for the context dependence of certain relational nouns (Partee 1989) such as enemy whose arguments are usually given by the utterance situation, as in “The enemy is approaching,” but may also be quantified over, as in “Every participant faced an enemy.”

Languages differ considerably in the number and kinds of deictic locutions they have. Some have the place of utterance as their only spatial parameter, where others have complex systems classifying space according to various criteria (Frei 1944), including distance from the speaker’s position as measured in varying degrees (up to seven in Malagasy, according to Anderson and Keenan 1985), visibility (also in Malagasy, and in many other languages), and perspective (as in English left and right). Person deixis, too, is subject to variation: many languages (e.g., French) distinguish more than one second person, depending on the social relationship between speaker and hearer, a phenomenon known as “social deixis” (Levinson 1979); another common distinction (to be found, for example, in Tagalog) is between an inclusive and an exclusive first person plural, depending on whether the addressee does or does not belong to the group designated. The variation in temporal deixis is harder to estimate, partly because it is not always clear whether tenses are truly deictic, but also because languages tend to be more or less localistic, extending their system of spatial deixis to time by analogy and frozen metaphor (cf. TENSE AND ASPECT).

The role of the context in providing the perspective from which the utterance is interpreted becomes particularly vivid in shifted contexts, also known as “relativized deixis” (Fillmore 1975), or “Deixis am Phantasma” (Bühler 1934), where at least some of the deictic parameters are not provided by the utterance situation. Among these shifts are various forms of pretense like play-acting, impersonation, analogous deixis (Klein 1978; the speaker of “We took this road” may refer to what is represented by the map in front of her), or even first-person inscriptions on gravestones (Kratzer 1978).

Speakers often take the hearer’s perspective in describing the spatial location of objects (Schober 1993), as in “Please press the left button/the button to your left.” This context shift can be made out of politeness or, especially when the hearer does not know the speaker’s location, for communicative efficiency. In the latter case, however, deictic orientation may also be replaced by an intrinsic perspective provided by an object with a canonical front (as in “behind the house,” denoting the backyard).

A rather coherent area of regular context shift is known as “free indirect speech” in narrative analysis (Banfield 1982; Ehrlich 1990). In a passage such as “Mary looked out of the window. Her husband was coming soon,” the second sentence is understood to report Mary’s thoughts from her own point of view: it is Mary, not the narrator, who believes her husband to be on his way and, whereas the verb come normally expresses movement toward the speaker as determined by the context, in this case it is Mary’s position toward which her husband is reported to move (Rossdeutscher 1997). Similarly, the adverb soon is understood to describe an event as happening shortly after the scene described, rather than the utterance, has taken place. This simultaneous replacement of some (but not all) contextual parameters can be seen as a shift of the logophoric center (Kuno 1987), which comprises a large part of the more subjective parameters, including those that determine the interpretation of evaluative adjectives (e.g., boring) and free reflexives. Whereas free indirect speech is a rather well understood phenomenon with predictable features (including a restricted choice of tenses), other shifts of the logophoric center are less easily accounted for. Among these are the optional perspectives in overt attitude reports (Mitchell 1987), as in “The CIA agent knows that John thinks that a KGB agent lives across the street,” where the underlined phrase can be evaluated from the speaker’s, the CIA agent’s, or John’s point of view.

See also MEANING; PRAGMATICS; QUANTIFIERS; SITUATEDNESS/EMBEDDEDNESS

—Thomas Ede Zimmermann

References

Anderson, S. R., and E. L. Keenan. (1985). Deixis. In T. Shopen, Ed., Language Typology and Syntactic Description. Vol. 2, Grammatical Categories and the Lexicon. Cambridge: Cambridge University Press, pp. 259–308.
Banfield, A. (1982). Unspeakable Sentences. London: Routledge.
Bühler, K. (1934). Sprachtheorie. Jena, Germany: Fischer.
Ehrlich, S. (1990). Point of View. London: Routledge.
Fillmore, C. (1975). Santa Cruz Lectures on Deixis. Bloomington: Indiana University Linguistics Club.
Frei, H. (1944). Systèmes de déictiques. Acta Linguistica 4: 111–129.
Kaplan, D. (1989a). Demonstratives: An essay on the semantics, logic, metaphysics and epistemology of demonstratives and other indexicals. In J. Almog, J. Perry, and H. Wettstein, Eds., Themes from Kaplan. Oxford: Oxford University Press, pp. 481–563.
Kaplan, D. (1989b). Afterthoughts. In J. Almog, J. Perry, and H. Wettstein, Eds., Themes from Kaplan. Oxford: Oxford University Press, pp. 565–614.
Klein, W. (1978). Wo ist hier? Präliminarien zu einer Untersuchung der lokalen Deixis. Linguistische Berichte 58: 18–40.
Kratzer, A. (1978). Semantik der Rede: Kontexttheorie—Modalwörter—Konditionalsätze. Königstein, Germany: Scriptor.
Kuno, S. (1987). Functional Syntax. Chicago: University of Chicago Press.
Levinson, S. C. (1979). Pragmatics and social deixis: Reclaiming the notion of conventional implicature. In Proceedings of the Fifth Annual Meeting of the Berkeley Linguistics Society, Berkeley, CA: pp. 206–223.
Mitchell, J. (1987). The Formal Semantics of Point of View. Ph.D. diss., University of Massachusetts at Amherst.
Partee, B. (1989). Binding implicit variables in quantified contexts. In C. Wiltshire, R. Graczyk, and B. Music, Eds., CLS 25. Part One: The General Session, Chicago, pp. 342–365.
Rossdeutscher, A. (1997). Perspektive und propositionale Einstellung in der Semantik von kommen. In C. Umbach, M. Grabski, and R. Hörnig, Eds., Perspektive in Sprache und Raum. Wiesbaden: Deutscher Universitätsverlag, pp. 261–288.
Schober, M. F. (1993). Spatial perspective-taking in conversation. Cognition 47: 1–24.
Zimmermann, T. E. (1995). Tertiumne datur? Possessivpronomina und die Zweiteilung des Lexikons. Zeitschrift für Sprachwissenschaft 14: 54–71. Translated as Tertiumne datur? Possessive pronouns and the bipartition of the lexicon. In H. Kamp and B. Partee, Eds., Context Dependence in the Analysis of Linguistic Meaning. Vol. 1, Papers. Stuttgart, Germany: Institut für maschinelle Sprachverarbeitung, Stuttgart University, 1997, pp. 409–425.

Further Readings

Doron, E. (1990). Point of View. CSLI Report 90–143, Stanford University.
Ehrich, V. (1992). Hier und Jetzt: Studien zur lokalen und temporalen Deixis im Deutschen. Tübingen, Germany: Niemeyer.
Forbes, G. (1988). Indexicals. In D. Gabbay and F. Guenthner, Eds., Handbook of Philosophical Logic, vol. 4. Dordrecht: Kluwer, pp. 463–490.
Jarvella, R. J., and W. Klein, Eds. (1982). Speech, Place, and Action: Studies of Deixis and Related Topics. Chichester: Wiley.
Zimmermann, T. E. (1991). Kontextabhängigkeit. In A. v. Stechow and D. Wunderlich, Eds., Semantik (Semantics). Berlin/New York: Springer, pp. 156–229.

Control Theory

The modern development of automatic control evolved from the regulation of tracking telescopes, steam engine control using fly-ball governors, the regulation of water turbines, and the stabilization of the steering mechanisms of ships. The literature on the subject is extensive, and because feedback control is so broadly applicable, it is scattered over many journals ranging from engineering and physics to economics and biology. The subject has close links to optimization including both deterministic and stochastic formulations. Indeed, Bellman’s influential book on dynamic optimization, Dynamic Programming (1957), is couched largely in the language of control.

The successful use of feedback control often depends on having an adequate model of the system to be controlled and suitable mechanisms for influencing the system, although recent work attempts to bypass this requirement by incorporating some form of adaptation, learning, or both. Here we will touch on the issues of modeling, regulation and tracking, optimization, and stochastics.

Modeling

The oldest and still most successful class of models used to design and analyze control systems are input-output models, which capture how certain controllable input variables influence the state of the system and, ultimately, the observable outputs. The models take the form of differential or difference equations and can be linear or nonlinear, finite or infinite dimensional. When possible, the models are derived from first principles, adapted and simplified to be relevant to the situation of interest. In other cases, empirical approaches based on regression or other tools from time series analysis are used to generate a mathematical model from data. The latter is studied under the name of “system identification” (Willems 1986). To fix ideas, consider a linear differential equation model with input vector u, state vector x, and output y

$$\frac{dx}{dt} = Ax + Bu; \qquad y = Cx$$

Such models are of fundamental importance because they not only capture the essential behavior of important classes of linear systems but also represent the small signal approximation to a large class of strongly nonlinear systems. Questions of control often center around the design of an auxiliary system, called the “compensator” or “controller,” which acts on the measurable variable y to produce a feedback signal that, when added to u, results in better performance. The concepts of controllability, observability, and model reduction play a central role in the theory of linear models (Kalman et al. 1969; Brockett 1970).

$KBB^{T}K + Q = 0$. This methodology provides a reasonably systematic approach to the design of regulators in that only

There are important classes of systems whose performance can only be explained by nonlinear models. Many of these are prominent in biology.
In particular, problems that the loss matrix Q is unspecified. Different types of optimi- involve pattern generation, such as walking or breathing, are zation problems associated with trajectory optimization in not well modeled using linear equations. The description of aerospace applications and batch processing in chemical numerically controlled machine tools and robots, both of plants are also quite important. A standard problem formu- which convert a formal language input into an analog (con- lation in this latter setting would be concerned with prob- tinuous) output are also not well modeled by the linear the- lems of the form ory, although linear theory may have a role in explaining the ∫0 L ( x, u, t ) dt + φ ( x ( T ) ) T dx ----- = f ( x, u ) ; η= behavior of particular subsystems (Brockett 1997, 1993). - dt Regulation and Tracking Chapter 7 of Sontag (1990) provides a short introduction including a discussion of the relationships between The simplest and most frequently studied problem in auto- DYNAMIC PROGRAMMING and the more classical Hamilton- matic control is the regulation problem. Here one has a Jacobi theory. desired value for a variable, say the level of water in a tank, and wants to regulate the flow of water into the tank to keep Stochastics the level constant in the face of variable demand. This is a special case of the problem of tracking a desired signal, for The Kalman-Bucy (1961) filter, one of the most widely example, keeping a camera focused on a moving target (see appreciated triumphs of mathematical engineering, is used STEREO AND MOTION PERCEPTION), or orchestrating the in many fields to reduce the effects of measurement errors motion of a robot so that the end effector follows a certain and has played a significant role in achieving the first soft moving object. The design of stable regulators is one of the landing on the moon, and more recently, achieving closed- oldest problems in control theory. 
It is often most effective loop control of driverless cars on the autobahns of Germany. to incorporate additional dynamic effects, such as integral Developed in the late 1950s as a state space version of the action, in the feedback path, thus increasing the complexity Wiener-Kolomogorov theory of filtering and prediction, it of the dynamics and making the issue of stability less intui- gave rise to a rebirth of this subject. In its basic form, the tive. In the case of systems adequately modeled by linear Kalman-Bucy filter is based on a linear system, white noise · · differential equations, the matter was resolved long ago by (written here as w and n ) model the work of Routh, and Hurwitz, which yields, for example, · · · the result that the third-order linear, constant-coefficient dif- x = Ax + Bw ; y = Cx + n ferential equation · The signal Cx is generated from the white noise w by pass- ing it into the linear system, although it is not observed 2 3 dy dy dy directly, but only after it is corrupted by the additive noise -------- + a -------- + b ----- + cy = 0 - · dt 2 3 n . The theory tells us that the best (in several senses includ- dt dt ing the least squares) way to recover x from y is to generate is stable if a, b, and c are all positive and ab – c ≥ 0. Moti- ˆ an estimate x using the equation vated by feedback stability problems associated with high- ˆ dx Tˆ gain amplifiers, Nyquist (1932) took a fresh look at the feed- ----- = Ax – PC ( Cx – y ) ˆ - dt back stability problem and formulated a stability criterion directly in terms of the frequency response of the system. with P being the solution of the variance equation This criterion and variations of it form the basis of classical · feedback compensation techniques as described in the well- T T T P = AP + PA + BB – PC CP known book of Kuo (1967). In the case of nonlinear sys- tems, the design of stable regulators is more challenging. 
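As a rough numerical illustration of the variance equation, the following Python sketch specializes it to a scalar system, where the matrices A, B, and C become numbers a, b, and c and the equation reads dP/dt = 2aP + b^2 - c^2 P^2. The function names, the forward-Euler integration, the step size, and the parameter values are assumptions made here for illustration only; they are not taken from the entry. The sketch integrates the equation from P(0) = 0 and compares the result with the positive root of the right-hand side, which is the steady-state variance.

```python
import math


def variance_rhs(p, a, b, c):
    """Right-hand side of the scalar variance equation dP/dt = 2aP + b^2 - c^2 P^2."""
    return 2.0 * a * p + b * b - c * c * p * p


def integrate_variance(a, b, c, p0=0.0, dt=1e-3, t_end=20.0):
    """Forward-Euler integration of the variance equation, starting from P(0) = p0.

    The step size and horizon are illustrative choices, not tuned values.
    """
    p = p0
    for _ in range(int(round(t_end / dt))):
        p += dt * variance_rhs(p, a, b, c)
    return p


def steady_state_variance(a, b, c):
    """Positive root of 2aP + b^2 - c^2 P^2 = 0 (requires c != 0)."""
    return (a + math.sqrt(a * a + b * b * c * c)) / (c * c)


if __name__ == "__main__":
    a, b, c = -1.0, 1.0, 1.0  # hypothetical stable scalar system
    p_final = integrate_variance(a, b, c)
    p_inf = steady_state_variance(a, b, c)
    # In the scalar case the filter gain P C^T in the estimate equation is P * c.
    print("integrated P:", p_final)
    print("steady-state P:", p_inf)
    print("steady-state gain:", p_inf * c)
```

In this scalar setting the estimate equation becomes dx̂/dt = a x̂ - P c (c x̂ - y), so once P has settled the filter is an ordinary linear system driven by the measurement y. For matrix-valued problems a dedicated Riccati solver would normally replace the Euler loop.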
The development of similar theories for counting processes, queuing systems, and the like is more difficult and remains an active area for research (Brémaud 1981).

See also BEHAVIOR-BASED ROBOTICS; DYNAMIC APPROACHES TO COGNITION; MANIPULATION AND GRASPING; MOBILE ROBOTS; WALKING AND RUNNING MACHINES; WIENER

—Roger W. Brockett

References

Airy, G. B. (1840). On the regulator of the clock-work for effecting uniform movement of equatoreals. Memoirs of the Royal Astronomical Society 11: 249–267.
Bellman, R. (1957). Dynamic Programming. Princeton: Princeton University Press.
Brémaud, P. (1981). Point Processes and Queues. New York: Springer.
Brockett, R. W. (1970). Finite Dimensional Linear Systems. New York: Wiley.
Brockett, R. W. (1993). Hybrid models for motion control systems. In H. Trentelman and J. C. Willems, Eds., Perspectives in Control. Boston: Birkhauser, pp. 29–54.
Brockett, R. W. (1997). Cycles that effect change. In Motion, Control and Geometry. Washington, DC: National Research Council, Board on Mathematical Sciences.
Hurwitz, A. (1895). Über die Bedingungen, unter welchen eine Gleichung nur Wurzeln mit negativen reellen Theilen besitzt. Mathematische Annalen 46: 273–284.
Kalman, R. E., and R. S. Bucy. (1961). New results in linear filtering and prediction theory. Trans. ASME Journal of Basic Engineering 83: 95–108.
Kalman, R. E., et al. (1969). Topics in Mathematical System Theory. New York: McGraw-Hill.
Kuo, B. C. (1967). Automatic Control Systems. Englewood Cliffs, NJ: Prentice-Hall.
Lefschetz, S. (1965). Stability of nonlinear control systems. In Mathematics in Science and Engineering, vol. 13. London: Academic Press.
Maxwell, J. C. (1868). On governors. Proc. of the Royal Soc. London 16: 270–283.
Minorsky, N. (1942). Self-excited oscillations in dynamical systems possessing retarded action. J. of Applied Mechanics 9: 65–71.
Nyquist, H. (1932). Regeneration theory. Bell System Technical Journal 11: 126–147.
Sontag, E. D. (1990). Mathematical Control Theory. New York: Springer.
Willems, J. C. (1986). From time series to linear systems. Automatica 22: 561–580.

Cooperation and Competition

Cooperation is a hallmark of all social organisms. Social groups are, in effect, cooperative solutions to the day-to-day problems of survival and reproduction. For some or all members of the group, however, group living invariably incurs costs, which may be reflected in social subordination, restricted access to the best feeding or resting sites, the social suppression of reproduction, or increased ecological costs. Because individuals (or at least, their genes) are by definition in evolutionary competition with each other, this creates a paradox that is not easy to explain in Darwinian terms: Cooperation is a form of ALTRUISM in which one individual gives up something to the benefit of another (see also SOCIOBIOLOGY).

Evolutionary theory identifies three ways cooperation can evolve, which differ in the delay before the "debt" incurred by cooperating is repaid (see Bertram 1982). In mutualism, both individuals gain an immediate advantage from cooperating. This may be an appropriate explanation for many cases of group living where individuals gain mutually and simultaneously from living together (e.g., through increased protection from predators, group defense of a territory, etc.). In reciprocal altruism, the debt is repaid at some future time, providing this is during the lifetime of the altruist. This may be an appropriate explanation for cases where individuals who are unrelated to each other form a coalition for mutual protection. The ally, even though in no immediate danger, will come to the aid of a beleaguered partner, on the implicit assumption that the partner will come to the ally's aid on some future occasion. And in kin selection, the debt is repaid after the death of the altruist because the extra fitness that accrues to the recipient contributes to the altruist's inclusive fitness, defined as the number of copies of a given gene contributed to the species' gene pool by an individual as a result of his or her own reproductive output plus the number contributed by his or her relatives as a direct result of that individual helping each relative to breed more successfully. Kin selection can only work when the two individuals are genetically related. It may provide an explanation for assistance freely given to relatives without prior demands for reciprocation.

Although cooperation and the exchange of services or resources occur widely among humans, such exchanges are not wholly altruistic, especially when the actor incurs a significant cost. A number of recent studies of humans have demonstrated that exchange of benefits occurs without preconditions for repayment when it involves relatives, but only with strict reciprocation when it involves nonrelatives (for example, garden labor exchange among South American K'ekchi' horticulturalists and Nepalese hill farmers (Berté 1988; Panter-Brick 1989); alliance support among historical Vikings (Dunbar, Clark, and Hurst 1995); and exchange of information about good fishing grounds among contemporary Maine lobstermen (Palmer 1991)).

Cooperation is an unstable strategy because it is susceptible to cheating by free riders. The ease with which selfish interests can undermine cooperativeness is most conspicuous in the case of common pool resources (e.g., forest resources, communally owned commons or oceanic fishing grounds). Although it may be obvious to everyone that a communal agreement to manage the use of these resources would benefit everyone because the resource would last longer, the advantages to be gained by taking a disproportionate share can be an overwhelming temptation. The result is often the complete destruction of the resource through overuse, the "tragedy of the commons" (see Ostrom, Gardner, and Walker 1994). Tax avoidance and parking in no parking zones are everyday examples of a similar kind of cheating on socially agreed conventions.

The free rider problem is one of the most serious problems encountered by organisms living in large groups that depend on cooperation for their effectiveness. It acts as a dispersive force that, unless checked, leads inexorably to the disbanding of groups (and thus the loss of the very purpose for which the groups formed). The problem arises because the advantages of free riding are often considerable, especially when the risks of being caught (and thus of being punished or discriminated against) are slight. Perhaps as a result, strategies that help to detect or deter free riders are common in most human societies. These include being suspicious of strangers (whose willingness to cooperate remains in doubt), rapidly changing dialects (which helps identify the group of individuals with whom you grew up, who are likely to be either relatives or to bear obligations of mutual aid; see Nettle and Dunbar 1997), entering into conventions of mutual obligation (e.g., blood brotherhood, exchange of gifts, or formal treaties), and ostracizing or punishing those who cheat on the system (e.g., castigating hunters who eat all their meat rather than sharing it, as among !Kung San bushmen: Lee 1979).

In addition to these purely behavioral mechanisms (perhaps the product of CULTURAL EVOLUTION), there is also evidence to suggest that there may be dedicated "cheat detection" modules hardwired in the human brain. The evidence for this derives from studies that consider abstract and social versions of the Wason selection task (a verbal task about logical reasoning: see Table 1). Although most people get the answer wrong when presented with the abstract version of the Wason task, they usually get it right when the task is presented as a logically identical social contract problem that involves detecting who is likely to be cheating the system (see Cosmides and Tooby 1992). This is assumed to happen because we have a cognitive module that is sensitive to social cheats, but that cannot easily recognize the same kind of logical problem in another form.

The mechanisms involved in the evolution of cooperation have been of considerable interest to economists and other social scientists, as well as to evolutionary psychologists (see EVOLUTIONARY PSYCHOLOGY). GAME THEORY, in particular, provides considerable insights into the stability of cooperative behavior. The situation known as "prisoner's dilemma" has been the focus of much of this research. It involves two allies who must independently decide whether to cooperate with each other (to gain a small reward) or defect (to gain a very large reward), but with the risk of doing very badly if cooperation is met with defection. A computer tournament that pitted alternative algorithms against each other in an evolutionary game revealed that the very simplest rule of behavior is the most successful. This rule is known as "tit-for-tat" (or TfT): Cooperate on the first encounter with an opponent and thereafter do exactly what the opponent did on the previous round (cooperate if he cooperated, defect if he defected).

In more conventional face-to-face situations, cues provided by nonverbal behavior may be important in promoting both trust in another individual and the sense of obligation to others required for successful cooperation. Experiments have shown that simply allowing individuals to discuss even briefly which strategy is best greatly increases the frequency of cooperation. Allowing them to exert moral pressure or fines on defectors improves the level of group cooperativeness still further.

Table 1. The Wason Selection Task

Standard version    Logical equivalent    Social contract version
A                   P                     Drinking a beer
H                   Not-P                 Drinking a Coke
4                   Q                     21 years old
7                   Not-Q                 16 years old

The Wason selection task (table 1) was developed as a test of logical reasoning. When presented with four cards bearing letters and numbers (as shown in column 1) and informed that "a vowel always has an even number on its reverse," the subject has to decide which card or cards to turn over in order to check the validity of the rule. This rule has the standard structure of the logical statement "If P (= vowel), then Q (= even number on reverse)." Because the four cards correspond to the statements P, not-P, Q and not-Q, the correct logical solution is to choose the cards that correspond to P and not-Q. Most subjects incorrectly choose P alone or the P and the Q cards. In the social contract version, the cards correspond to people sitting around a table whose drinks or ages are specified (as in column 3). The rule in this case is "If you want to drink alcoholic beverages, you must be over the age of 21 years." Here it is obvious that only the age of the beer drinker (P) and the drink of the 16-year-old (not-Q) need to be checked. Even though most people get the original abstract Wason task wrong, they get the social contract version right.

See also EVOLUTION

—R. I. M. Dunbar

References

Axelrod, R. (1984). The Evolution of Cooperation. New York: Basic Books.
Berté, N. A. (1988). K'ekchi' horticultural exchange: Productive and reproductive implications. In L. Betzig, M. Borgerhoff Mulder, and P. Turke, Eds., Human Reproductive Behavior. Cambridge: Cambridge University Press, pp. 83–96.
Bertram, B. C. R. (1982). Problems with altruism. In King's College Sociobiology Group, Eds., Current Problems in Sociobiology. Cambridge: Cambridge University Press, pp. 251–268.
Boyd, R., and P. Richerson. (1992). Punishment allows the evolution of cooperation (or anything else) in sizeable groups. Ethology and Sociobiology 13: 171–195.
Cosmides, L., and J. Tooby. (1992). Cognitive adaptations for social exchange. In J. Barkow, L. Cosmides and J. Tooby, Eds., The Adapted Mind: Evolutionary Psychology and the Generation of Culture. Oxford: Oxford University Press, pp. 163–228.
Dunbar, R., A. Clark, and N. L. Hurst. (1995). Conflict and cooperation among the Vikings: Contingent behavioral decisions. Ethology and Sociobiology 16: 233–246.
Enquist, M., and O. Leimar. (1993). The evolution of cooperation in mobile organisms. Animal Behavior 45: 747–757.
Gigerenzer, G., and K. Hug. (1992). Domain-specific reasoning: Social contracts, cheating and perspective change. Cognition 43: 127–171.
Lee, R. B. (1979). The !Kung San: Men, Women and Work in a Foraging Society. Cambridge: Cambridge University Press.
Nettle, D., and R. I. M. Dunbar. (1997). Social markers and the evolution of reciprocal exchange. Current Anthropology 38: 93–99.
Ostrom, E., R. Gardner, and J. Walker. (1994). Rules, Games and Common-Pool Resources. Ann Arbor: University of Michigan Press.
Palmer, C. T. (1991). Kin-selection, reciprocal altruism and information sharing among Maine lobstermen. Ethology and Sociobiology 12: 221–235.
Panter-Brick, C. (1989). Motherhood and subsistence work: The Tamang of rural Nepal. Human Ecology 17: 205–228.
Trivers, R. L. (1971). The evolution of reciprocal altruism. Quarterly Review of Biology 46: 35–57.
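The tit-for-tat rule described in this entry (cooperate on the first encounter, then copy the opponent's previous move) is simple enough to sketch in a few lines of Python. The payoff values below are an assumption: they are a conventional prisoner's dilemma table chosen for illustration, not figures given in the entry, and the function names and the always-defect opponent are likewise hypothetical.

```python
COOPERATE, DEFECT = "C", "D"

# Assumed payoffs as (first player's score, second player's score).
# Any table in which unilateral defection pays best and being exploited
# pays worst would serve equally well.
PAYOFFS = {
    (COOPERATE, COOPERATE): (3, 3),
    (COOPERATE, DEFECT): (0, 5),
    (DEFECT, COOPERATE): (5, 0),
    (DEFECT, DEFECT): (1, 1),
}


def tit_for_tat(opponent_history):
    """Cooperate on the first round, then repeat the opponent's last move."""
    return COOPERATE if not opponent_history else opponent_history[-1]


def always_defect(opponent_history):
    """A free rider that never cooperates."""
    return DEFECT


def play(strategy_a, strategy_b, rounds=10):
    """Play an iterated game and return the two total scores."""
    moves_a, moves_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        # Each strategy sees only the other player's past moves.
        move_a = strategy_a(moves_b)
        move_b = strategy_b(moves_a)
        pay_a, pay_b = PAYOFFS[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        moves_a.append(move_a)
        moves_b.append(move_b)
    return score_a, score_b


if __name__ == "__main__":
    print("TfT vs TfT:", play(tit_for_tat, tit_for_tat))
    print("TfT vs always defect:", play(tit_for_tat, always_defect))
```

Under these assumed payoffs two tit-for-tat players cooperate throughout, whereas a persistent defector exploits tit-for-tat only on the first round and is met with defection afterwards.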
Cortex

See CEREBRAL CORTEX; VISUAL CORTEX, CELL TYPES, AND CONNECTIONS IN

Cortical Localization, History of

During the first twenty-five centuries of studies of brain function, almost all investigators ignored or belittled the CEREBRAL CORTEX. One exception was the Alexandrian anatomist Erasistratus (fl. c. 290 B.C.E.), who on the basis of comparative studies attributed the greater intelligence of humans to their more numerous cortical convolutions. This view was ridiculed by Galen (129–199), the most influential of all classical biomedical scientists, whose sarcastic dismissal of any significant role for the cortex continued to be quoted into the eighteenth century. Another major exception was Thomas Willis (1621–1675), a founder of the Royal Society, and author of the first monograph on the brain. On the basis of his dissections, experiments on animals, and clinical studies of humans, he attributed memory and voluntary movement functions to the cortex. However, by far the dominant view on cortical function before the beginning of the nineteenth century was that the cortex was merely a protective rind (cortex means "rind" in Latin), a glandular structure (the early microscopists saw globules in cortex, probably artifacts), or a largely vascular structure made up of small blood vessels. The apparent insensitivity of the cortex to direct mechanical and chemical stimulation was used as an argument against the cortex having any important functions in sensation, mentation, or movement.

The systematic localization of different psychological functions in different regions of the cerebral cortex begins with Franz Joseph Gall (1758–1828) and his collaborator J. C. Spurzheim (1776–1832), the founders of phrenology. The central ideas of their phrenological system were that the brain was an elaborately wired machine for producing behavior, thought, and emotion, and that the cerebral cortex consisted of a set of organs with different functions. Postulating about thirty-five affective and intellectual faculties, they assumed that these were localized in specific cortical organs and that the size of each cortical organ was indicated by the prominence of the overlying skull, that is, by cranial bumps. Their primary method was to examine the skulls of a wide variety of people, from lunatics and criminals to the eminent and accomplished. Although the absurdity of their dependence on cranial morphology was quickly recognized in the scientific community, Gall's ideas about the cortex as a set of psychological organs stimulated investigation of the effects of cortical lesions in humans and animals and of structural variations across different cortical regions, and thus had a lasting influence on the development of modern neuroscience.

Examining a variety of animals, Pierre Flourens (1794–1867) found that different major brain regions had different functions; he implicated the cerebral hemispheres in willing, remembering, and perceiving, and the CEREBELLUM in movement. Within the cortex, however, he found no localization of function: only the size and not the site of the lesion mattered. Although these results appeared to refute the punctate localizations of Gall, they actually supported both the general idea of localization of function in the brain and the specific importance Gall had given to the cerebral hemispheres in cognition.

In 1861, Paul BROCA described several patients with longstanding difficulties in speaking, which he attributed to damage to their left frontal lobes. This was the first generally accepted evidence for localization of a specific psychological function in the cerebral cortex (and was viewed at the time as a vindication of Gall's ideas of localization). Soon after, Fritsch and Hitzig demonstrated specific movements from electrical stimulation of the cortex of a dog, and drew the inference that some psychological functions and perhaps all of them need circumscribed centers of the cortex. The next major development was Carl Wernicke's 1876 report of a second type of language difficulty, or APHASIA, namely, one in understanding language; he associated this type of aphasia with damage to the posterior cortex in the region where the occipital, temporal, and parietal areas meet. Furthermore, he extended the idea of specialized cortical areas by stressing the importance of the connections among different areas, particularly for higher mental functions.

The last years of the nineteenth century saw several acrimonious controversies about the location of the various cortical sensory areas, involving such figures as David Ferrier, E. A. Schafer, and Hermann Munk. These issues were resolved first in monkeys and then in humans, so that by the end of World War I (with its rich clinical material), the location and organization of the primary visual, auditory, somesthetic, and motor areas of the cortex had been defined. By this time, the cerebral cortex had been divided up into multiple regions on the basis of regional variations in its cellular or fiber structure. The more lasting of these cortical architectonic parcellations were those by Korbinian Brodmann and Constantin von Economo, who created the numbering and lettering schemes, respectively, that are still in use.

Despite their new labels, however, the functions of vast regions of the cortex, other than the primary sensory and motor areas, remained mysterious. These regions were termed association cortex, initially because they were thought to be the site of associations among the sensory and motor areas. Under the influence of British association psychology (typified by John Stuart Mill and Alexander Bain), association cortex was believed to be the locus of the association of ideas, and after Pavlov, the locus of the linkage between conditioned stimuli and responses.

Parallel with the success of the localizers around the turn of the century, there was also a strong antilocalization tendency. Adherents of this view, such as C. E. Brown-Sequard, Friedrich Goltz, Camillo GOLGI, and Jacques Loeb, emphasized such phenomena as the variability and recovery of symptoms after brain damage. They stressed that higher cognitive functions, particularly INTELLIGENCE and MEMORY, could not be localized in specific regions of the cortex. Like Flourens, Goltz reported that it was the size and not the location of the lesion that determined the severity of its effects on such higher functions. This holistic view of brain function was reinforced by the rise of GESTALT PSYCHOLOGY.

The best-known investigator of the relative importance of the size and site of a cortical lesion was Karl S. LASHLEY, easily the foremost figure in the study of the brain in the 1940s and 50s. On the basis of a long series of experiments, particularly on rats in a complex maze, he proposed two principles of brain organization, "equipotentiality" and "mass action" (Lashley 1929). Equipotentiality was the apparent capacity of any intact part of a functional area to carry out, with or without reduction in efficiency, the functions lost by destruction of the whole. Lashley assumed equipotentiality to vary with different brain areas and with different functions and thought it might only hold for association cortex and for functions more complex than sensory or motor ones such as maze learning. Furthermore, equipotentiality was not absolute but subject to a law of mass action whereby the efficiency of a whole complex function might be reduced in proportion to the extent of brain injury within an equipotential area. He stressed that both principles were compatible with cortical localization of functions and himself reported several findings of specific cortical localizations.

Lashley's most famous (or infamous) result was that both principles held for the entire cerebral cortex of rats learning a complex maze. That is, performance in this maze was independent of the site of the cortical lesion and only dependent on its size. We now know that these mass action results were due to increasing encroachment on multiple areas critical for different components of maze learning with increasing size of lesion. In recent years, Lashley's specific ideas on equipotentiality and mass action (and many of his other contributions) are often forgotten, and he is inaccurately described as an extreme "antilocalizer," who thought the brain was like a bowl of jelly.

Starting in the 1930s, systematic evidence for the localization of various cognitive functions in regions of association cortex began to emerge, particularly from students and associates of Lashley. In an experiment still at the core of contemporary research on the frontal lobes, Carlyle Jacobsen showed that frontal cortex lesions impair the performance of delayed response tasks, in which the monkey must remember which of two cups a peanut was placed under, a deficit Jacobsen described as one of short-term memory. (This result, through no fault of Jacobsen's, led directly to the introduction of frontal lobotomy as a psychosurgical procedure in humans.) In another seminal experiment, K.-L. Chow, in 1950, showed that lesions of temporal cortex yield a deficit in pattern recognition, a finding that helped spark the study of extrastriate mechanisms in vision.

Up to the 1950s, the advances in understanding the functions of the cerebral cortex had relied almost entirely on the study of brain damage in humans and other primates. The introduction of evoked response and SINGLE-NEURON RECORDING techniques provided powerful new methods for studying localization of cortical function, methods soon revealing that much of association cortex was made up of areas devoted to processing specific aspects of a single sensory modality. Furthermore, these higher sensory areas were often involved in attentional and mnemonic functions as well as perceptual ones.

Most recently, the introduction of functional MAGNETIC RESONANCE IMAGING (fMRI) and POSITRON-EMISSION TOMOGRAPHY (PET) scanning has begun to radically enhance our understanding of the functional specialization of the cerebral cortex. As we begin to understand the parallel serial and hierarchical ways that the cortex processes, stores, and retrieves information, the phrase "localization of function" sounds increasingly archaic and simplistic.

See also ELECTROPHYSIOLOGY, ELECTRIC AND MAGNETIC EVOKED FIELDS; MEMORY, ANIMAL STUDIES; MEMORY, HUMAN NEUROPSYCHOLOGY

—Charles Gross

References

Finger, S. (1994). Origins of Neuroscience. Oxford: Oxford University Press.
Gross, C. G. (1987). Early history of neuroscience. In G. Adelman, Ed., Encyclopedia of Neuroscience, vol. 2. Boston: Birkhauser, pp. 843–846.
Gross, C. G. (1997). From Imhotep to Hubel and Wiesel: the story of visual cortex. In J. H. Kaas, K. Rockland, and A. Peters, Eds., Cerebral Cortex. Vol. 12, Extrastriate Cortex in Primates. New York: Plenum Press.
Gross, C. G. (1998). Brain, Vision, Memory: Tales in the History of Neuroscience. Cambridge, MA: MIT Press.
Krech, D. (1963). Localization of function. In L. Postman, Ed., Psychology in the Making. New York: Knopf.
Lashley, K. S. (1929). Brain Mechanisms and Intelligence. Chicago: University of Chicago Press.

Further Readings

Boring, E. (1957). A History of Experimental Psychology. 2nd ed. New York: Appleton-Century-Crofts.
Brazier, M. (1988). A History of Neurophysiology in the 19th Century. New York: Raven Press.
Clarke, E., and K. Dewhurst. (1972). An Illustrated History of Brain Function. 2nd ed. San Francisco: Norman.
Clarke, E., and L. Jacyna. (1987). Nineteenth-Century Origins of Neuroscientific Concepts. Berkeley: University of California Press.
Clarke, E., and C. O'Malley. (1996). The Human Brain and Spinal Cord: A Historical Study Illustrated by Writings from Antiquity to the Twentieth Century. San Francisco: Norman.
Corsi, P., Ed. (1991). The Enchanted Loom. Chapters on the History of Neuroscience. Oxford: Oxford University Press.
Fearing, F. (1970). A Study in the History of Physiological Psychology. Cambridge, MA: MIT Press.
Fulton, J. (1966). Selected Readings in the History of Physiology. 2nd ed. Springfield, IL: Thomas.
Harrington, A. (1987). Medicine, Mind and the Double Brain. Princeton: Princeton University Press.
Liddell, E. (1960). The Discovery of Reflexes. Oxford: Oxford University Press.
Meyer, A. (1971). Historical Aspects of Cerebral Anatomy. Oxford: Oxford University Press.
Neuburger, M. (1981). The Historical Development of Experimental Brain and Spinal Cord Physiology before Flourens. Baltimore: Johns Hopkins University Press.
Poynter, F. N. (1958). The History and Philosophy of Knowledge of the Brain and Its Functions. Oxford: Blackwell.
Polyak, H. (1957). The Vertebrate Visual System. Chicago: University of Chicago Press.
Shepherd, G. (1991). Foundations of the Neuron Doctrine. Oxford: Oxford University Press.
Singer, C. (1957). A Short History of Anatomy and Physiology from the Greeks to Harvey.
New York: Dover.
Spillane, J. (1981). The Doctrine of the Nerves: Chapters in the History of Neurology. Oxford: Oxford University Press.
Young, R. (1970). Mind, Brain and Adaptation in the Nineteenth Century. Oxford: Oxford University Press.

Creativity

In psychology, creativity is usually defined as the production of an idea, action, or object that is new and valued, although what is considered creative at any point in time depends on the cultural context.

The early history of research in creativity includes Cesare Lombroso’s investigation of the relationship between genius and madness, and Sir Francis Galton’s genetic studies of genius. Guilford (1967) developed a theory of cognitive functioning that took creativity into account, and a battery of tests that measured fluency, flexibility, and originality of thought in both verbal and visual domains. His model and the tests he developed, such as the “Brick Uses” and “Unusual Uses” tests, are still the foundation for much of creativity testing and research (e.g., Torrance 1988).

Contemporary approaches to creativity range from mathematical modeling and computer simulations of breakthroughs in science (Langley et al. 1987) to the intensive study of creative individuals (Gruber 1981; Gardner 1993). Other approaches include the historiographic method applied to the content of large numbers of creative works, or to biographies (Martindale 1990; Simonton 1990). Most studies, however, are still done with schoolchildren and students, and assess performance on Guilford-type tests (for reviews, see Sternberg 1988 and Runco and Albert 1990).

Stages of the Creative Process

Contrary to the popular image of creative solutions appearing with the immediacy of a popping flashbulb, most novel achievements are the result of a much longer process, sometimes lasting many years. We can differentiate five stages of this process (Wallas 1926), with the understanding that these stages are recursive, and may be repeated in several full or partial cycles before a creative solution appears.

1. Preparation. It is almost impossible to have a good new idea without having first been immersed in a particular symbolic system or domain. Creative inventors know the ins and outs of their branch of technology, artists are familiar with the work of previous artists, scientists have learned whatever there is to know about their specialty. One must also feel a certain unease about the state of the art in one’s domain. There has to be a sense of curiosity about some unresolved problem: a machine that could be improved, a disease that has to be cured, a theory that could be made simpler and more elegant. Sometimes the problem is presented to the artist, scientist, or inventor by an outside emergency or requirement. The most important creative problems, however, are discovered as the individual is trying to come to terms with the problematic situation (Getzels 1964). In such cases, the problem itself may not be clearly formulated until the very end of the process. As Albert Einstein noted, the solution of an already formulated problem is relatively easy compared to the formulation of a problem no one had previously recognized.

2. Incubation. Some of the most important mental work in creative problems takes place below the threshold of consciousness, where problematic issues identified during the preceding stage remain active without the person controlling the process. By allowing ideas to be associated with the contents of memory more or less at random, incubation also allows completely unexpected combinations to emerge. As long as one tries to formulate or solve a problem consciously, previous habits of mind will direct thoughts in rational but predictable directions.

3. Insight. When a new combination of ideas is strong enough to withstand unconscious censorship, it emerges into awareness in a moment of illumination, the “Eureka!” or “Aha!” experience usually thought to be the essence of creativity. Without preparation, evaluation, and elaboration, however, no new idea or product will follow.

4. Evaluation. The insight that emerges must be assessed consciously according to the rules and conventions of the given domain. Most novel ideas fail to withstand critical examination. One can go wrong by being either too critical or not critical enough.

5. Elaboration. Thomas Edison made popular the saying “Genius is 1 percent inspiration and 99 percent perspiration.” Even the most brilliant insight disappears without a trace unless the person is able and willing to develop its implications, to transform it into a reality. But this stage does not involve a simple transcription of a model perfectly formed in the mind. Most creative achievements involve drastic changes that occur as the creator translates the insight into a concrete product. A painter may approach the canvas with a clear idea of how the finished painting should look, but most original pictures evolve during the process of painting, as the combination of colors and shapes suggests new directions to the artist.

Creativity as a Systemic Phenomenon

No person can be creative without having access to a tradition, a craft, a knowledge base. Nor can we trust the subjective report of a person to the effect that his or her insight was indeed creative. It is one of the peculiarities of human psychology that most people believe their thoughts to be original and valuable. To accept such personal assessment at face value would soon deprive the concept of creativity of any specific meaning.

Creativity can best be understood as a confluence of three factors: a domain, which consists of a set of rules and practices; an individual, who makes a novel variation in the contents of the domain; and a field, which consists of experts who act as gatekeepers to the domain, and decide which novel variation is worth adding to it (Csikszentmihalyi 1996). A burst of creativity is generally caused, not by individuals being more creative, but by domain knowledge becoming more available, or a field being more supportive of change. Conversely, lack of creativity is usually caused, not by individuals lacking original thoughts, but by the domain having exhausted its possibilities, or the field not recognizing the most valuable original thoughts.

The Creative Person

Three aspects of creative persons are particularly important: cognitive processes, personality, and values and motivations. While, in most cases, a certain level of intelligence is a prerequisite (a threshold of 120 IQ is often mentioned; Getzels and Jackson 1962), the relationship of IQ to creativity varies by domain, and after a relatively low threshold, there seems to be no further contribution of IQ to creativity. The first and longest study of high-IQ children (Terman 1947; see also Sears and Sears 1980) found little evidence of adult creativity in a sample whose mean IQ as children was 152, or even in a subsample with an IQ above 170.

The most obvious characteristic of original thinkers is what Guilford (1967) identified as “divergent thinking” or “thinking outside the box.” Divergent thinking involves unusual associations of ideas, changing perspectives, and novel approaches to problems, in contrast to convergent thinking, which involves linear, logical steps. Correlations between divergent thinking tests and creative achievement tend to be low, however, and some scholars even claim that the cognitive approach of creative individuals does not differ qualitatively from that of normal people except in its speed (Simon 1988) and quantity of ideas produced (Simonton 1990).

Some forms of mental disease such as manic depression, addiction, and suicide are more frequent among individuals involved in artistic and literary pursuits (Andreasen 1987; Jamison 1989), but this might have less to do with creativity than with the lack of recognition that obtains in artistic domains. At the same time, creative individuals appear to be extremely sensitive to all kinds of stimuli, including aversive ones (Piechowski 1991), possibly accounting for their higher rates of emotional instability.

Personality traits often associated with creativity include openness to experience, impulsivity, self-confidence, introversion, aloofness, and rebelliousness (Getzels and Csikszentmihalyi 1976; Feist forthcoming). Such people also seem to have a remarkable ability to be both playful and hard-working, introverted and extroverted, aloof and gregarious, traditional and rebellious, as the occasion requires (Csikszentmihalyi 1996). The creative person might be less distinguished by a set of traits than by the ability to experience the world along modalities that in other people tend to be stereotyped.

Throughout their lives, creative persons exhibit a childlike curiosity and interest in their domains, value their work above conventional monetary or status rewards (Getzels and Csikszentmihalyi 1976), and enjoy it primarily for intrinsic reasons (Amabile 1983). Creativity is its own reward.

See also CONCEPTUAL CHANGE; EDUCATION; EXPERTISE; INTELLIGENCE; PICTORIAL ART AND VISION; SCIENTIFIC THINKING AND ITS DEVELOPMENT

– Mihaly Csikszentmihalyi

References

Amabile, T. (1983). The Social Psychology of Creativity. New York: Springer.
Andreasen, N. C. (1987). Creativity and mental illness: prevalence rates in writers and first-degree relatives. American Journal of Psychiatry 144: 1288–1292.
Csikszentmihalyi, M. (1996). Creativity: Flow and the Psychology of Discovery and Invention. New York: HarperCollins.
Feist, G. J. (Forthcoming). Personality in scientific and artistic creativity. In R. J. Sternberg, Ed., Handbook of Human Creativity. Cambridge: Cambridge University Press.
Gardner, H. (1993). Creating Minds. New York: Basic Books.
Getzels, J. W. (1964). Creative thinking, problem-solving, and instruction. In E. R. Hilgard, Ed., Theories of Learning and Instruction. Chicago: University of Chicago Press.
Getzels, J. W., and M. Csikszentmihalyi. (1976). The Creative Vision: A Longitudinal Study of Problem Finding in Art. New York: Wiley.
Getzels, J. W., and P. Jackson. (1962). Creativity and Intelligence. New York: Wiley.
Gruber, H. (1981). Darwin on Man. Chicago: University of Chicago Press.
Guilford, J. P. (1967). The Nature of Human Intelligence. New York: McGraw-Hill.
Jamison, K. R. (1989). Mood disorders and patterns of creativity in British writers and artists. Psychiatry 52: 125–134.
Kris, E. (1952). Psychoanalytic Explorations in Art. New York: International Universities Press.
Langley, P., H. A. Simon, G. L. Bradshaw, and J. M. Zytkow. (1987). Scientific Discovery: Computational Exploration of the Creative Process. Cambridge, MA: MIT Press.
Martindale, C. (1990). The Clockwork Muse: The Predictability of Artistic Change. New York: Basic Books.
Piechowski, M. J. (1991). Emotional development and emotional giftedness. In N. Colangelo and G. A. Davis, Eds., Handbook of Gifted Education. Boston: Allyn and Bacon, pp. 285–306.
Runco, M. A., and S. Albert, Eds. (1990). Theories of Creativity. Newbury Park, CA: Sage.
Sears, P., and R. R. Sears. (1980). 1,528 little geniuses and how they grew. Psychology Today February: 29–43.
Simon, H. A. (1988). Creativity and motivation: a response to Csikszentmihalyi. New Ideas in Psychology 6(2): 177–181.
Simonton, D. K. (1990). Scientific Genius. Cambridge: Cambridge University Press.
Sternberg, R. J., Ed. (1988). The Nature of Creativity. Cambridge: Cambridge University Press.
Terman, L. M. (1947). Subjects of IQ 170 or above. In Genetic Studies of Genius, vol. 4, chap. 21. Stanford: Stanford University Press.
Torrance, E. P. (1988). The nature of creativity as manifest in its testing. In R. J. Sternberg, Ed., The Nature of Creativity. Cambridge: Cambridge University Press, pp. 43–75.
Wallas, G. (1926). The Art of Thought. New York: Harcourt-Brace.

Further Readings

Sternberg, R. J., Ed. (1998). Handbook of Human Creativity. Cambridge: Cambridge University Press.

Creoles

Creoles constitute a unique language group. Other groups consist either of languages derived from a hypothesized protolanguage (as Welsh, English, Greek, and Sanskrit derive from Proto-Indo-European) or of languages derived from a single historical ancestor (as Portuguese, Spanish, and Italian derive from Latin). Creoles, however, have no clear affiliation. Although the term creole has been used to characterize any language with an appearance of language mixture, languages generally accepted as creoles have all arisen in recent centuries from contacts between speakers of unrelated languages, usually as indirect results of European colonialism. They are found throughout the tropics, especially where large numbers of workers were imported as slaves (or, occasionally, indentured laborers) to work on European-owned plantations. Because more work has been done on plantation creoles than on other types (those that developed in racially mixed communities on African and Asian coasts, or those derived initially from maritime contacts in the Pacific), most of what follows will apply primarily to plantation creoles.

Until recently, few scholars (e.g., Schuchardt 1882–91; Hesseling 1933) took these languages seriously; most treated them as deformed versions of European languages. Even to scholars who accepted them as true languages, their origins were controversial: some saw them as radical developments of French, English, and other languages, others as non-European languages thinly disguised by European vocabularies, still others as descendants of an Ur-creole perhaps springing from Afro-Portuguese contacts in the fifteenth and sixteenth centuries, perhaps even dating back to the medieval Lingua Franca. Other issues, such as whether creoles were necessarily preceded by, and derived from, some radically impoverished quasi-language (jargon or early-stage pidgin), or whether children or adults played the major role in their creation, were debated with equal heat but equally little agreement.

More than a century ago, Coelho (1880–86) pointed out that creoles showed structural similarities much greater than would be predicted given their wide distribution, their varied histories, and the large number of languages spoken in the contact situation where they arose. However, creoles would have held little interest for cognitive science had it not been proposed that, due to their mode of origin, they reflected universals of language more directly than “normal” languages.

This hypothesis of a species-specific biological program for language, providing default settings where acquisitional input is greatly reduced and/or deformed (often referred to as “bioprogram theory”), assumes a version of GENERATIVE GRAMMAR proposed by Borer (1983), and now widely accepted by generativists, in which parametric variation in SYNTAX arises solely from variability in grammatical MORPHOLOGY. In the (severely depleted) morphology of creoles, many (often most) grammatical morphemes are distinct from those of any language spoken in the contact situation. This indicates that grammatical morphemes, lost in the pidginization process, are replaced by the creation of new morphemes, often from semantically “bleached” referential items (thus the verb for “go” marks irrealis mode, locative verbs mark imperfective aspect, etc.). Some grammatical functions required immediate recreation of new morphemes, whereas morphemes for other functions were recreated centuries later, if at all (Bickerton 1988). The implications of these facts for the study of universal grammar remain, surprisingly, unexplored.

Bioprogram theory has obvious implications for other fields of inquiry. In LANGUAGE ACQUISITION, it suggests that creoles should be acquired more rapidly and with fewer errors than non-creoles, a prediction that has only recently begun to be tested (Adone 1994). In innateness studies (see INNATENESS OF LANGUAGE), it suggests, in addition to syntactic principles, a default SEMANTICS yielding highly specific analyses of TENSE AND ASPECT, modality, negation, articles, and purposive constructions, among others (an element missing from generative accounts of innateness of language). In the EVOLUTION OF LANGUAGE it suggests that syntactically structured language could have emerged abruptly from a structureless protolanguage (Bickerton 1990a).

Although bioprogram theory has had “an explosive impact . . . upon all aspects of the field” (McWhorter 1996), other approaches to creole genesis continue to flourish. Perhaps the most currently popular alternative is substratism, which claims that languages spoken by non-European ancestors of creole speakers served as sources for characteristically creole structures such as serial verb constructions (e.g., equivalents of “I carry X come give Y” for “I bring X to Y”) and focusing of verbs by fronting and copying (e.g., equivalents of “is break he break the glass” for “he broke the glass” as opposed to merely cracking it). Alleyne (1980), Boretsky (1983), and Holm (1988–89), among others, exemplify this approach. However, substratists attribute these and other creole features to the Kwa language group in West Africa, heavily represented in some creole-speaking areas (Haiti, Surinam), but more lightly, if at all, in others (the Gulf of Guinea, Mauritius) and not at all in still others (Hawaii, the Seychelles).

Substratism’s mirror image, superstratism, the belief that creoles derive their syntax from colloquial versions of European languages, is nowadays largely confined to French creolists (e.g., Chaudenson 1992). More widespread is a “componential” approach (Hancock 1986; Mufwene 1986) claiming that different mixtures of substratal, superstratal, and universal features contributed to different creoles, the mixture being determined in each case by social, historical, and demographic factors. However, because no one has yet proposed a formula for determining the relative contributions of the various components, this approach is at present virtually unfalsifiable, and it constitutes a research program rather than a theory.

Controversies have thus far centered around creole origins. But only recently, after decades of speculation and conjecture, have serious attempts been made to gather historical data (e.g., Arends 1995; Baker 1996). Not all this work is of equal value; claims of a scarcity of children in early colonies are refuted by contemporary statistics (Postma 1990; Bickerton 1990b). Documentation for the earliest stages of creole languages remains sparse, except for Hawaii, where rich data for all phases of the pidgin-creole cycle have been unearthed by Roberts (1995, 1998). These data confirm, at least for Hawaii, the main claims of bioprogram theory: that the creole was created (a) from a primitive, structureless pidgin (b) in a single generation (c) by children rather than adults. Hopefully, ongoing historical research on other creoles will determine the extent to which these conform to Hawaii’s pattern.

See also CULTURAL EVOLUTION; LANGUAGE AND CULTURE

– Derek Bickerton

References

Adone, D. (1994). The Acquisition of Mauritian Creole. Amsterdam: John Benjamins.
Alleyne, M. (1980). Comparative Afro-American. Ann Arbor, MI: Karoma.
Arends, J., Ed. (1995). The Early Stages of Creolization. Amsterdam: John Benjamins.
Baker, P., Ed. (1996). From Contact to Creole and Beyond. London: University of Westminster Press.
Bickerton, D. (1981). Roots of Language. Ann Arbor, MI: Karoma.
Bickerton, D. (1984). The language bioprogram hypothesis. Behavioral and Brain Sciences 7.
Bickerton, D. (1988). Creole languages and the bioprogram. In F. J. Newmeyer, Ed., Linguistics: The Cambridge Survey, vol. 2. Cambridge: Cambridge University Press, pp. 267–284.
Bickerton, D. (1990a). Language and Species. Chicago: University of Chicago Press.
Bickerton, D. (1990b). Haitian demographics and creole genesis. Canadian Journal of Linguistics 35: 217–219.
Borer, H. (1983). Parametric Syntax. Dordrecht: Foris.
Boretsky, N. (1983). Kreolsprachen, Substrate und Sprachwandel. Wiesbaden: Otto Harrassowitz.
Chaudenson, R. (1992). Des Iles, des Hommes, des Langues. Paris: L’Harmattan.
Coelho, F. A. (1880–86). Os dialectos romanicos o neo-latinos na Africa, Asia e America. Boletin da Sociedade de Geografia de Lisboa 2: 129–196; 3: 451–478; 6: 705–755.
Hancock, I. (1986). The domestic hypothesis, diffusion and componentiality: an account of Atlantic Anglophone Creole origins. In P. Muysken and N. Smith, Eds., Substrata Versus Universals in Creole Genesis. Amsterdam: John Benjamins.
Hesseling, D. C. (1933). Hoe onstond de eigenaardige vorm van het Kreols? Neophilologus 18: 209–215.
Holm, J. (1988–89). Pidgins and Creoles. 2 vols. Cambridge: Cambridge University Press.
McWhorter, J. (1996). Review of Pidgins and Creoles, An Introduction, edited by J. Arends, P. Muysken, and N. Smith. Journal of Pidgin and Creole Languages 11: 145–151.
Mufwene, S. (1986). The universalist and substrate hypotheses complement one another. In P. Muysken and N. Smith, Eds., Substrata Versus Universals in Creole Genesis. Amsterdam: John Benjamins.
Postma, J. M. (1990). The Dutch in the Atlantic Slave Trade, 1600–1815. Cambridge: Cambridge University Press.
Roberts, S. J. (1995). Pidgin Hawaiian: a sociohistorical study. Journal of Pidgin and Creole Languages 10: 1–56.
Roberts, S. J. (1998). The role of diffusion in creole genesis. To appear in Language.
Schuchardt, H. (1882–91). Kreolische Studien. Vienna: G. Gerold’s Sohn/K. Tempsky.

Cross-Cultural Variation

See CULTURAL VARIATION; HUMAN UNIVERSALS; LANGUAGE AND CULTURE; LINGUISTIC UNIVERSALS AND UNIVERSAL GRAMMAR

Cultural Consensus Theory

Cultural consensus theory is a collection of formal statistical models designed to measure cultural knowledge shared by a set of respondents. Each respondent is given the same set of items designed to tap the respondents’ shared knowledge. The data consist of a respondent-item matrix containing each respondent’s answers to each of the items. An appropriate cultural consensus model provides estimates of each respondent’s competence (knowledge) as well as an estimate of the culturally correct answer to each item. When the theory was developed in the mid-1980s it was motivated by the observation that when an anthropologist goes to a new culture and asks questions, neither the answers to the questions nor the cultural competence of the respondents is known. It has since been applied to a number of research questions, for example, folk medical beliefs, judgment of personality traits in a college sorority, semiotic characterizations of alphabetic systems, occupational prestige, causes of death, illness beliefs of deaf senior citizens, hot-cold concepts of illness, child abuse, national consciousness in Japan, measuring interobserver reliability, and three-way social network data.

Consensus theory uses much of the accumulated knowledge of traditional psychometric test theory without assuming knowledge of the “correct” answers in advance. Traditional test theory begins with respondent-item “performance” data (i.e., items scored as “correct” or “incorrect”), whereas consensus theory begins with “response” data (items coded as responses given by the respondent, for example, “true” or “false,” without scoring the responses). The different models of the theory depend on the format of the questions, for example, true-false, multiple choice, or ranking. Anthropology is the prototypical social science that can use such a methodology; however, research in other areas of social and behavioral science, such as cognitive psychology, social networks, and sociology, can also benefit from its use.

Cultural consensus theory fits into the category of information-pooling methods in which one has answers from several “experts” to a fixed body of “objective” questions. The goal is to aggregate rationally the experts’ responses to select the most likely “correct answer” to each question, and also to assess one’s degree of confidence in these selections. Cultural consensus theory provides an information-pooling methodology that does not incorporate a researcher’s prior beliefs about the correct answers or any prior calibrations of the experts; instead, it estimates both the respondents’ competencies and the consensus answers from the same set of questionnaire data.

A central concept in the theory is the use of the pattern of agreement or consensus among respondents to make inferences about their differential knowledge of culturally shared information represented in the questions. It is assumed that the correspondence between the answers of any two respondents is solely a function of the extent to which the knowledge of each is correlated with (overlaps) this shared information. In other words, when responses are not based on shared information they are assumed to be uncorrelated. More formally, the model is derived from a set of three basic assumptions that are elaborated appropriately for each question format:

Assumption 1: Common Truth. There is a fixed answer key applicable to all respondents.
Assumption 2: Local Independence. The respondent-item response random variables satisfy conditional independence (conditional on the correct answer key).
Assumption 3: Homogeneity of Items. Each respondent has a fixed competence over all questions.

In some contexts Assumption 3 is replaced with a weaker one, monotonicity, that allows items to differ in difficulty. Basically, monotonicity says that respondents who have more competence on any subset of questions will have more competence on all subsets.

Formal process models have been derived for the analysis of dichotomous, multiple-choice, matching, and continuous item formats. Informal data models have also been developed for rank order and interval level formats. The theory has also been extended to the analysis of multiple cultures by relaxing the first assumption (Common Truth). In this situation each respondent belongs to exactly one culture, but different cultures may have different answer keys.

For very small sets of respondents (six or fewer), iterative maximum likelihood estimates of the parameters can be obtained by existing methods. For example, in the true-false case, the consensus model is equivalent to the two-class latent structure model with the roles of respondents and items interchanged; thus known estimation methods for that model can be used. For other situations, new estimation methods have been developed and assessed with Monte Carlo data. The theory enables the calculation of the minimal number of respondents needed to reconstruct the correct answers as a function of preselected levels of mean cultural competence of the respondents and levels of confidence in the reconstructed answers. It is also possible to estimate the amount of sampling variability among respondents and thus identify “actual” variance in cultural competence.

The theory performs better than a simple majority rule in reconstructing the answer key, especially in cases where there are small numbers of respondents with heterogeneous competence. The success of cultural consensus theory as an information-pooling method can be traced to several factors: (1) it is normally applied to items that tap high-concordance cultural codes where mean levels of consensus are high; (2) the theory allows the differential weighting of the respondents’ responses in reconstructing the answer key; and (3) the theory uses precise assumptions derived from successful formal models in test theory, latent structure analysis, and signal detection theory. The model has been subjected to extensive testing through simulation and Monte Carlo methods.

See also CONCEPTS; CULTURAL PSYCHOLOGY; RADICAL INTERPRETATION

– A. Kimball Romney and William H. Batchelder

Further Readings

Batchelder, W. H., and A. K. Romney. (1986). The statistical analysis of a general Condorcet model for dichotomous choice situations. In B. Grofman and G. Owen, Eds., Information Pooling and Group Decision Making, pp. 103–112. Greenwich, Connecticut: JAI Press.
Batchelder, W. H., and A. K. Romney. (1988). Test theory without an answer key. Psychometrika 53: 71–92.
Batchelder, W. H., and A. K. Romney. (1989). New results in test theory without an answer key. In E. Roskam, Ed., Advances in Mathematical Psychology, vol. 2. Heidelberg and New York: Springer-Verlag, pp. 229–248.
Batchelder, W. H., E. Kumbasar, and J. P. Boyd. (1997). Consensus analysis of three-way social network data. Journal of Mathematical Sociology 22: 29–58.
Brewer, D. D., A. K. Romney, and W. H. Batchelder. (1991). Consistency and consensus: a replication. Journal of Quantitative Anthropology 3: 195–205.
Klauer, K. C., and W. H. Batchelder. (1996). Structural analysis of subjective categorical data. Psychometrika 61: 199–240.
Romney, A. K., W. H. Batchelder, and S. C. Weller. (1987). Recent applications of consensus theory. American Behavioral Scientist 31: 163–177.
Romney, A. K., S. C. Weller, and W. H. Batchelder. (1986). Culture as consensus: a theory of culture and accuracy. American Anthropologist 88: 313–338.
Weller, S. C. (1987). Shared knowledge, intracultural variation, and knowledge aggregation. American Behavioral Scientist 31: 178–193.
Weller, S. C., and N. C. Mann. (1997). Assessing rater performance without a “gold standard” using consensus theory. Medical Decision Making 17: 71–79.
Weller, S. C., L. M. Pachter, R. T. Trotter, and R. D. Baer. (1993). Empacho in four Latino groups: a study of intra- and inter-cultural variation in beliefs. Medical Anthropology 15: 109–136.

Cultural Evolution

Human cultures include among other things mental representations with some between-group differences and within-group similarities. Ecological constraints, historical conditions, and power relations may influence the transmission of culture. A cognitive account assumes that, all these being equal, some trends in culture result from universal properties of human minds. These may account for patterns of change as well as for stability over time and space.

In the past, various forms of evolutionism described cultures as cognitively different and some of them as intrinsically more complex or developed than others (see Ingold 1986). In this view, differences in social and economic complexity between human groups corresponded to cognitive differences between peoples or races. It is clear to modern cognitive scientists that this is untenable. Different environments make different demands on a cognitive system, but there is no hierarchy of complexity or development between them, and the relevant cognitive structures are typical of the human species as a whole.

A cognitive approach must address three related questions: (1) Is cultural transmission similar to genetic transmission? (2) How did hominization lead to the appearance of culture? and (3) How are cultural representations constrained by the human genotype?

1. Are cultural memes like genes? Many authors have suggested that cultural evolution could be modeled on terms derived from natural selection (Campbell 1970). Mentally represented units of information, usually called memes (Dawkins 1976), result in overt behavior, are passed on through social interaction and modified by memory and inference. Different memes may have different cultural fitness values: culture evolves through differential transmission of ideas, values and beliefs (Durham 1991: 156). Coevolution theories describe significant trends in meme transmission using the formal tools of population genetics (Lumsden and Wilson 1981; Boyd and Richerson 1985; Durham 1991). Patterns of transmission and change depend on quantitative factors, such as the frequency of a trait in cultural elders or the number of variants available in a given group, but also on cognitive processes. Durham, for instance, makes a distinction between primary values, a set of evolved, universal propensities toward certain representations, and secondary values, socially acquired expectations concerning the possible consequences of behavior (1991: 200, 432).

An alternative to replication is an epidemiological model, in which cultural evolution is construed as the outcome of mental contagion (Sperber 1985). This approach emphasizes the differences between gene and meme transmission.

irrelevant facts and correlations in the environment. These considerations lead to the research program of EVOLUTIONARY PSYCHOLOGY, which specifies a large number of cognitive adaptations. These are specialized in particular aspects of experience that would have been of plausible relevance to fitness in the environment of evolutionary adaptation, though not necessarily in a modern environment. To some extent, the archaeological record supports this notion of specialized microcapacities appearing side by side and making up an ever more complex mind. However, late developments (in particular, the cultural differences between Neanderthals and modern humans) may also suggest that communication between modules was as important as development in each of them (Mithen 1996).

3. Is culture constrained by genes? What are the connections between the genotype and recurrent features of culture? Coevolution theories have challenged the assumption of early human SOCIOBIOLOGY that people’s concepts and values generally tend to maximize their reproductive potential (Cavalli-Sforza and Feldman 1973). Most cultural
Cultural representations are not literally replicated, variants are adaptively neutral, and many are in fact mal- because human communication is intrinsically inferential adaptive, so coevolution models postulate different evolu- and works by producing publicly available tokens (e.g., tion tracks for genes and memes. Beyond this, one may utterances, gestures) designed to change other agents’ repre- argue that evolved properties of human cognition influence sentations (Sperber and Wilson 1986). Cultural epidemics cultural evolution in two different ways. are distinct from a replication process in that acquisition First, human minds comprise a set of ready-made behav- typically produces variants rather than copies of the repre- ioral recipes that are activated by particular cues in the natu- sentations of others; rough replication, then, is an exception ral and social environment. Whether those cues are present that must be explained, rather than the norm (Sperber 1996). and which capacities are activated may vary from place to 2. What made (and makes) culture possible? There are place. So different environments set parameters differently important differences between various types of animal tradi- for such universal capacities as social exchange, detection of tions and complex, flexible human cultures, which often cheaters, or particular strategies in mate-selection and in the show an accumulation of modifications over generations allocation of parental investment (see Barkow, Cosmides, (i.e., the ratchet effect; Tomasello, Kruger, and Ratner 1993: and Tooby 1992 for a survey of these domains). 508). Humans may have developed very general learning Second, humans develop universal conceptual structures capacities that allow them to acquire whatever information that constrain the transmission of particular representations. can be found in their environment. 
Alternatively, EVOLUTION This can be observed even in beliefs and values whose may have given humans a more numerous and complex set overt content seems culturally variable. Children gradually of specialized cognitive capacities. develop a set of quasi-theoretical, domain-specific assump- The first type of explanation can be found in Donald’s tions about the different types of objects in the world as well account of the appearance of cognitive plasticity as a crucial as expectations about their observable and underlying prop- evolutionary change. The primate mind became modern by erties. The experimental evidence demonstrates the effects developing a powerful learning device without constraining of such principles in domains like THEORY OF MIND, the per- restrictions as to the range of mental contents that could be ception of mechanical causation or the specific properties of learned (Donald 1993). An explanation in terms of more living things (Hirschfeld and Gelman 1994). This intuitive specific capacities is Tomasello, Kruger, and Ratner’s ontology has direct effects on the acquisition of cultural rep- (1993) account of cultural learning, as distinct from social resentations. learning based on IMITATION and found in higher primates. In some domains, information derived from cultural Cultural learning requires mind-reading and perspective- input is acquired inasmuch as it tends to enrich early skele- taking capacities. Obviously, these capacities would have tal principles. This is the case for number systems, for been boosted by the appearance of verbal communication. instance, as cultural input provides names for intuitive con- One may push this further and argue that the appearance cepts of numerosity (Gallistel and Gelman 1992). 
In the of culture depended, not on a more powerful general learn- same way, FOLK PSYCHOLOGY is built by using cultural ing capacity, but on a multiplication of specialized capaci- input, for instance about motivation, emotion, through-pro- ties (Rozin 1976, Tooby and Cosmides 1989). This would cesses, and so on, that provide explanations for the intui- have been less costly in evolutionary terms. It only required tions delivered by our theory of mind. In folk biology, too, gradual addition of small modules rather than the sudden cultural input that is spontaneously selected tends to enrich appearance of a general and flexible mind. Also, it makes intuitive principles about the taxonomic ordering of living more computational sense. An unbiased all-purpose learn- species or possession of an essence as a feature of each spe- ing capacity could be overloaded with many adaptively cies (Atran 1990). Even such social constructs as kinship Cultural Psychology 211 terms or notions of family and race can be construed as Cavalli-Sforza, L. L., and M. W. Feldman. (1973). Cultural versus biological inheritance: phenotypic transmission from parents to enriching an intuitive apprehension of social categories children. American Journal of Human Genetics 25: 618–637. (Hirschfeld 1994). Dawkins, R. (1976). The Selfish Gene. New York: Oxford Univer- In other domains, cultural input is selectively attended to sity Press. inasmuch as it violates the expectations of intuitive ontol- Donald, M. (1993). Precis of origins of the modern mind: three ogy. Religious ontologies, for instance, postulate agents stages in the evolution of culture and cognition. Behavioral and whose physical or biological properties are counterintuitive, Brain Sciences 16: 737–791. given ordinary expectations about intentional agents. Such Durham, W. (1991). Coevolution: Genes, Cultures and Human combinations are very few in number and account for most Diversity. Stanford, CA: Stanford University Press. 
cultural variants in religious systems (Boyer 1994). Their Gallistel, C. R., and R. Gelman. (1992). Preverbal and verbal presence in individual religious representations can be dem- counting and computation. Cognition 44: 79–106. Hirschfeld, L. A. (1994). The acquisition of social categories. In onstrated experimentally (Barett and Keil 1996). L. A. Hirschfeld and S. A. Gelman, Eds., Mapping The Mind: In some domains more complex processes are involved. Domain-Specificity in Culture and Cognition. New York: Cam- This is the case for scientific theories and other forms of bridge University Press. scholarly knowledge that diverge from intuitive ontology. Hirschfeld, L. A., and S. A. Gelman, Eds. (1994). Mapping The Such systems of representations generally require consider- Mind: Domain-Specificity in Culture and Cognition. New York: able social support (intensive tuition and specialized institu- Cambridge University Press. tions like schools). They generally include an explicit Ingold, T. (1986). Evolution and Social Life. Cambridge: Cam- description of their divergence from intuitive ontology, and bridge University Press. therefore a METAREPRESENTATION of ordinary representa- Lumsden, C. J., and E. O. Wilson. (1981). Genes, Minds and Cul- tions about the natural world. This is why such systems typ- ture. Cambridge, MA: Harvard University Press. Mithen, S. (1996). The Prehistory of the Mind. London: Thames ically require LITERACY, which boosts metarepresentational and Hudson. capacities and provides external memory storage, allowing Rozin, P. (1976). The evolution of intelligence and access to the for incremental additions to cultural representations. cognitive unconscious. In J. M. Sprague and A. N. Epstein, Human cognition comprises a series of specialized Eds., Progress in Psychobiology and Physiological Psychology. capacities. Transmission patterns probably vary as a func- New York: Academic Press. 
tion of which domain-specific conceptual predispositions Sperber, D. (1985). Anthropology and psychology: towards an epi- are activated. So there may be no overall process of cultural demiology of representations. Man 20: 73–89. transmission, but a series of domain-specific cognitive Sperber, D. (1996). Explaining Culture: A Naturalistic Approach. tracks of transmission. Models of cultural evolution are tau- Oxford: Blackwell. tological if they state only that whatever got transmitted Sperber, D., and D. Wilson. (1986). Relevance, Communication and Cognition. New York: Academic Press. must have been better than what did not (Durham 1991: Tomasello, M., A. C. Kruger, and H. H. Ratner. (1993). Cultural 194). This is where cognitive models are indispensable. learning. Behavioral and Brain Sciences 16: 495–510. Experimental study of cognitive predispositions provides Tooby, J., and L. Cosmides. (1989). Evolutionary psychology and independent evidence for the underlying mechanisms of the generation of culture (i): theoretical reflections. Ethology cultural evolution. and Sociobiology 10: 29–49. See also ADAPTATION AND ADAPTATIONISM; COGNITIVE ARCHAEOLOGY; COGNITIVE ARTIFACTS; DOMAIN SPECIFIC- Cultural Models ITY; NAIVE BIOLOGY; NAIVE MATHEMATICS —Pascal Boyer See METAPHOR AND CULTURE; MOTIVATION AND CULTURE References Cultural Psychology Atran, S. (1990). Cognitive Foundations of Natural History: Towards an Anthropology of Science. Cambridge: Cambridge The most basic assumption of cultural psychology can be University Press. traced back to the eighteenth-century German romantic phi- Barett, J. L., and F. C. Keil. (1996). Conceptualizing a non-natural losopher Johann Gottfried von Herder, who proposed that entity: anthropomorphism in God concepts. Cognitive Psychol- “to be a member of a group is to think and act in a certain ogy 31: 219–247. way, in the light of particular goals, values, pictures of the Barkow, J., L. Cosmides, and J. Tooby, Eds. (1992). 
The Adapted world; and to think and act so is to belong to a group” (Ber- Mind: Evolutionary Psychology and the Generation of Culture. New York: Oxford University Press. lin 1976: 195). During the past twenty-five years there has Boyd, R., and P. Richerson. (1985). Culture and the Evolutionary been a major renewal of interest in cultural psychology, pri- Process. Chicago: University of Chicago Press. marily among anthropologists (D’Andrade 1995; Geertz Boyer, P. (1994). The Naturalness of Religious Ideas: A Cognitive 1973; Kleinman 1986; Levy 1973; Shore 1996; Shweder Theory of Religion. Berkeley and Los Angeles: University of 1991; Shweder and LeVine 1984; White and Kirkpatrick California Press. 1985), psychologists (Bruner 1990; Cole 1996; Goodnow, Campbell, D. T. (1970). Natural selection as an epistemological Miller, and Kessel 1995; Kitayama and Markus 1994; model. In N. Naroll and R. Cohen, Eds., A Handbook of Markus and Kitayama 1991; Miller 1984; Nisbett and Method in Cultural Anthropology. Garden City, NY: Chapman Cohen 1995; Russell 1991; Yang forthcoming) and linguists and Hall. 212 Cultural Psychology sometimes referred to as “intentional” or “symbolic” states. (Goddard 1997; Wierzbicka 1992a), although relevant work Cultural psychology is the study of those intentional and has been done by philosophers as well (Harre 1986; MacIn- symbolic states of individuals (a belief in a reincarnating tyre 1981; Taylor 1989). The contemporary field of cultural soul, a desire to purify one’s soul and protect it from pollu- psychology is concerned, as was Herder, with both the psy- tions of various kinds) that are part and parcel of a particular chological foundations of cultural communities and the cul- cultural conception of things made manifest in, and tural foundations of mind. It is concerned with the way acquired by means of involvement with, the speech, laws culture and psyche make each other up, over the history of and customary practices of some group. 
the group and over the life course of the individual. It has been noted by Clifford Geertz, and by others inter- The word “cultural” in the phrase “cultural psychology” ested in lived realities, that “one does not speak language; refers to local or community-specific conceptions of what is one speaks a language.” Similarly, one does not categorize; true, good, beautiful, and efficient (“goals, values and pic- one categorizes something. One does not want; one wants tures of the world”) that are socially inherited, made mani- something. On the assumption that what you think about fest in the speech, laws, and customary practices of can be decisive for how you think, the focus of cultural psy- members of some self-monitoring group, and which serve to chology has been on content-laden variations in human mark a distinction between different ways of life (the Amish mentalities rather than on the abstract common denomina- way of life, the way of life of Hindu Brahmans in rural tors of the human mind. Cultural psychologists want to India, the way of life of secular urban middle-class Ameri- know why Tahitians or Chinese react to “loss” with an expe- cans). rience of headaches and back pains rather than with the A community’s cultural conception of things will usually experience of “sadness” so common in the Euro-American include some vision of the proper ends of life; of proper val- cultural region (Levy 1973; Kleinman 1986). They seek to ues; of proper ways to speak; of proper ways to discipline document population-level variations in the emotions that children; of proper educational goals; of proper ways to are salient or basic in the language and feelings of different determine kinship connections and obligations; of proper peoples around the world (Kitayama and Markus 1994; gender and authority relations within the family; of proper Russell 1991; Shweder 1993; Wierzbicka 1992a). 
They aim foods to eat; of proper attitudes toward labor and work, sex- to understand why Southern American males react more uality and the body, and members of other groups whose violently to insult than Northern American males (Nisbett beliefs and practices differ from one’s own; of proper ways and Cohen 1996) and why members of sociocentric subcul- to think about salvation; and so forth. tures perceive, classify, and moralize about the world differ- A community’s cultural conception of things will also ently than do members of individualistic subcultures usually include some received, favored, or privileged “reso- (Markus and Kitayama 1991; Triandis 1989; Shweder lution” to a series of universal, scientifically undecidable, 1991). and hence existential questions. These are questions with It is precisely because cultural psychology is the study of respect to which “answers” must be given for the sake of the content-laden intentional/symbolic states of human social coordination and cooperation, whether or not they are beings that cultural psychology should be thought of as the logically or ultimately solvable by human beings, questions study of peoples (such as Trobriand Islanders or Chinese such as “What is me and what is not me?”, “What is male and Mandarins), not people (in general or in the abstract). 
The what is female?”, “How should the burdens and benefits of psychological subject matter definitive of cultural psychol- life be fairly distributed?”, “Are there community interests or ogy thus consists of those aspects of the mental functioning cultural rights that take precedence over the freedoms (of of individuals that have been ontogenetically activated and speech, conscience, association, choice) associated with historically reproduced by means of some particular cultural individual rights?” and “When in the life of a fetus or child conception of things, and by virtue of participation in, does social personhood begin?” Locally favored and socially observation of, and reflection on the activities and practices inherited “answers” to such questions are expressed and of a particular group. This definition of research in cultural made manifest (and are thus discernible) in the speech, laws, psychology sets it in contrast (although not necessarily in and customary practices of members of any self-monitoring opposition) to research in general psychology, where the group. In sum, local conceptions of the true, the good, the search is for points of uniformity in the psychological func- beautiful, the efficient, plus discretionary “answers” to cog- tioning of people around the world. Without denying the nitively undecidable existential questions, all made apparent existence of some empirically manifest psychological uni- in and through practice, is what the word “cultural” in “cul- formities across all human beings, the focus in cultural psy- tural psychology” is all about. chology is on differences in the way members of different The word “psychology” in the phrase “cultural psychol- cultural communities perceive, categorize, remember, feel, ogy” refers broadly to mental functions, such as perceiving, want, choose, evaluate, and communicate. 
The focus is on categorizing, reasoning, remembering, feeling, wanting, psychological differences that can be traced to variations in choosing, valuing, and communicating. What defines a communally salient “goals, values, and pictures of the function as a “mental” function per se (over and above, or in world.” contrast to a “physical” function) has something to do with Cultural psychology is thus the study of the way the the capacity of the human mind to grasp ideas, to do things human mind can be transformed, given shape and defini- for reasons or with a purpose in mind, to be conscious of tion, and made functional in a number of different ways that alternatives and aware of the content or meaning of its own are not uniformly distributed across cultural communities experience. This is one reason that “mental” states are Cultural Relativism 213 around the world. “Universalism without the uniformity” is Yang, K.-S. (1997). Indigenizing westernized Chinese psychology. In M. Bond, Ed., Working at the Interface of Culture: Twenty one of the slogans cultural psychologists sometimes use to Lives in Social Science. London: Routledge. talk about “psychic unity,” and about themselves. See also CULTURAL EVOLUTION; CULTURAL SYMBOLISM; Further Readings ETHNOPSYCHOLOGY; HUMAN UNIVERSALS; INTENTIONALITY Fiske, A. (1992). The Four Elementary Forms of Sociality: Frame- —Richard A. Shweder work for a Unified Theory of Social Relations. New York: Free Press. References Jessor, R., A. Colby, and R. Shweder, Eds. (1996). Ethnography and Human Development: Context and Meaning in Social Berlin, I. (1976). Vico and Herder. London: Hogarth. Inquiry. Chicago: University of Chicago Press. Bruner, J. (1990). Acts of Meaning. Cambridge, MA: Harvard Uni- Kakar, S. (1978). The Inner World: A Psychoanalytic Study of versity Press. Childhood and Society in India. New York: Oxford University Cole, M. (1996). Cultural Psychology: A Once and Future Disci- Press. pline. 
Cambridge, MA: Harvard University Press. Kakar, S. (1996). The Colors of Violence: Cultural Identities, Reli- D’Andrade, R. (1995). The Development of Cognitive Anthropol- gion and Conflict. Chicago: University of Chicago Press. ogy. Cambridge: Cambridge University Press. Lebra, T. (1992). Culture, Self and Communication. Ann Arbor: Geertz, C. (1973). The Interpretation of Cultures. New York: Basic University of Michigan Press. Books. Lucy, J. (1992). Grammatical Categories and Cognition: A Case Goddard, C. (1997). Contrastive semantics and cultural psychol- Study of the Linguistic Relativity Hypothesis. New York: Cam- ogy: “Surprise” in Malay and English. Culture and Psychology bridge University Press. 2: 153–181. Lutz, C., and White, G. (1986). The anthropology of emotions. Goodnow, J., P. Miller, and F. Kessel. (1995). Cultural practices as Annual Review of Anthropology 15: 405–436. contexts for development. In New Directions for Child Devel- MacIntyre, A. (1981). After Virtue: A Study in Moral Theory. opment. Vol. 67. San Francisco: Jossey-Bass. Notre Dame: University of Notre Dame Press. Harre, R. (1986). The Social Construction of Emotions. Oxford: Markus, H., S. Kitayama, and R. Heiman. (1998). Culture and Blackwell. “basic” psychological principles. In E. T. Higgins and A. W. Kitayama, S., and H. Markus, Eds. (1994). Emotion and Culture: Kruglanski, Eds., Social Psychology: Handbook of Basic Prin- Empirical Studies of Mutual Influence. Washington, DC: Amer- ciples. New York: Guilford. ican Psychological Association. Miller, J. (1994). Cultural psychology: bridging disciplinary Kleinman, A. (1986). Social Origins of Distress and Disease. New boundaries in understanding the cultural grounding of self. In P. Haven: Yale University Press. Bock, Ed., Handbook of Psychological Anthropology. West- Levy, R. (1973). Tahitians: Mind and Experience in the Society port, CT: Greenwood Press. Islands. Chicago: University of Chicago Press. Much, N. (1995). 
Cultural psychology. In J. Smith, R. Harre, and L. Markus, H., and S. Kitayama. (1991). Culture and the self: impli- van Langenhove, Eds., Rethinking Psychology. London: Sage. cations for cognition, emotion, and motivation. Psychological Shweder, R. (1998). Welcome to Middle Age! (And Other Cultural Review 98: 224–253. Fictions.) Chicago: University of Chicago Press. Miller, J. (1984). Culture and the development of everyday social Shweder, R., J. Goodnow, G. Hatano, R. LeVine, H. Markus, and explanation. Journal of Personality and Social Psychology 46: P. Miller. (1997). The cultural psychology of development: one 961–978. mind, many mentalities. In W. Damon, Ed., Handbook of Child Nisbett, R., and D. Cohen. (1995). The Culture of Honor: The Psychology, vol. 1: Theoretical Models of Human Develop- Psychology of Violence in the South. Boulder, CO: Westview ment. New York: John Wiley and Sons. Press. Shweder, R., and M. Sullivan. (1993). Cultural psychology: who Russell, J. (1991). Culture and the categorization of emotions. Psy- needs it? Annual Review of Psychology 44: 497–523. chological Bulletin 110: 426–450. Stigler, J., R. Shweder, and G. Herdt. (1990). Cultural Psychology: Shore. B. (1996). Culture in Mind: Cognition, Culture and the Essays on Comparative Human Development. Chicago: Uni- Problem of Meaning. New York: Oxford University Press. versity of Chicago Press. Shweder, R. (1991). Thinking Through Cultures: Expeditions in Wierzbicka, A. (1992). Defining emotion concepts. Cognitive Sci- Cultural Psychology. Cambridge, MA: Harvard University ence 16: 539–581. Press. Wierzbicka, A. (1993). A conceptual basis for cultural psychology. Shweder, R. (1993). The cultural psychology of the emotions. In Ethos: Journal of the Society for Psychological Anthropology M. Lewis and J. Haviland, Eds., Handbook of Emotions. New 21: 205–231 York: Guilford. Shweder, R., and R. LeVine. (1984). Culture Theory: Essays on Mind, Self and Emotion. 
New York: Cambridge University Cultural Relativism Press. Taylor, C. (1989). Sources of the Self: The Making of Modern Iden- tities. Cambridge, MA: Harvard University Press. How are we to make sense of the diversity of beliefs and Triandis, H. (1989). The self and social behavior in differing cul- ethical values documented by anthropology’s ethnographic tural contexts. Psychological Review 93: 506–520. record? Cultural relativism infers from this record that sig- Wierzbicka, A. (1992a). Talking about emotions: semantics, cul- nificant dimensions of human experience, including moral- ture and cognition. Cognition and Emotion 6: 285–319. ity and ethics, are inherently local and variable rather than White, G., and J. Kirkpatrick, Eds. (1985). Person, Self and Expe- universal. Most relativists (with the exception of develop- rience: Exploring Pacific Ethnopsychologies. Berkeley: Uni- mental relativists discussed below) interpret and evaluate versity of California Press. 214 Cultural Relativism such diverse beliefs and practices in relation to local cultural Both a philosophical and a moral stance, cultural relativ- frameworks rather than universal principles. ism makes two different sorts of claims: (1) an ontological There are many variations on the theme of cultural rela- claim about the nature of human understanding, a claim tivism. Six important variants are described below: subject to empirical testing and verification, and (2) a 1. Epistemological relativism, the most general phrasing moral/political claim advocating tolerance of divergent cul- of cultural relativism, proposes that human experience is tural styles of thought and action. mediated by local frameworks for knowledge (Geertz Cultural relativism implies a fundamental human psychic 1973). Most epistemological relativism assumes that experi- diversity. Such diversity need not preclude important uni- enced reality is largely a social and cultural construction and versals of thought and feeling. 
so this position is often called “social constructionism” (Berger and Luckmann 1966).

2. Logical relativism claims that there are no transcultural and universal principles of rationality, logic and reasoning. This claim was debated in the 1970s in a series of publications featuring debates among English philosophers, anthropologists, and sociologists about the nature and universality of rationality in logical and moral judgment (B. Wilson 1970).

3. Historical relativism views historical eras as a cultural and intellectual history of diverse and changing ideas, paradigms, or worldviews (Burckhardt 1943; Kuhn 1977).

4. Linguistic relativism focuses on the effects of particular grammatical and lexical forms on habitual thinking and classification (Whorf 1956; Lucy 1992).

5. Ethical relativism claims that behavior can be morally evaluated only in relation to a local framework of values and beliefs rather than universal ethical norms (Ladd 1953). Proponents advocate tolerance in ethical judgments to counter the presumed ethnocentrism of universalistic judgments (Herskovitz 1972; Hatch 1983). Opponents claim that extreme ethical relativism is amoral and potentially immoral since it can justify, by an appeal to local or historical context, any action, including acts like genocide that most people would condemn (Vivas 1950; Norris 1996). This debate engages the highly visible discourse on the doctrine of universal human rights, and the extent to which it reflects natural rights rather than the cultural values of a politically dominant community (R. Wilson 1997). Important and emotionally salient issues engaged in this debate include the status of women, abortion, religious tolerance, the treatment of children, arranged marriages, female circumcision, and capital punishment. A common thread linking many of these issues is the status of “the individual” and by implication social equality versus social hierarchy, and cultural relativism can be used to justify relations of inequality (Dumont 1970).

6. Distinct from evolutionary psychologists mentioned above are developmental relativists who ascribe differences in thought or values to different stages of human development, either in terms of evolutionary stages or developmental differences in moral reasoning between individuals. A commonplace assumption of Victorian anthropology, evolutionism still has echoes in the genetic epistemology of developmental psychologists like PIAGET, Kohlberg, and Werner (Piaget 1932; Kohlberg 1981, 1983; Werner 1948/1964). Genetic epistemology acknowledges the cultural diversity and relativity of systems of reasoning and symbolism but links these differences to a universalistic developmental (and, by common implication, evolutionary) trajectory.

Relativism and universalism are often seen as mutually exclusive. At the relativist end of the spectrum are proponents of CULTURAL PSYCHOLOGY who argue that the very categories and processes by which psychologists understand the person are themselves cultural constructs, and who imply that academic psychology is actually a Western ETHNOPSYCHOLOGY (Shweder 1989). From this perspective comparative or cross-cultural psychology becomes impossible, inasmuch as the psychology of each community would need to be studied in its own analytical terms. At the universalist end of the spectrum is EVOLUTIONARY PSYCHOLOGY, which looks at human cognitive architecture as having evolved largely during the upper Paleolithic, subject to the general Darwinian forces of natural selection and fitness maximization (Barkow, Cosmides, and Tooby 1992). Local cultural differences are viewed as relatively trivial compared with the shared cognitive abilities that are the products of hominid evolution.

Many cognitive anthropologists see in the relativist/universalist distinction a false dichotomy. An adequate model of mind must encompass both universal and variable properties. Although they acknowledge the importance of a shared basic cognitive architecture and universal processes of both information processing and meaning construction, many cognitive anthropologists do not see cultural variation as trivial but stress the crucial mediating roles of diverse social environments and variable cultural models in human cognition (D’Andrade 1987; Holland and Quinn 1987; Hutchins 1996; Shore 1996).

Although cultural relativism has rarely been treated as a problem of cognitive science, COGNITIVE ANTHROPOLOGY is a useful perspective for reframing the issues of cultural relativism. For cognitive anthropologists, a cultural unit comprises a population sharing a large and diverse stock of cultural models, which differ from community to community. Once internalized, cultural models become conventional cognitive models in individual minds. Cultural models thus have a double life as both instituted models (public institutions) and conventional mental models (individuals’ mental representations of public forms; Shore 1996). Other kinds of cognitive models include “hard-wired” schemas (like those governing facial recognition) and personal/idiosyncratic mental models that differ from person to person. Thus viewed, culture is not a bounded unit but a dynamic social distribution of instituted and mental models.

When culture is conceived as a socially distributed system of models, the sources of cultural relativity become more complex and subtle but are easier to specify. Rather than draw simple oppositions between distinct cultures, we can specify which models (rather than which cultures) are different and how they differ. Thus the similarity or difference between communities is not an all-or-nothing phenomenon, but is a matter of particular differences or similarities. In addition, significant conflict or contradiction among cultural models within a community becomes easier to account for, as do conflicts between cultural models and personal models or between cultural models and relatively unmodeled (diffuse/inarticulate) feelings and desires. Such internal conflicts do not argue against the intersubjective sharing of cultural models within a community or the important difference between communities. But they suggest a softening of the oppositions between discrete cultures that has been the hallmark of much of the discourse of cultural relativism.

Many within-culture conflicts suggest existential dilemmas that have no final resolution (e.g., autonomy versus dependency needs, equality and hierarchy; Fiske 1990; Nuckolls 1997). There are models and countermodels, as in political discourse. Cultural models sometimes provide temporary resolutions, serving as salient cognitive and emotional resources for clarifying experience. Sometimes, as in religious ritual, cultural models simply crystallize contradictions, representing them as sacred paradox. Such resolutions are never complete, and never exhaust the experience of individuals. In this way, the relativity between cultures is complemented by a degree of experiential relativity within cultures (variation and conflict) and periodically within individuals (ambivalence).

See also COLOR CATEGORIZATION; CULTURAL SYMBOLISM; CULTURAL VARIATION; LINGUISTIC RELATIVITY HYPOTHESIS; MOTIVATION AND CULTURE; SAPIR, EDWARD

—Bradd Shore

References

Barkow, J. H., L. Cosmides, and J. Tooby, Eds. (1992). The Adapted Mind: Evolutionary Psychology and the Generation of Culture. New York: Oxford University Press.
Berger, P., and T. Luckmann. (1966). The Social Construction of Reality: A Treatise on the Sociology of Knowledge. Garden City: Doubleday.
Burckhardt, J. (1943). Reflections on History. Trans. M. D. Hottinger. London: G. Allen and Unwin.
D’Andrade, R. (1987). Cultural meaning systems. In R. Shweder and R. LeVine, Eds., Culture Theory: Essays on Mind, Self and Emotion, pp. 88–119. Cambridge: Cambridge University Press.
Dumont, L. (1970). Homo Hierarchicus. Chicago: University of Chicago Press.
Fiske, A. P. (1990). Relativity within Moose (“Mossi”) culture: four incommensurable models for social relationships. Ethos 18(2): 180–204.
Geertz, C. (1973). The Interpretation of Cultures. New York: Basic Books.
Hatch, E. (1983). Culture and Morality: The Relativity of Values in Anthropology. New York: Columbia University Press.
Herskovitz, M. J. (1972). Cultural Relativism: Perspectives in Cultural Pluralism. New York: Random House.
Holland, D., and N. Quinn, Eds. (1987). Cultural Models in Language and Thought. Cambridge: Cambridge University Press.
Hutchins, E. (1996). Cognition in the Wild. Cambridge, MA: MIT Press.
Kohlberg, L. (1981). Essays in Moral Development, vol. 1: The Philosophy of Moral Development. New York: Harper and Row.
Kohlberg, L. (1983). Essays in Moral Development, vol. 2: The Psychology of Moral Development. New York: Harper and Row.
Kuhn, T. (1977). The Structure of Scientific Revolutions. 2nd ed. Chicago: University of Chicago Press.
Ladd, J. (1953). Ethical Relativism. Belmont, CA: Wadsworth.
Lakoff, G. (1987). Women, Fire and Dangerous Things. Chicago: University of Chicago Press.
Lucy, J. (1992). Language Diversity and Thought: A Reformulation of the Linguistic Relativity Hypothesis. New York: Cambridge University Press.
Nuckolls, C. (1997). Culture and the Dialectics of Desire. Madison: University of Wisconsin Press.
Piaget, J. (1932). The Development of Moral Reasoning in Children. New York: Free Press.
Shore, B. (1996). Culture in Mind: Cognition, Culture and the Problem of Meaning. New York: Oxford University Press.
Shweder, R. (1989). Cultural psychology: what is it? In J. Stigler, R. Shweder, and G. Herdt, Eds., Cultural Psychology: The Chicago Symposia on Culture and Development. New York: Cambridge University Press, pp. 1–46.
Vivas, E. (1950). The Moral Life and the Ethical Life. Chicago: University of Chicago Press.
Werner, H. (1948/1964). Comparative Psychology of Mental Development. Rev. ed. New York: International University Press.
Whorf, B. L. (1956). Language, Thought and Reality. Cambridge, MA: MIT Press.
Wilson, B., Ed. (1970). Rationality. Oxford: Basil Blackwell.
Wilson, R., Ed. (1997). Human Rights, Cultural Context: Anthropological Perspectives. London and Chicago: Pluto Press.

Further Readings

Benedict, R. (1934). Patterns of Culture. Boston: Houghton Mifflin.
Fernandez, J. (1990). Tolerance in a repugnant world and other dilemmas of cultural relativism in the work of Melville J. Herskovitz. Ethos 18(2): 140–164.
Geertz, C. (1984). Distinguished lecture: anti-anti relativism. American Anthropologist 86(2): 263–278.
Hartung, F. E. (1954). Cultural relativity and moral judgments. Philosophy of Science 21: 118–126.
Horton, R. (1967). African traditional thought and Western science. Africa 37: 50–71, 155–187.
Lucy, J. (1985). Whorf’s view of the linguistic mediation of thought. In E. Mertz and R. Parmentier, Eds., Semiotic Mediation: Sociological and Psychological Perspectives. Orlando, FL: Academic Press, pp. 73–98.
Norris, C. (1996). Reclaiming Truth: Contribution to a Critique of Cultural Relativism. Durham: Duke University Press.
Overing, J., Ed. (1985). Reason and Morality. London: Tavistock.
Schoeck, H., and J. M. Wiggens. (1961). Relativism and the Study of Man. Princeton, NJ: Van Nostrand.
Shweder, R. (1990). Ethical relativism: is there a defensible version? Ethos 18(2): 205–218.
Shweder, R., M. Mahapatra, and J. G. Miller. (1987). Culture and moral development. In J. Kagan and S. Lamb, Eds., The Emergence of Morality in Young Children. Chicago: University of Illinois Press, pp. 1–90.
Spiro, M. (1986). Cultural relativism and the future of anthropology. Cultural Anthropology 1(3): 259–286.
Cultural Symbolism

At about 50,000 B.P., the archaeological record shows a sudden change in the nature and variety of artifacts produced by modern humans, with the massive production of cave paintings, elaborate artifacts of no practical utility, the use of ocher, the introduction of burial practices, and so on. Here we have the first traces of the emergence of cultural symbolism (although the phenomenon itself may have appeared earlier). The term has a wider extension for anthropologists who are not limited to preserved artifacts and can observe such cultural products as public utterances, ritual, clothing, music, etiquette, dance, and prohibitions. All these productions have three main characteristics: (1) their particular features are to a large extent unmotivated by immediate survival needs and are often devoid of any practical purpose; (2) they seemingly involve a capacity to “reify” mental representations, so that certain communicative or memory effects can be achieved by producing material objects and observable events; (3) their features vary from one human group to another.

In the social sciences, the loose term symbolism was applied to all such productions for a simple reason: although they often seemed to convey some overt “meaning,” this meaning did not seem sufficient to explain their occurrence or transmission. A common strategy, then, was to explain symbolism as a symptom of social relations. Durkheim, for instance, treats religion as a symbol of social order and superhuman agency as a symbol of society itself. For Marx, an ideology symbolizes (and distorts) social relations. Alternatively, hermeneutic approaches to culture emphasize common concerns of mankind that find their expression in cultural symbolism. Religion, for instance, is described as expressing universal metaphysical questions or anxieties. The common thread in these very different frameworks is that cultural productions stand for something else, which may or may not be accessible to people’s consciousness and which is encoded in public representations.

From a cognitive perspective, the main question is to account for the capacities that make symbolism possible, and for the causes of acquisition and transmission of particular patterns (see CULTURAL EVOLUTION). An important attempt in this direction can be found in D. Sperber’s cognitive account of symbolism (Sperber 1975, 1996). For Sperber, certain cultural phenomena are “symbolic” to particular actors inasmuch as their rational interpretation does not lead to a limited and predictable set of inferences. This triggers a search for conjectural representations that, if true, would make a rational interpretation possible. The production and use of public representations is then described in terms of communicative intentions. What people do with public representations, just as they do with verbal utterances, is to engage in a goal-directed, relevance-optimizing search for possible descriptions of the communicator’s intentions (see RELEVANCE).

This conception has two interesting consequences. First, it suggests that there are no such things as “symbols” as a particular class of cultural products. Any conceptual or perceptual item can become symbolic, if there is some index that a rational interpretation is unavailable or insufficient. This conception of cultural symbolism also implies that we cannot assume that material and other public symbols “contain” meanings in the form of a code, in much the same way as the letters of a writing system contain phonological information. Symbolism does not work in that way in our species. Bees or vervet monkeys do produce signals that are reliable indicators of the states of affairs that caused their production. Cultural symbols, much like human communication in general, trigger inferential processes that are not constrained by the features of the public representation itself, but by what these features reveal of the communicator’s communicative intentions. In other words, you cannot achieve communication (and this extends to cultural “meanings”) unless you activate a rich intuitive psychology (see THEORY OF MIND). This argument finds some support from studies showing that even artifact production among humans requires such perspective-taking and inferences about the other’s intentions (see, e.g., Tomasello, Kruger, and Ratner 1993).

Cultural symbolism often combines universal, intuitive concepts (e.g., a theory of physical objects as cohesive, a theory of living things as internally propelled) in counterintuitive ways (e.g., a theory of superhuman agents as nonmaterial and nonbiological; Boyer 1994). This, too, requires an ability to rearrange representations that derive from basic cognitive dispositions. This is why the appearance of cultural symbolism has been linked to the emergence of a “metarepresentational” capacity riding piggyback on more specialized “modular” cognitive systems (Mithen 1996).

All this may explain why, as soon as it appears in the archaeological record and wherever it is found in the anthropological evidence, cultural symbolism is “cultural” in the sense of varying between human groups. Humans tend to talk about the same topics the world over and make use of a similar evolved cognitive architecture (see EVOLUTIONARY PSYCHOLOGY). However, inferences produced on the basis of a public representation depend on cues revealing intentions, which themselves may largely depend on the group’s history, in particular on the fact that certain public representations, or elements thereof, have been used in the same group before. Such historical variations may result in different implicit schemata and therefore in differences of cultural “style” between groups (see CULTURAL PSYCHOLOGY).

See also CULTURAL VARIATION; RELIGIOUS IDEAS AND PRACTICES

—Pascal Boyer

References

Boyer, P. (1994). Cognitive constraints on cultural representations: Natural ontologies and religious ideas. In L. A. Hirschfeld and S. Gelman, Eds., Mapping the Mind: Domain-specificity in Culture and Cognition. New York: Cambridge University Press.
Mithen, S. (1996). The Prehistory of the Mind. London: Thames and Hudson.
Sperber, D. (1975). Rethinking Symbolism. Cambridge: Cambridge University Press.
Sperber, D. (1996). Explaining Culture: A Naturalistic Approach. Oxford: Blackwell.
Tomasello, M., A. C. Kruger, and H. H. Ratner. (1993). Cultural Learning. Behavioral and Brain Sciences 16: 495–510.

Cultural Universals

See CULTURAL RELATIVISM; HUMAN UNIVERSALS; LINGUISTIC UNIVERSALS AND UNIVERSAL GRAMMAR

Cultural Variation

Cultural variation refers to differences in knowledge or belief among individuals. This article focuses on intracultural variation, on differences in belief among individual members of the same cultural group. For example, Americans differ in their environmental beliefs and values (Kempton, Boster, and Hartley 1995); Mexicans differ in their beliefs about disease (Weller 1984); Ojibway differ in their knowledge of hypertension (Garro 1988); Aguaruna women differ in their knowledge of the names of manioc varieties (Boster 1985); and Americans differ in their familiarity, vocabulary size, and recognition ability in various semantic domains (Gatewood 1984). Cross-cultural variation, the general differences between cultural groups, is discussed elsewhere (see HUMAN UNIVERSALS, CULTURAL RELATIVISM, and COGNITIVE ANTHROPOLOGY).

Cultural variation (studied by anthropologists) contrasts both with sociolinguistic variation (studied by linguists) and with individual differences (studied by psychologists). “Cultural (or cognitive) variation” refers to relatively stable substantive differences in belief. “Sociolinguistic (or contextual) variation” usually refers to transient stylistic differences in speech. In this case, speakers share a model of what their choices of register say about themselves and make different choices of self-representation in different social contexts. For example, they may choose to show solidarity with other members of their social group in one setting and compete for status in another. “Individual differences” usually refers to differences in task performance attributable to intrinsic differences in the way individuals process information. These differences, though sometimes produced by training, are often interpreted as (biologically based) variation in intelligence, temperament, or cognitive style.

Various patterns of intracultural variation have been proposed. The simplest pattern is one implicit in most classic ethnography and often incorporated into the concept of culture itself: Individual members of a cultural group share knowledge and beliefs with other members of the group. This assumption of within-group uniformity is often coupled with an assumption of between-group divergence. For a classic review of the culture concept, see Kroeber and Kluckhohn (1952).

Wallace (1961), building on Sapir’s (e.g., 1938) and Hallowell’s (e.g., 1955) emphasis on the uniqueness of individuals, argued against this uniformitarian view of culture and asserted that cognitive non-sharing is a “functional prerequisite of society.” He identified six possible patterns of the organization of cognitive diversity:

1. Zero diversity (high concordance)
2. Unorganized diversity (random differences or idiosyncrasy)
3. Ad hoc communication (enhanced agreement among individuals engaged in the same task)
4. Inclusion (systematic differences between experts and novices)
5. End linkage (systematic differences between experts engaged in complementary tasks)
6. Administration (systematic differences between managers and subordinates executing sub-plans)

The first two patterns were regarded as logical extremes, the latter four as ways of accepting and organizing cognitive diversity.

Subsequent authors have emphasized one or another of these patterns. Roberts studied high concordance codes (pattern 1) for color, kin, and clothing, among other domains. He argued that “such codes merit the heavy cultural investment made in them, for they aid rapid and accurate communication” (Roberts 1987: 267). D’Andrade (1976) suggested that most cultural beliefs were either generally shared or were idiosyncratic (patterns 1 and 2), using as an example the distribution of disease beliefs in the United States and Mexico. He later (1981) suggested that the division of labor in society would augment the total cultural information pool from two to four orders of magnitude beyond what an individual knows (pattern 5; cf. Gatewood 1983). Gardner (1976) describes Dene bird classification as a case in which cultural norms are absent and most knowledge is unique to the individual (pattern 2). In contrast, Boster describes Aguaruna manioc identification as a case in which there is a single cultural model known to varying degrees by different informants and in which “deviations from the model are patterned according to the sexual division of labor, membership in kin and residential groups, and individual expertise” (1985: 193; patterns 3, 4, and 5).

There appear to be many similar instances in which one can infer the knowledge of individuals from their degree of agreement with others (e.g., Boster 1985, 1991; D’Andrade 1987; Garro 1986; Gatewood 1984; Romney, Weller, and Batchelder 1986; Weller 1984; see CULTURAL CONSENSUS THEORY). In these instances, individuals who give the model responses are more likely to be reliable on retest (Boster 1985), consistent (Weller 1984), and experienced with the domain (Boster 1985; Gatewood 1984; Garro 1986; Weller 1984). This pattern holds even in cases, such as a word association task, in which there are no culturally normative responses (D’Andrade 1987). However, there are some cases in which domain novices agree with each other more than do experts (e.g., similarity judgment of fish; Boster and Johnson 1989). The exceptions are often instances in which domain novices can generate consistent responses with a simple heuristic. It is important to ensure that any task used to assess cultural knowledge be representative of natural uses of domain knowledge and have ECOLOGICAL VALIDITY.

Just as authors have differed in the patterns of intracultural variation they emphasize, they differ in their description of the processes that generate those patterns. For Wallace, cognitive diversity mainly reflects the division of labor: different patterns emerge depending on how tasks are divided among individuals (cf. Durkheim 1933).

Roberts (1964) developed a view of cultures, similar to Wallace’s, as “information economies” that create, distribute, and use information. He showed how aspects of social organization affect how cultural groups as a whole store and retrieve information. Elsewhere, he demonstrated how explicit cultural models of error are used to evaluate and correct mistakes, in trapshooting, tavern pool playing, and flying. See Roberts (1987) for a review.

Boster (1991) extended Roberts’s model of culture as an information economy. He proposed that patterns of intracultural variation reflect the “quality, quantity, and distribution of individuals’ opportunities to learn” (1991: 204). He argues that domains observable by direct inspection (e.g., FOLK BIOLOGY) or introspection (e.g., COLOR CLASSIFICATION) give individuals equal and ample opportunities to learn regardless of their cultural background. These properties give rise to high cross-cultural and intracultural agreement. In contrast, domains that can only be learned from others (e.g., mythologies) are likely to be highly variable both within and between societies, and have a distribution that reflects the social communication network.

Hutchins (1995), like Roberts, sees whole groups as computational engines. But for Hutchins, cognition is distributed not just among humans but also among artifacts such as navigation charts and compasses, for they serve to store, transform, and transmit information, just as do the humans who use them.

See also COGNITIVE ARTIFACTS; CULTURAL SYMBOLISM; FOLK PSYCHOLOGY; LANGUAGE AND CULTURE; LINGUISTIC RELATIVITY HYPOTHESIS

—James Boster

References

Boster, J. S. (1985). Requiem for the omniscient informant: there’s life in the old girl yet. In J. Dougherty, Ed., Directions in Cognitive Anthropology. Urbana: University of Illinois Press, pp. 177–197.
Boster, J. S. (1991). The information economy model applied to biological similarity judgment. In L. Resnick, J. Levine, and S. Teasley, Eds., Perspectives on Socially Shared Cognition. Washington, DC: American Psychological Association, pp. 203–235.
Boster, J. S., and J. C. Johnson. (1989). Form or function: a comparison of expert and novice judgments of similarity among fish. American Anthropologist 91(4): 866–889.
D’Andrade, R. G. (1976). A propositional analysis of U.S. American beliefs about illness. In K. H. Basso and H. A. Selby, Eds., Meaning in Anthropology. Albuquerque: University of New Mexico Press, pp. 155–180.
D’Andrade, R. G. (1981). The cultural part of cognition. Cognitive Science 5: 179–195.
D’Andrade, R. G. (1987). Modal responses and cultural expertise. American Behavioral Scientist 31(2): 266–279.
Durkheim, E. (1933). Division of Labor in Society. New York: Macmillan.
Gardner, P. (1976). Birds, words, and a requiem for the omniscient informant. American Ethnologist 3: 446–468.
Garro, L. (1986). Intracultural variation in folk medical knowledge: a comparison of curers and non-curers. American Anthropologist 88(2): 351–370.
Garro, L. (1988). Explaining high blood pressure: variation in knowledge about illness. American Ethnologist 15: 98–119.
Gatewood, J. B. (1983). Loose talk: linguistic competence and recognition ability. American Anthropologist 85(2): 378–387.
Gatewood, J. B. (1984). Familiarity, vocabulary size, and recognition ability in four semantic domains. American Ethnologist 11(3): 507–527.
Hallowell, A. I. (1955). Culture and Experience. Philadelphia: University of Pennsylvania Press.
Hutchins, E. (1995). Cognition in the Wild. Cambridge, MA: MIT Press.
Kempton, W., J. S. Boster, and J. A. Hartley. (1995). Environmental Values in American Culture. Cambridge, MA: MIT Press.
Kroeber, A. L., and C. Kluckhohn. (1952). Culture: a critical review of concepts and definitions. Papers of the Peabody Museum of American Archaeology and Ethnology, vol. 47. Cambridge, MA: Harvard University.
Roberts, J. (1964). The self management of cultures. In W. Goodenough, Ed., Explorations in Cultural Anthropology: Essays in Honor of George Peter Murdock. New York: McGraw-Hill, pp. 433–454.
Roberts, J. (1987). Within culture variation. American Behavioral Scientist 31(2): 266–279.
Romney, A. K., S. C. Weller, and W. H. Batchelder. (1986). Culture as consensus: a theory of culture and informant accuracy. American Anthropologist 88: 313–338.
Sapir, E. (1938). Why anthropology needs the psychiatrist. Psychiatry 1: 7–12.
Wallace, A. (1961). Culture and Personality. New York: Random House.
Weller, S. C. (1984). Consistency and consensus among informants: disease concepts in a rural Mexican town. American Anthropologist 86(4): 966–975.

Culture

See INTRODUCTION: CULTURE, COGNITION, AND EVOLUTION

Culture and Language

See CREOLES; LANGUAGE AND CULTURE; LANGUAGE VARIATION AND CHANGE; PARAMETER-SETTING APPROACHES TO ACQUISITION, CREOLIZATION, AND DIACHRONY

Culture and Metaphor

See METAPHOR AND CULTURE

Culture and Representations of Self

See METAPHOR AND CULTURE; MOTIVATION AND CULTURE; SELF

Darwin, Charles

Charles Darwin (1809–1882) formulated the most important biological theory of the last century and a half: his theory of EVOLUTION by natural selection. By explaining that “mystery of mysteries,” the origin of species, Darwin overturned long-entrenched biological and religious assumptions. He applied his general theory to the human animal and thereby rendered an account of moral behavior and rational mind that has formed the foundation for many complementary theories today.

Charles Darwin was born on February 12, 1809, the son of Robert Waring Darwin, a Shrewsbury physician, and Susannah Wedgwood Darwin, daughter of Josiah Wedgwood, who founded the famous pottery firm. When he was sixteen, Darwin went to Edinburgh medical school, following in the shadows of his famous grandfather Erasmus Darwin, and of his father and older brother. At Edinburgh, he came into contact with Robert Grant, who helped him cultivate the study of invertebrates and introduced him to the evolutionary works of his own grandfather and of Lamarck. After Darwin left Edinburgh without a degree, his father, greatly disappointed, sent him to Cambridge to become a country parson. He spent most of his time at university in the pursuits of a gentleman, with some added beetle collecting. A teacher and friend, John Henslow, nevertheless detected a spark in the young man and recommended him to serve as naturalist on a vessel that would sail around the world charting the seas for British naval and commercial craft.

Under the command of the twenty-six-year-old Robert FitzRoy, H.M.S. Beagle sailed from Falmouth Harbor on December 29, 1831, and reached the coast of South America two months later. While on board, Darwin occupied himself with reading Alexander von Humboldt’s Personal Narrative of Travels and steering clear of FitzRoy’s foul moods. The Beagle charted the waters along the east and west coasts of South America, the Pacific islands, and Australia. Darwin traveled into the interior of these lands to record geological information, as well as to collect fossils and animal specimens to be shipped back to London for careful description and cataloguing. The ship docked at Falmouth on October 4, 1836, almost five years after it had departed. During the voyage Darwin seems not to have seriously considered the possibility that species had transmuted, though he may have had some suspicions. Only in March 1837, as he tried to make sense of the morphology of mockingbirds collected on the Galápagos Islands, did his biological orthodoxy begin to crumble.

During the spring and summer of 1837, Darwin became gradually committed to the idea that species had been transformed over time, and he started to develop hypotheses concerning the causes of change. Initially he supposed that the direct impact of the environment and inherited habit had altered species’ forms, notions he retained in his later theorizing. He thought that innate behavior, instincts, also underwent transformations through time, being first acquired as habits. On September 28, 1838, Darwin read Malthus’s Essay on the Principle of Population, which allowed him to formulate, in the words of his Autobiography, “a theory by which to work,” his theory of natural selection.

Darwin did not wish to exempt human beings from the evolutionary process. During the late 1830s and early 1840s, he devised theories of the evolution of mind and conscience. Influenced by the empiricism of David HUME and his grandfather, Darwin regarded intelligence as a generalizing and loosening of the cerebral structures that underlay instinct. Human reason, he believed, gradually emerged out of instincts, which themselves derived from inherited habits and selection operating on such habits. From late 1838 to early 1840, in a set of notebooks (“M” and “N”) and in loose notes, he worked out a theory of conscience, which would be elaborated thirty years later in the Descent of Man (1871).

Darwin continued to work on his basic theory of evolution through the 1840s and into the early 1850s, simultaneously undertaking the time-consuming labor that produced four large volumes on barnacles. In 1856, after prodding by his friend the eminent geologist Charles Lyell, Darwin began to compose a volume that would detail his theory. In mid-June 1858, he received from Alfred Russel Wallace, then in Malaya, a letter describing a theory of species origin that was nearly identical to his own. Darwin thought his originality had now vanished under a veil of honor. It took Lyell and others of Darwin’s friends to convince him that he should continue working on his book, which he did, though in abbreviated form. On the Origin of Species by Means of Natural Selection was published in November 1859 and sold out within a few weeks. During Darwin’s lifetime, the Origin went through six editions, each incorporating alterations and responses to critics. With the last edition, the book had changed by some 50 percent.

The Origin had barely mentioned humankind. Critics, however, immediately understood the theory’s implications, and most of their objections focused on the problem of human evolution. In 1870, when Wallace seemed to have excluded human beings from the natural process of species change, Darwin felt compelled to reveal his full conception. The Descent of Man and Selection in Relation to Sex, appearing in 1871, made his theories of the evolution of mind and conscience quite explicit. Mind reached its human form under the aegis of natural selection and language, the latter producing heritable modifications in brain patterns. Darwin argued that human moral instincts would be acquired through community selection, inasmuch as self-sacrificing behavior would do the agent little good but would benefit the clan, which would include many relatives of the agent. In competition among clans, those whose members exercised more altruistic instincts would have the advantage; and so moral conscience would gradually increase in humankind. Because the pricks of such conscience would little directly benefit the moral individual, Darwin thought his theory quite different from those that were based in utilitarian selfishness, the philosophical ground for many comparable theories today in SOCIOBIOLOGY. Darwin had intended to discuss thoroughly the EMOTIONS in the Descent, but saved his theories of emotional instinct for his Expression of the Emotions in Man and Animals (1872). Though Konrad Lorenz regarded this book as the foundational document for ETHOLOGY, Darwin had explained emotional instincts solely through the inheritance of acquired habit. And so the Darwinian shade that yet hovers over current biology does bear but passing resemblance to the man who lived in the last century.

See also ADAPTATION AND ADAPTATIONISM; ALTRUISM; EVOLUTIONARY PSYCHOLOGY

—Robert J. Richards

References

Darwin, C. (1969). The Autobiography of Charles Darwin. N. Barlow, Ed. New York: Norton.
Darwin, C. (1987). Charles Darwin’s Notebooks, 1836–1844. P. Barrett et al., Eds. Ithaca: Cornell University Press.
Darwin, C. (1985–). The Correspondence of Charles Darwin. 9 vols. to date. Cambridge: Cambridge University Press.
Darwin, C. (1871). The Descent of Man and Selection in Relation to Sex. 2 vols. London: Murray.
Darwin, C. (1872). The Expression of the Emotions in Man and Animals. London: Murray.
Darwin, C. (1839). Journal of Researches into the Geology and Natural History of the Various Countries Visited by H.M.S. Beagle. London: Henry Colburn.
Darwin, C. (1854). A Monograph of the Fossil Balanidae and Verrucidae of Great Britain. London: Palaeontological Society.
Darwin, C. (1851). A Monograph of the Fossil Lepadidae or, Pedunculated Cirripedes of Great Britain. London: Ray Society.
Darwin, C. (1851). A Monograph of the Sub-Class Cirripedia. The Balanidae (or Sessile Cirripedes), the Verrucidae, &c. London: Ray Society.
Darwin, C. (1851). A Monograph of the Sub-Class Cirripedia, with Figures of all the Species. The Lepadidae or, Pedunculated Cirripedes. London: Ray Society.
Darwin, C. (1859). On the Origin of Species by Means of Natural Selection. London: Murray.
Humboldt, A., and A. Bonpland. (1818–1829). Personal Narrative of Travels to the Equinoctial Regions of the New Continent, during the Years 1799–1804. 7 vols. London: Longman, Hurst, Rees, Orme, and Brown.
Malthus, T. (1826). An Essay on the Principle of Population. 6th ed., 2 vols. London: Murray.

Further Reading

Bowler, P. (1984). Evolution: The History of an Idea. Berkeley: University of California Press.
Browne, J. (1995). Charles Darwin: Voyaging. New York: Alfred Knopf.
Dennett, D. (1995). Darwin’s Dangerous Idea. New York: Simon and Schuster.
Desmond, A. (1989). The Politics of Evolution.

Decision Making

Decision making is the process of choosing a preferred option or course of action from among a set of alternatives. Decision making permeates all aspects of life, including decisions about what to buy, whom to vote for, or what job to take. Decisions often involve uncertainty about the external world (e.g., What will the weather be like?), as well as conflict regarding one’s own preferences (e.g., Should I opt for a higher salary or for more leisure?). The decision-making process often begins at the information-gathering stage and proceeds through likelihood estimation and deliberation, until the final act of choosing.

The study of decision making is an interdisciplinary enterprise involving economics, political science, sociology, psychology, statistics, and philosophy. Decisions are made by individuals and by groups. Important results have been obtained both in the theoretical and experimental study of group decision making. With an eye toward the cognitive sciences, this article focuses on the empirical study of decision making at the individual level. (The focus is on choice behavior; for more on judgment, see JUDGMENT HEURISTICS.)

One can distinguish three approaches to the analysis of decision making: normative, descriptive, and prescriptive. The normative approach assumes a rational decision-maker who has well-defined preferences that obey certain axioms of rational behavior. This conception, known as RATIONAL CHOICE THEORY, is based primarily on a priori considerations rather than on empirical observation. The descriptive approach to decision making is based on empirical observation and on experimental studies of choice behavior. It is concerned primarily with the psychological factors that guide behavior in decision-making situations. Experimental evidence indicates that people’s choices are often at odds with the normative assumptions of the rational theory. In light of this, the prescriptive enterprise focuses on methods of improving decision making, bringing it more in line with normative
Chicago: Univer- desiderata (see, e.g., von Winterfeld and Edwards 1986). sity of Chicago Press. In some decision contexts, the availability of the chosen Glass, B., Ed. (1968). Forerunners of Darwin. Baltimore: Johns option is essentially certain (as when choosing among Hopkins University Press. dishes from a menu, or cars at a dealer’s lot). Other deci- Hull, D. (1973). Darwin and His Critics. Cambridge, MA: Harvard sions are made under UNCERTAINTY: they can be “risky,” University Press. where the probabilities of the outcomes are known (e.g., Kohn, D., Ed. (1985). The Darwinian Heritage. Princeton: Prince- gambling or insurance), or they can be “ambiguous,” as are ton University Press. most real world decisions, in that their precise likelihood is Mayr, E. (1991). One Long Argument: Charles Darwin and the not known and needs to be judged “subjectively” by the Genesis of Modern Evolutionary Thought. Cambridge, MA: decision maker. When making decisions under uncertainty, Harvard University Press. Ospovat, D. (1981). The Development of Darwin’s Theory. Cam- a person has to consider both the desirability of the potential bridge, MA: Harvard University Press. outcomes and their probability of occurrence (see PROBABI- Richards, R. (1987). Darwin and the Emergence of Evolutionary LISTIC REASONING). Indeed, part of the study of decision- Theories of Mind and Behavior. Chicago: University of Chi- making concerns the manner in which these factors are cago Press. combined. Ruse, M. (1996). Monad to Man: The Concept of Progress in Evo- Presented with a choice between a risky prospect that lutionary Biology. Cambridge, MA: Harvard University Press. offers a 50 percent chance to win $200 (and a 50 percent chance to win nothing) and an alternative of receiving $100 Data Mining for sure, most people prefer the sure gain over the gamble, although the two prospects have the same expected value. 
(The expected value is the sum of possible outcomes weighted by their probability of occurrence. The expected value of the gamble above is .50 × $200 + .50 × 0 = $100.) Preference for a sure outcome over a risky prospect of equal expected value is called risk aversion; indeed, people tend to be risk averse when choosing between prospects with positive outcomes. The tendency towards risk aversion can be explained by the notion of diminishing sensitivity, first formalized by Daniel Bernoulli (1738), who thought that “the utility resulting from a fixed small increase in wealth will be inversely proportional to the quantity of goods previously possessed.” Bernoulli proposed that people have a concave utility function that captures their subjective value for money, and that preferences should be described using expected utility instead of expected value (a function is concave if a line joining two points on the curve lies below it). According to expected utility, the worth of a gamble offering a 50 percent chance to win $200 (and 50 percent chance to win nothing) is .50 × u($200), where u is the person’s utility function (u(0) = 0). It follows from a concave function that the subjective value attached to a gain of $100 is more than 50 percent of the value attached to a gain of $200, which entails preference for the sure $100 gain and, hence, risk aversion.

Expected UTILITY THEORY and the assumption of risk aversion play a central role in the standard economic analysis of choice between risky prospects. In fact, a precipitating event for the empirical study of decision making came from economics, with the publication of von Neumann and Morgenstern’s (1947) normative treatment of expected utility, in which a few compelling axioms, when satisfied, were then shown to imply that a person’s choices can be thought of as favoring the alternative with the highest subjective expected utility. The normative theory was introduced to psychologists in the late 1950s and early 1960s (Edwards 1961; Luce and Raiffa 1957), and has generated extensive research into “behavioral decision theory.” Because the normative treatment specifies simple and compelling principles of rational behavior, it has since served as a benchmark against which behavioral studies of decision making are compared. Research over the last four decades has gained important insights into the decision-making process, and has documented systematic ways in which decision-making behavior departs from the normative benchmark (see ECONOMICS AND COGNITIVE SCIENCE).

When asked to choose between a prospect that offers a 50 percent chance to lose $200 (and a 50 percent chance at nothing) and the alternative of losing $100 for sure, most people prefer to take an even chance at losing $200 or nothing over a sure $100 loss. This is because diminishing sensitivity applies to negative as well as to positive outcomes: the impact of an initial $100 loss is greater than that of an additional $100. The worth of a gamble that offers a 50 percent chance to lose $200 is thus greater (i.e., less negative) than that of a sure $100 loss; .50 × u(−$200) > u(−$100). This results in a convex function for losses and a risk-seeking preference for the gamble over a sure loss. Preference for a risky prospect over a sure outcome of equal expected value is called risk seeking. With the exception of prospects that involve very small probabilities, risk aversion is generally observed in choices involving gains, whereas risk seeking tends to hold in choices involving losses.

An S-shaped value function, based on these attitudes toward risk, forms part of prospect theory (Kahneman and TVERSKY 1979), an influential descriptive theory of choice. The value function of prospect theory has three important properties: (1) it is defined on gains and losses rather than total wealth, which captures the fact that people normally treat outcomes as departures from a current reference point, rather than in terms of final assets, as posited by the rational theory of choice; (2) it is steeper for losses than for gains: thus, a loss of $X is more aversive than a gain of $X is attractive. The fact that losses loom larger than corresponding gains is known as loss aversion; and (3) it is concave for gains and convex for losses, which yields the risk attitudes described above: risk aversion in the domain of gains and risk seeking in the domain of losses.

These attitudes seem compelling and unobjectionable, yet their combination can lead to normatively problematic consequences. In one example (Tversky and Kahneman 1986), respondents are asked to assume themselves to be $300 richer and are then asked to choose between a sure gain of $100 or an equal chance to win $200 or nothing. Alternatively, they are asked to assume themselves to be $500 richer, and made to choose between a sure loss of $100 and an equal chance to lose $200 or nothing. In accord with the properties described above, most subjects choosing between gains are risk averse and prefer the certain $100 gain, whereas most subjects choosing between losses are risk seeking, preferring the risky prospect over the sure $100 loss. The two problems, however, are essentially identical: when the initial $300 or $500 payment is added to the respective outcomes, both problems amount to a choice between $400 for sure as opposed to an even chance at $300 or $500. This is known as a framing effect. It occurs when alternative framings of what is essentially the same decision problem give rise to predictably different choices. The way a problem is described (in terms of gains or losses) can trigger conflicting risk attitudes; similarly, different methods of eliciting preference (for example, through choice versus independent evaluation) can lead people to weigh certain aspects of the options differently. This leads to violations of the normative requirements of “description invariance” and “procedure invariance,” according to which logically equivalent representations of a decision problem as well as logically equivalent methods of elicitation should yield the same preferences.

Monetary gambles have traditionally served as metaphors for uncertain awards and as frequent stimuli in decision research. However, much attention has also been devoted to choices between nonmonetary awards, which tend to be multidimensional in nature: for example, the need to choose between job options that differ in prestige, salary, and rank, or between apartments that differ in size, price, aesthetic attractiveness, and location. People are typically uncertain about how much weight to assign to the different dimensions of options; such assignments are often contingent on relatively immaterial changes in the task, the description, and the nature of the options under consideration (Hsee 1998; Tversky, Sattath, and Slovic 1988).

The study of decision-making incorporates issues of PLANNING, PROBLEM SOLVING, PSYCHOPHYSICS, MEMORY, and SOCIAL COGNITION, among others. Behavioral studies of decision-making have included process-tracing methods, such as verbal protocols (Ericsson and Simon 1984), information-acquisition sequences (Payne, Bettman, and Johnson 1993), and eye-movement data (Russo and Dosher 1983). Decision patterns involving hypothetical problems have been replicated in real settings and with experts, for example, physicians’ decisions regarding patient treatment (McNeil et al. 1982; Redelmeier and Shafir 1995), academics’ retirement investment decisions (Benartzi and Thaler 1995), and taxi drivers’ allocation of working hours (Camerer et al. 1997). Some replications have provided substantial as opposed to minor payoffs (e.g., the equivalent of a month’s salary paid to respondents in the People’s Republic of China; Kachelmeier and Shehata 1992).

People’s choices are influenced by various aspects of the decision-making situation. Among these are the conflict or difficulty that characterize a decision (March 1978; Tversky and Shafir 1992), the regret anticipated in cases where another option would have been better (Bell 1982), the role that reasons play in justifying one choice over another (Shafir, Simonson, and Tversky 1993), the attachment that is felt for options already in one’s possession (Kahneman, Knetsch, and Thaler 1990), the influence exerted by costs already suffered (Arkes and Blumer 1985), the effects of temporal separation on future decisions (Loewenstein and Elster 1992), and the occasional inability to predict future or remember past satisfaction (Kahneman 1994). Research in decision-making has uncovered psychological principles that account for empirical findings that are counterintuitive and incompatible with normative analyses. People do not always have well-ordered preferences: instead, they approach decisions as problems that need to be solved, and construct preferences that are heavily influenced by the nature and the context of decision.

See also BOUNDED RATIONALITY; CAUSAL REASONING; DEDUCTIVE REASONING; RATIONAL DECISION MAKING; TVERSKY

-- Eldar Shafir

References

Arkes, H. R., and C. Blumer. (1985). The psychology of sunk cost. Organizational Behavior and Human Performance 35: 129–140.
Bell, D. E. (1982). Regret in decision making under uncertainty. Operations Research 30: 961–981.
Benartzi, S., and R. Thaler. (1995). Myopic loss aversion and the equity premium puzzle. Quarterly Journal of Economics 110 (1): 73–92.
Camerer, C., L. Babcock, G. Loewenstein, and R. Thaler. (1997). A target income theory of labor supply: Evidence from cab drivers. Quarterly Journal of Economics 112 (2).
Edwards, W. (1961). Behavioral decision theory. Annual Review of Psychology 12: 473–498.
Ericsson, K. A., and H. A. Simon. (1984). Protocol Analysis: Verbal Reports as Data. Cambridge, MA: MIT Press.
Hsee, C. K. (1998). The evaluability hypothesis: Explaining joint-separate evaluation preference reversal and beyond. In D. Kahneman and A. Tversky, Eds., Choices, Values and Frames. Cambridge: Cambridge University Press.
Kachelmeier, S. J., and M. Shehata. (1992). Examining risk preferences under high monetary incentives: Experimental evidence from the People’s Republic of China. American Economic Review 82: 1120–1141.
Kahneman, D. (1994). New challenges to the rationality assumption. Journal of Institutional and Theoretical Economics 150 (1): 18–36.
Kahneman, D., J. L. Knetsch, and R. Thaler. (1990). Experimental tests of the endowment effect and the Coase theorem. Journal of Political Economy 98 (6): 1325–1348.
Kahneman, D., and A. Tversky. (1979). Prospect theory: An analysis of decision under risk. Econometrica 47: 263–291.
Loewenstein, G., and J. Elster, Eds. (1992). Choice Over Time. New York: Russell Sage Foundation.
Luce, R. D., and H. Raiffa. (1957). Games and Decisions. New York: Wiley.
March, J. (1978). Bounded rationality, ambiguity and the engineering of choice. Bell Journal of Economics 9 (2): 587–608.
McNeil, B. J., S. G. Pauker, H. C. Sox, and A. Tversky. (1982). On the elicitation of preferences for alternative therapies. New England Journal of Medicine 306: 1259–1262.
Payne, J. W., J. R. Bettman, and E. J. Johnson. (1993). The Adaptive Decision Maker. Cambridge: Cambridge University Press.
Redelmeier, D., and E. Shafir. (1995). Medical decision making in situations that offer multiple alternatives. Journal of the American Medical Association 273 (4): 302–305.
Russo, J. E., and B. A. Dosher. (1983). Strategies for multiattribute binary choice. Journal of Experimental Psychology: Learning, Memory, and Cognition 9 (4): 676–696.
Shafir, E., I. Simonson, and A. Tversky. (1993). Reason-based choice. Cognition 49 (2): 11–36.
Tversky, A., and D. Kahneman. (1986). Rational choice and the framing of decisions. Journal of Business 59 (4, pt. 2): 251–278.
Tversky, A., S. Sattath, and P. Slovic. (1988). Contingent weighting in judgment and choice. Psychological Review 95 (3): 371–384.
Tversky, A., and E. Shafir. (1992). Choice under conflict: The dynamics of deferred decision. Psychological Science 3 (6): 358–361.
von Neumann, J., and O. Morgenstern. (1947). Theory of Games and Economic Behavior. 2nd ed. Princeton: Princeton University Press.
von Winterfeld, D., and W. Edwards. (1986). Decision Analysis and Behavioral Research. Cambridge: Cambridge University Press.

Data Mining: See KNOWLEDGE ACQUISITION; MACHINE LEARNING

Decision Theory

See RATIONAL AGENCY; RATIONAL CHOICE THEORY; RATIONAL DECISION MAKING; UTILITY THEORY

Decision Trees

A decision tree is a graphical representation of a procedure for classifying or evaluating an item of interest. For example, given a patient’s symptoms, a decision tree could be used to determine the patient’s likely diagnosis, or outcome, or recommended treatment. Figure 1 shows a decision tree for forecasting whether a patient will die from hepatitis, based on data from the University of California at Irvine repository (Murphy and Aha 1994). A decision tree represents a function that maps each element of its domain to an element of its range, which is typically a class label or numerical value. At each leaf of a decision tree, one finds an element of the range. At each internal node of the tree, one finds a test that has a small number of possible outcomes. By branching according to the outcome of each test, one arrives at a leaf that contains the class label or numerical value that corresponds to the item in hand. In the figure, each leaf shows the number of examples of each class that fall to that leaf. These leaves are usually not of one class, so one typically chooses the most frequently occurring class label.

[Figure 1. Decision tree for forecasting death from hepatitis; figure not reproduced.]

A decision tree with a range of discrete (symbolic) class labels is called a classification tree, whereas a decision tree with a range of continuous (numeric) values is called a regression tree. A domain element is called an instance or an example or a case, or sometimes by another name appropriate to the context. An instance is represented as a conjunction of variable values. Each variable has its own domain of possible values, typically discrete or continuous. The space of all possible instances is defined by the set of possible instances that one could generate using these variables and their possible values (the cross product).

Decision trees are attractive because they show clearly how to reach a decision, and because they are easy to construct automatically from labeled instances. Two well known programs for constructing decision trees are C4.5 (Quinlan 1993) and CART (Classification and Regression Tree) (Breiman et al. 1984). The tree shown in figure 1 was generated by the ITI (Incremental Tree Inducer) program (Utgoff, Berkman, and Clouse 1997). These programs usually make quick work of training data, constructing a tree in a matter of a few seconds to a few minutes. For those who prefer to see a list of rules, there is a simple conversion, which is available in the C4.5 program. For each leaf of the tree, place its label in the right-hand side of a rule. In the left-hand side, place the conjunction of all the conditions that would need to be true to reach that leaf from the root.

Decision trees are useful for automating decision processes that are part of an application program. For example, for the optical character recognition (OCR) task, one needs to map the optical representation of a symbol to a symbol name. The optical representation might be a grid of pixel values. The tree could attempt to map these pixel values to a symbol name. Alternatively, the designer of the system might include the computation of additional variables, also called features, that make the mapping process simpler.

Decision trees are used in a large number of applications, and the number continues to grow as practitioners gain experience in using trees to model decision making processes. Present applications include various pixel classification tasks, language understanding tasks such as pronoun resolution, fault diagnosis, control decisions in search, and numerical function approximation.

A decision tree is typically constructed recursively in a top-down manner (Friedman 1977; Quinlan 1986). If a set of labeled instances is sufficiently pure, then the tree is a leaf, with the assigned label being that of the most frequently occurring class in that set. Otherwise, a test is constructed and placed into an internal node that constitutes the tree so far. The test defines a partition of the instances according to the outcome of the test as applied to each instance. A branch is created for each block of the partition, and for each block, a decision tree is constructed recursively.

One needs to define when a set of instances is to be considered sufficiently pure to constitute a leaf. One choice would be to require absolute purity, meaning that all the instances must be of the same class. Another choice would be to require that the class distribution be significantly lopsided, which is a less stringent form of the complete lopsidedness that one gets when the leaf is pure. This is also known as prepruning because one restricts the growth of the tree before it occurs.

One also needs a method for constructing and selecting a test to place at an internal node. If the test is to be based on just one variable, called a univariate test, then one needs to be able to enumerate possible tests based on that variable. If the variable is discrete, then the possible outcomes could be the possible values of that variable. Alternatively, a test could ask whether the variable has a particular value, making just two possible outcomes, as is the case in figure 1. If the variable is continuous, then some form of discretization needs to be done, so that only a manageable number of outcomes is possible. One can accomplish this by searching for a cutpoint, and then forming a test whether the variable value is less than the cutpoint, as shown in figure 1.

If the test is to be based on more than one variable, called a multivariate test, then one needs to be able to search quickly for a suitable test. This is often done by mapping the discrete variables to continuous variables, and then finding a good linear combination of those variables. A univariate test is also known as an axis-parallel split because in a geometric view of the instance space, the partition formed by a univariate test is parallel to the axes of the other variables. A multivariate test is also known as an oblique split because it need not have any particular characteristic relationship to the axes (Murthy, Kasif, and Salzberg 1994).

One must choose the best test from among those that are allowed at an internal node. This is typically done in a greedy manner by ranking the tests according to a heuristic function, and picking the test that is ranked best. Many heuristic tests have been suggested, and this problem is still being studied. For classification trees, most are based on entropy minimization. By picking a test that maximizes the purity of the blocks, one will probably obtain a smaller tree than otherwise, and researchers and practitioners alike have a longstanding preference for smaller trees. Popular heuristic functions include information gain, gain ratio, GINI, and Kolmogorov-Smirnov distance. For regression trees, most tests are based on variance minimization. A test that minimizes the variance within the resulting blocks will also tend to produce a smaller tree than one would obtain otherwise.

It is quite possible that a tree will overfit the data. The tree may have more structure than is helpful because it is attempting to produce several purer blocks where one less pure block would result in higher accuracy on unlabeled instances (instances not used in training). This can come about due to inaccurate variable measurements or inaccurate label or value assignments. A host of postpruning methods are available that reduce the size of the tree after it has been grown. A simple method is to set aside some of the training instances, called the pruning set, before building the tree. Then after the tree has been built, do a postorder traversal of the tree, reducing each subtree to a leaf if the proposed leaf would not be significantly less accurate on the pruning set than the subtree it would replace. This issue of balancing the desire for purity with the desire for accuracy is also called the bias-variance trade-off. A smaller tree has higher bias because the partition is coarser, but lower variance because the leaves are each based on more training instances.

During the mid-1990s, researchers developed methods for using ensembles of decision trees to improve accuracy (Dietterich and Bakiri 1995; Kong and Dietterich 1995; Breiman 1996). To the extent that different decision trees for the same task make independent errors, a vote of the set of decision trees can correct the errors of the individual trees.

See also DECISION MAKING; GREEDY LOCAL SEARCH; HEURISTIC SEARCH; RATIONAL DECISION MAKING

-- Paul Utgoff

References

Breiman, L. (1996). Bagging predictors. Machine Learning 24: 123–140.
Breiman, L., J. H. Friedman, R. A. Olshen, and C. J. Stone. (1984). Classification and Regression Trees. Belmont, CA: Wadsworth International Group.
Dietterich, T. G., and G. Bakiri. (1995). Solving multiclass learning problems via error-correcting output codes. Journal of Artificial Intelligence Research 2: 263–286.
Friedman, J. H. (1977). A recursive partitioning decision rule for nonparametric classification. IEEE Transactions on Computers C-26: 404–408.
Kong, E. B., and T. G. Dietterich. (1995). Error-correcting output coding corrects bias and variance. Machine Learning: Proceedings of the Twelfth International Conference. Tahoe City, CA: Morgan Kaufmann, pp. 313–321.
Murphy, P. M., and D. W. Aha. (1994). UCI Repository of Machine Learning Databases. Irvine, CA: University of California, Department of Information and Computer Science.
Murthy, S. K., S. Kasif, and S. Salzberg. (1994). A system for induction of oblique decision trees. Journal of Artificial Intelligence Research 2: 1–32.
Quinlan, J. R. (1986). Induction of decision trees. Machine Learning 1: 81–106.
Quinlan, J. R. (1993). C4.5: Programs for Machine Learning. San Mateo, CA: Morgan Kaufmann.
Utgoff, P. E., N. C. Berkman, and J. A. Clouse. (1997). Decision tree induction based on efficient tree restructuring. Machine Learning 29: 5–44.
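The expected-utility account of risk aversion (a 50 percent chance at $200 versus a sure $100) can be checked numerically. The following Python sketch is illustrative only: the square-root utility is an assumed example of a concave function with u(0) = 0, and the function names are not taken from any of the cited works.

```python
import math

def expected_value(prospect):
    """Sum of outcomes weighted by their probabilities.

    `prospect` is a list of (probability, outcome) pairs.
    """
    return sum(p * x for p, x in prospect)

def expected_utility(prospect, u):
    """Sum of the utilities of the outcomes weighted by their probabilities."""
    return sum(p * u(x) for p, x in prospect)

def u_sqrt(x):
    """Assumed concave utility for gains, with u(0) = 0."""
    return math.sqrt(x)

gamble = [(0.5, 200.0), (0.5, 0.0)]   # 50% chance of $200, 50% chance of nothing
sure_thing = [(1.0, 100.0)]           # $100 for certain

ev_gamble = expected_value(gamble)
ev_sure = expected_value(sure_thing)
eu_gamble = expected_utility(gamble, u_sqrt)
eu_sure = expected_utility(sure_thing, u_sqrt)

print("expected values:", ev_gamble, ev_sure)
print("sure gain preferred under expected utility:", eu_sure > eu_gamble)
```

Any other strictly concave utility with u(0) = 0 would give the same ordering, because concavity is what makes u($100) exceed half of u($200).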
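The framing example (an endowment of $300 or $500 followed by a choice framed as gains or as losses) can be worked through in the same spirit. In this Python sketch the power-function form of the value function and the parameter values `alpha` and `lam` are assumptions chosen only to give an S-shaped curve that is steeper for losses; probabilities are taken at face value, which is a simplification.

```python
def value(x, alpha=0.88, lam=2.25):
    """S-shaped value function over gains and losses (assumed power form).

    Concave for gains, convex for losses, and steeper for losses (lam > 1).
    """
    if x >= 0:
        return x ** alpha
    return -lam * ((-x) ** alpha)

def prospect_value(prospect):
    """Probability-weighted value of a list of (probability, change) pairs."""
    return sum(p * value(x) for p, x in prospect)

def final_wealth(endowment, prospect):
    """Rewrite a prospect in terms of final assets instead of changes."""
    return sorted((p, endowment + x) for p, x in prospect)

# Gain frame: assume $300 richer, then +$100 for sure or an even chance of +$200 or nothing.
gain_sure, gain_gamble = [(1.0, 100)], [(0.5, 200), (0.5, 0)]
# Loss frame: assume $500 richer, then -$100 for sure or an even chance of -$200 or nothing.
loss_sure, loss_gamble = [(1.0, -100)], [(0.5, -200), (0.5, 0)]

# In terms of final assets the two problems offer the same options.
same_sure = final_wealth(300, gain_sure) == final_wealth(500, loss_sure)
same_gamble = final_wealth(300, gain_gamble) == final_wealth(500, loss_gamble)

# Evaluated as changes from the reference point, the preferred option flips.
risk_averse_for_gains = prospect_value(gain_sure) > prospect_value(gain_gamble)
risk_seeking_for_losses = prospect_value(loss_gamble) > prospect_value(loss_sure)

print(same_sure, same_gamble, risk_averse_for_gains, risk_seeking_for_losses)
```

The sketch separates the two claims in the example: the options coincide once the endowment is added, yet a reference-dependent value function ranks them differently in the two frames.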
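A minimal Python sketch of the two operations described above, classifying an instance by branching from the root to a leaf and converting a tree into one rule per leaf, is given below. The `Leaf` and `Node` classes and the toy tree are hypothetical illustrations; they are not the hepatitis tree of figure 1 or the data structures of C4.5.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple, Union

@dataclass
class Leaf:
    label: str                      # element of the range, here a class label

@dataclass
class Node:
    variable: str                   # variable tested at this internal node
    branches: Dict[str, "Tree"] = field(default_factory=dict)  # outcome -> subtree

Tree = Union[Leaf, Node]

def classify(tree: Tree, instance: Dict[str, str]) -> str:
    """Follow the branch matching each test outcome until a leaf is reached."""
    while isinstance(tree, Node):
        tree = tree.branches[instance[tree.variable]]
    return tree.label

def to_rules(tree: Tree, conditions: Tuple[str, ...] = ()) -> List[str]:
    """One rule per leaf: conjunction of the path conditions, then the leaf label."""
    if isinstance(tree, Leaf):
        lhs = " AND ".join(conditions) if conditions else "TRUE"
        return [f"IF {lhs} THEN {tree.label}"]
    rules: List[str] = []
    for outcome, subtree in tree.branches.items():
        rules += to_rules(subtree, conditions + (f"{tree.variable} = {outcome}",))
    return rules

# Hypothetical toy tree with two discrete variables.
toy_tree = Node("fever", {
    "yes": Node("cough", {"yes": Leaf("flu"), "no": Leaf("other")}),
    "no": Leaf("healthy"),
})

print(classify(toy_tree, {"fever": "yes", "cough": "no"}))
for rule in to_rules(toy_tree):
    print(rule)
```

Each rule's left-hand side is exactly the list of conditions met on the path from the root, so the rule list and the tree classify every instance identically.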
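The recursive top-down construction and the purity criterion can be sketched as follows. This is a simplified illustration rather than the algorithm of any particular program: it handles only discrete variables, represents a tree as nested tuples, and ranks candidate tests by a placeholder misclassification count. The `min_fraction` parameter is an assumed way of expressing a lopsidedness threshold for prepruning.

```python
from collections import Counter

def majority_label(instances):
    """Most frequently occurring class label among (features, label) pairs."""
    return Counter(label for _, label in instances).most_common(1)[0][0]

def is_sufficiently_pure(instances, min_fraction=1.0):
    """True if the majority class covers at least `min_fraction` of the set.

    min_fraction=1.0 demands absolute purity; lower values act as prepruning.
    """
    counts = Counter(label for _, label in instances)
    return counts.most_common(1)[0][1] / len(instances) >= min_fraction

def partition(instances, variable):
    """Split the instances into blocks, one per observed value of `variable`."""
    blocks = {}
    for features, label in instances:
        blocks.setdefault(features[variable], []).append((features, label))
    return blocks

def misclassified(blocks):
    """Placeholder heuristic: instances outside the majority class of their block."""
    return sum(len(b) - Counter(l for _, l in b).most_common(1)[0][1]
               for b in blocks.values())

def build_tree(instances, variables, min_fraction=1.0):
    """Return a label (leaf) or a (variable, {value: subtree}) pair (internal node)."""
    if not variables or is_sufficiently_pure(instances, min_fraction):
        return majority_label(instances)
    best = min(variables, key=lambda v: misclassified(partition(instances, v)))
    remaining = [v for v in variables if v != best]
    return (best, {value: build_tree(block, remaining, min_fraction)
                   for value, block in partition(instances, best).items()})

# Invented training data for illustration.
data = [
    ({"outlook": "sunny", "windy": "no"}, "play"),
    ({"outlook": "sunny", "windy": "yes"}, "play"),
    ({"outlook": "rainy", "windy": "no"}, "play"),
    ({"outlook": "rainy", "windy": "yes"}, "stay"),
]
full_tree = build_tree(data, ["outlook", "windy"])
prepruned_tree = build_tree(data, ["outlook", "windy"], min_fraction=0.75)
print(full_tree)
print(prepruned_tree)
```

With absolute purity required, the tree grows until every leaf holds one class; with the threshold lowered to 0.75, the same data are already lopsided enough at the root and the result is a single leaf.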
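For a concrete view of entropy-based test selection, the Python sketch below computes information gain and searches for a cutpoint on one continuous variable. The data are invented for illustration, and taking midpoints between adjacent distinct values as the candidate cutpoints is an assumption of this sketch.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(labels, blocks):
    """Entropy of the whole set minus the size-weighted entropy of the blocks."""
    total = len(labels)
    remainder = sum(len(b) / total * entropy(b) for b in blocks if b)
    return entropy(labels) - remainder

def best_cutpoint(values, labels):
    """Return (cutpoint, gain) for the test `value < cutpoint` with the highest gain."""
    pairs = sorted(zip(values, labels))
    distinct = sorted(set(values))
    best = (None, -1.0)
    for lo, hi in zip(distinct, distinct[1:]):
        cut = (lo + hi) / 2
        below = [label for value, label in pairs if value < cut]
        above = [label for value, label in pairs if value >= cut]
        gain = information_gain(labels, [below, above])
        if gain > best[1]:
            best = (cut, gain)
    return best

ages = [25, 32, 47, 51, 62, 70]     # hypothetical continuous variable
outcome = ["live", "live", "live", "die", "die", "die"]
cut, gain = best_cutpoint(ages, outcome)
print(cut, gain)
```

Ranking every candidate test by its gain and keeping the best one is the greedy step; gain ratio or GINI could be substituted for `information_gain` without changing the search.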
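The pruning-set procedure can be sketched in Python as follows. The dictionary-based tree representation is hypothetical, and the sketch replaces a subtree whenever the proposed leaf is at least as accurate on the pruning set, which is a simplification of the "not significantly less accurate" criterion.

```python
# A tree is either a class label (leaf) or a dict of the form
# {"variable": name, "majority": label, "branches": {value: subtree}},
# where "majority" is the most frequent training label at that node.

def classify(tree, instance):
    """Descend to a leaf; fall back to the node majority on unseen values."""
    while isinstance(tree, dict):
        tree = tree["branches"].get(instance[tree["variable"]], tree["majority"])
    return tree

def accuracy(tree, instances):
    """Fraction of (features, label) pairs that the tree labels correctly."""
    if not instances:
        return 1.0
    return sum(classify(tree, f) == y for f, y in instances) / len(instances)

def prune(tree, pruning_set):
    """Postorder pass: prune the children first, then replace this subtree by a
    leaf if the leaf is at least as accurate on the pruning instances reaching it."""
    if not isinstance(tree, dict):
        return tree
    pruned = dict(tree, branches={})
    for value, subtree in tree["branches"].items():
        reaching = [(f, y) for f, y in pruning_set if f[tree["variable"]] == value]
        pruned["branches"][value] = prune(subtree, reaching)
    leaf = tree["majority"]
    if accuracy(leaf, pruning_set) >= accuracy(pruned, pruning_set):
        return leaf
    return pruned

# Invented example: the "windy" test was fitted to a noisy training label.
overfit_tree = {
    "variable": "outlook", "majority": "play",
    "branches": {
        "sunny": "play",
        "rainy": {"variable": "windy", "majority": "play",
                  "branches": {"no": "play", "yes": "stay"}},
    },
}
pruning_set = [
    ({"outlook": "rainy", "windy": "yes"}, "play"),
    ({"outlook": "rainy", "windy": "no"}, "play"),
    ({"outlook": "sunny", "windy": "no"}, "play"),
]
print(prune(overfit_tree, pruning_set))
```

A subtree that no pruning instance reaches is also collapsed here, since an empty set gives the leaf and the subtree equal accuracy; a stricter criterion could leave such subtrees untouched.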
Decompositional Strategies

See FUNCTIONAL DECOMPOSITION

Deductive Reasoning

Deductive reasoning is a branch of cognitive psychology investigating people’s ability to recognize a special relation between statements. Deductive LOGIC is a branch of philosophy and mathematics investigating the same relation. We can call this relation entailment, and it holds between a set of statements (the premises) and a further statement (the conclusion) if the conclusion must be true whenever all the premises are true. Consider the premises “Calvin bites his nails while working” and “Calvin is working” and the conclusion “Calvin is biting his nails.” Because the latter statement must be true whenever both the former statements are, these premises entail the conclusion. By contrast, the premises “Calvin bites his nails while working” and “Calvin is biting his nails” do not entail “Calvin is working,” inasmuch as it is possible that Calvin bites his nails off the job.

Historically, logicians have constructed systems that describe entailments among statements in some domain of discourse. To compare these systems to human intuition, psychologists present to their subjects arguments (premise-conclusion combinations), some of which embody the target entailments. The psychologists ask the subjects to identify those arguments in which the conclusion “follows logically” from the premises (or in which the conclusion “must be true whenever the premises are true”). Alternatively, the psychologist can present just the premises and ask the subjects to produce a conclusion that logically follows from them. Whether a subject’s answer is correct or incorrect is usually determined by comparing it to the dictates of the logic system.

from nonentailments, but factors such as these affect all human thinking and tell us nothing new about deductive reasoning.

Investigating the role of entailment in thought requires some degree of abstraction from everyday cognitive foibles. But it is not always easy to say how far such abstraction should go. According to GRICE (1989), Lewis (1979), and Sperber and Wilson (1986), among others, ordinary conversational settings impose restrictions on what people say, restrictions that can override entailments or supplement entailments (see PRAGMATICS). If Martha says, “Some of my in-laws are honest,” we would probably understand her to imply that some of her in-laws are dishonest. This follows from a conversational principle that enjoins her to make the most informative statement available, all else being equal. If Martha believes that all her in-laws are honest, she should have said so; because she did not say so, we infer that she believes not all are honest. We draw this conversational IMPLICATURE even if we recognize that “Some of my in-laws are honest” does not entail “Some of my in-laws are dishonest.”

Experimental results suggest that people do not abandon their conversational principles when they become subjects in reasoning experiments (e.g., Fillenbaum 1977; Sperber, Cara, and Girotto 1995). Moreover, conversational implicatures are just one among many types of nondeductive inferences that people routinely employ. In many situations it is satisfactory to reach conclusions that are plausible on the basis of current evidence, but where the conclusion is not necessarily true when the evidence is. It is reasonable to conclude from “Asteroid gamma-315 contains carbon compounds” that “Asteroid gamma-359 contains carbon compounds,” even though the first statement might be true and the second false. Attempts to reduce these plausible inferences to entailments have not been successful (as Osherson, Smith, and Shafir 1986 have argued).

Subjects sometimes misidentify these plausible arguments as deductively correct, and psychologists have labeled this tendency a content effect (Evans 1989 contains a review of such effects). These content effects, of course, do not mean that people have no grasp of individual entailments (see, e.g., Braine, Reiser, and Rumain 1984; Johnson-Laird and Byrne 1991; and Rips 1994, for evidence concerning people’s mastery of entailments that depend on logical constants such as “and,” “if,” “or,” “not,” “for all,” and “for some”). Subjects may rely on plausible inferences when it becomes difficult for them to judge whether an entailment holds; they may rely on entailments only when the context pushes them to do so; or they may falsely believe that the experiment calls for plausible inferences rather than for entailments.

However, if there is a principled distinction between
entailments and other inference relations and if people rou- One purpose of investigating people’s ability to recog- tinely fail to observe this distinction, then perhaps they have nize entailments is to find out what light (if any) entail- difficulties with the concept of entailment itself. Some psy- ment sheds on thinking and to use these findings as a chologists and some philosophers believe that there is no basis for revising theories of logic and theories of mind. reasoning process that is distinctive to entailments (e.g., Given this goal, certain differences between entailments Harman 1986). Some may believe that people (at least those and psychological decisions about them are uninforma- without special logic or math training) have no proper con- tive. Subjects’ inattention, memory limits, and time limits cept of entailment that distinguishes it from other inference all restrict their success in distinguishing entailments relations. The evidence is clouded here, however, by meth- 226 Deficits odological issues (see Cohen 1981). For example, psycholo- Cohen, L. J. (1981). Can human irrationality be experimentally demonstrated? Behavioral and Brain Sciences 4: 317–370. gists rarely bother to give their subjects a full explanation of Evans, J. St. B. T. (1989). Bias in Human Reasoning. Hillsdale, NJ: entailment, relying instead on phrases like “logically fol- Erlbaum. lows.” Perhaps subjects interpret these instructions as equiv- Fillenbaum, S. (1977). Mind your p’s and q’s: the role of content alent to the vaguer sort of relation indicated in natural and context in some uses of and, or and if. In G. H. Bower, Ed., language by “therefore” or “thus.” Psychology of Learning and Motivation, vol. 11. Orlando: Aca- The problem of whether people distinguish entailments demic Press. is complicated on the logic side by the existence of multiple Goodman, N. (1965). Fact, Fiction, and Forecast. 2nd ed. India- logic systems (see MODAL LOGIC). 
There is no one logic that napolis: Bobbs-Merrill. captures all purported entailments, but many proposed sys- Grice, H. P. (1989). Studies in the Way of Words. Cambridge, MA: tems that formalize different domains. Some systems are Harvard University Press. Harman, G. (1986). Change in View. Cambridge, MA: MIT Press. supersets of others, adding new logical constants to a core Johnson-Laird, P. N., and R. M. J. Byrne. (1991). Deduction. Hills- logic in order to describe entailments for concepts like time, dale, NJ: Erlbaum. belief and knowledge, or permission and obligation. Other Lemmon, E. J. (1959). Is there only one correct system of modal systems are rival formulations of the same domain. Psychol- logic. Proceedings of the Aristotelian Society 23: 23–40. ogists sometimes take subjects’ rejection of a specific logic Lewis, D. (1979). Score keeping in a language game. Journal of principle as evidence of failure in the subjects’ reasoning; Philosophical Logic 8: 339–359. however, some such rejections may be the result of an incor- Osherson, D. N., E. E. Smith, and E. B. Shafir. (1986). Some ori- rect choice of a logic standard. According to many philoso- gins of belief. Cognition 24: 197–224. phers (e.g., Goodman 1965), justification of a logic system Rips, L. J. (1994). The Psychology of Proof. Cambridge, MA: MIT depends in part on how close it comes to human intuition. If Press. Sperber, D., F. Cara, and V. Girotto. (1995). Relevance theory so, subjects’ performance may sometimes be grounds for explains the selection task. Cognition 57: 31–95. revision in logic. Sperber, D., and D. Wilson. (1986). Relevance. Cambridge, MA: The variety of logic systems also raises the issue of Harvard Press. whether human intuitions about entailment are similarly varied. According to one view, the intuitions belong to a unified set that incorporates the many different types of Further Readings entailment. Within this set, people may recognize entail- Braine, M. D. S., and D. P. 
O’Brien. (1998). Mental Logic. Mah- ments that are specialized for broad domains, such as time, wah, NJ: Erlbaum. obligation, and so on; but intuitions about each domain are Cheng, P. W., K. J. Holyoak, R. E. Nisbett, and L. M. Oliver. internally consistent. Rival analyses of a specific constant (1986). Pragmatic versus syntactic approaches to training (e.g., “it ought to be the case that . . .”) compete for which deductive reasoning. Cognitive Psychology 18: 293–328. gives the best account of reasoning. According to a second Evans, J. St. B. T., S. E. Newstead, and R. M. J. Byrne. (1993). view, however, there are many different intuitions about Human Reasoning. Hillsdale, NJ: Erlbaum. entailment, even within a single domain. Rival analyses for Nisbett, R. E. (1993). Rules for Reasoning. Hillsdale, NJ: Erlbaum. “it ought to be the case that. . .” may then correspond to dif- Oaksford, M., and N. Chater. (1994). A rational analysis of the ferent (psychologically real) concepts of obligation, each selection task as optimal data selection. Psychological Review with its associated inferences (cf. Lemmon 1959). 101: 608–631. Polk, T. A., and A. Newell. (1995). Deduction as verbal reasoning. The first view lends itself to a theory in which people Psychological Review 102: 533–566. automatically translate natural language arguments into a single LOGICAL FORM on which inference procedures oper- ate. The second view suggests a more complicated process: Deficits when subjects decide whether a natural language argument is deductively correct, they may perform a kind of model- See MODELING NEUROPSYCHOLOGICAL DEFICITS fitting, determining if any of their mental inference pack- ages makes the argument come out right (as Miriam Bassok has suggested, personal communication, 1996). Both views Deixis have their advantages, and both deserve a closer look. 
See also CAUSAL REASONING; EVOLUTIONARY PSYCHOL- See INDEXICALS AND DEMONSTRATIVES OGY; INDUCTION; LOGICAL REASONING SYSTEMS; MENTAL MODELS; NONMONOTONIC LOGICS Demonstratives —Lance J. Rips See INDEXICALS AND DEMONSTRATIVES References Dendrite Braine, M. D. S., B. J. Reiser, and B. Rumain. (1984). Some empirical justification for a theory of natural propositional rea- soning. In G. H. Bower, Ed., Psychology of Learning and Moti- See COMPUTATIONAL NEUROSCIENCE; NEURON vation, vol. 18. Orlando: Academic Press. Depth Perception 227 Density Estimation See UNSUPERVISED LEARNING Depth Perception Depth perception is one of the oldest problems of philoso- phy and experimental psychology. It has always intrigued people because of the two-dimensionality of the retinal image, although this is not really relevant because, as Des- cartes (1637) realized, we do not perceive the retinal image. DESCARTES was one of the first to suggest that depth could be computed from changes in the accommodation and con- vergence of the eyes. Accommodation is the focusing of the lenses and convergence is the inward/outward turning of the eyes stimulated by a change in the distance of the object of regard. Unfortunately, convergence and accommodation vary little beyond a meter or two and we clearly have a sense of depth well beyond that. Cutting and Vishton (1995) Figure 1. note that different cues to depth seem to be operative for near (personal) space, ambient (action) space, and vista (far) points of all sets of parallel lines on a planar surface.) For space. Convergence and accommodation clearly apply best example, the angular extent in the optic array of a location to the space approximately within arm’s reach, but even in on a surface and the horizon of that surface specifies the that region their effectiveness in giving an impression of absolute distance of that location to a scale factor given by absolute distance varies among persons and is imprecise. the observer’s eye height. 
Furthermore, the angular dis- Interposition, or the hiding of one object by another, cre- tances from two locations on a surface to the horizon can ates an ordinal sense of depth at all distances. But how is give the relative distances of those locations independently interposition specified in visual stimulation? Interposition of eye height (Sedgwick 1986). The horizon can be implic- or occlusion can be indicated monocularly by such contour itly specified by converging lines even if they do not extend arrangements as T junctions and alignments (Kanisza 1979) to a vanishing point and also by randomly arranged ele- or binocularly by the fact that parts of background objects or ments of finite uniform size (see figure 1). regions are hidden from one eye and not the other It is generally agreed that the familiar size of isolated (Leonardo da Vinci 1505). Current evidence suggests that objects is not used to derive distance from their angular size unlike monocular occlusion cues, binocular occlusion cues although this is possible in principle. However, relative size, can elicit a sense of metric depth (Gillam, Blackburn, and especially for objects of a similar shape, is an excellent cue Nakayama 1999). to relative distance. Likewise, an object changing size is It is generally found that people are quite accurate in normally seen as looming or receding in depth. judging ambient distance (up to approximately twenty feet). Parallax, defined as the difference in the projection of a This is typically demonstrated by having them survey the static scene as viewpoint changes relative to the scene, is a scene, close their eyes, and walk to a predesignated object potent source of information about depth. Motion parallax (Loomis et al. 1992). 
Perhaps the most important source of refers to successive viewpoint changes produced by motion distance information in ambient and far space is spatial lay- of the observer, whereas binocular parallax refers to the out information of the kind first analyzed by James Jerome simultaneous differences in viewpoint provided by two hori- GIBSON (1950, 1966), although much of the underlying per- zontally separated eyes. The disparate images thus produced spective geometry was known by artists of the Renaissance result in “stereoscopic vision.” Wheatstone (1838), who dis- (Alberti 1435). Gibson pointed out that objects are nearly covered stereoscopic vision, showed that the projections of a always located on a ground plane and that if the ground is scene to the two eyes differ in a number of ways (see figure homogeneously textured, it provides a scale in the “optic 2), and there is some evidence that the visual system array” (the projection of the layout to a station point) which responds directly to certain higher order image differences can be used to compare distances between elements in any such as contour orientation. Binocular disparity is usually direction at any distance. If the size of the units of texture specified, however, as the difference in the horizontal angles are known, for example by their relationship to the observer, subtended at the two eyes by two points separated in depth. the scale may also specify absolute distance. In practice it Stereoscopic vision really comes into its own in near has been found that random dot textures give a much poorer vision where it is important in eye-hand coordination, and in sense of depth than regular textures, especially regular tex- ambient vision where it allows discrete elements that do not tures that include lines converging toward a vanishing point. provide perspective cues, such as the leaves and branches of Vanishing points and horizons provide depth information in a forest, to be seen in vivid relief. 
The disparity produced by their own right. (A horizon is the locus of the vanishing a given depth interval declines rapidly as the distance of the 228 Depth Perception Figure 2. Descartes, René 229 interval from the observer increases. Nevertheless it is pos- Leonardo da Vinci. (1505). Codex Manuscript D. In the Biblio- thèque Nationale, Paris. English translation in D. S. Strong sible with moderate stereoscopic acuity to detect that an (1979), Leonardo on the Eye. New York: Garland, pp 41–92. object at about five hundred meters is nearer than infinity. Loomis, J. M., J. A. Da Silva, N. Fujita, and S. S. Fukusima. Because the binocular disparity produced by a given depth (1992). Visual space perception and visually directed action. interval declines with its distance, the depth response to dis- Journal of Experimental Psychology: Human Perception and parity must be scaled for distance to reflect depth accu- Performance 18(4): 906–921. rately. Scaling has largely been studied at close distances Rogers, B. J., and M. E. Graham. (1979). Motion parallax as an where it is excellent under full cue conditions although it is independent cue for depth perception. Perception 8: 125–134. not yet clear how the scaling is achieved (Howard and Rog- Sedgwick, H. A. (1986). Space perception. In K. Boff, L. Kauf- ers 1995). The accuracy of scaling in vista space is not man, and J. Thomas, Eds., Handbook of Perception and Human known. Stereoscopic depth is best for objects that are later- Performance, vol. 1. New York: John Wiley and Sons. Westheimer, G. (1979) Cooperative neural processes involved in ally close to each other (Gogel 1963). At such separations stereoscopic acuity. Experimental Brain Research 36: 585–597. stereoscopic depth is a “hyperacuity” because disparities of Wheatstone, C. (1838). Contributions to the physiology of vision: only 10–30 sec of arc can be responded to as a depth separa- 1. 
On some remarkable and hitherto unobserved phenomena of tion (Westheimer 1979). binocular vision. Philisophical Transactions of the Royal Soci- Despite the fact that binocular and motion parallax have ety, London 128: 371–394. identical geometry, stereopsis is the superior sense for depth perception under most conditions, especially when there are Descartes, René only two objects separated in depth. Motion parallax is almost as good as stereopsis, however, in eliciting percep- tion of depth in densely patterned corrugated surfaces (Rog- A dominant figure of mid-seventeenth century philosophy ers and Graham 1979). A strong sense of solidity is also and science, René Descartes (1596–1650) developed a obtained monocularly when a skeletal object, such as a tan- sweepingly anti-Aristotelian, mechanist theory of nature, gled wire, is rotated in depth and viewed with a stationary while also advocating a conception of the “rational soul” as eye (kinetic depth effect). The depth variations in densely a distinct, immaterial entity, endowed by God with certain textured surfaces can also be perceived on the basis of the innate intellectual concepts. Generations of anglophone phi- monocular transformations they undergo during motion. losophers have tended (with some notable exceptions) to Many of the possible sources of information about depth construe Descartes’s importance as deriving mainly from have not yet been adequately investigated, especially the his radical development of problems of scepticism, and his sources used in ambient and vista space. There are also sharp dualistic distinction between mind and body, in his unresolved theoretical issues such as the relationship central philosophical work, the Meditations on First Philos- between the apparent distances of objects to the observer ophy (1641). Today, however, this conception is often criti- and their apparent distances from each other. cized as historically naive. 
On one hand, the Meditations See also HIGH-LEVEL VISION; ILLUSIONS; MID-LEVEL themselves were intended by Descartes to provide a “foun- dation” for his comprehensive mechanistic account of natu- VISION; STRUCTURE FROM VISUAL INFORMATION SOURCES; ral phenomena; the doctrines and arguments of the work SURFACE PERCEPTION; VISUAL PROCESSING STREAMS need to be interpreted in this light. On the other hand, Des- —Barbara Gillam cartes’s vision of a comprehensive, unified theory of nature, grounded in a small number of “distinctly conceivable,” References quantitatively expressible concepts (especially size, figure Alberti, L. (1435). On Painting. Translated by J. R. Spencer, 1956. and motion) was in itself of incalculable significance in the New Haven, CT: Yale University Press. history of Western thought (cf. UNITY OF SCIENCE). Cutting, J. E., and P. Vishton. (1995). Perceiving layout and know- Prominent among Descartes’s aims as a systematic scien- ing distances: The integration, relative potency and contextual tist was the incorporation of biological phenomena (such as use of different information about depth. In W. Epstein and S. nutrition and growth), and many psychological phenomena Rogers, Eds., Perception of Space and Motion. San Diego: Aca- as well (such as reflex behaviors and some kinds of learn- demic Press, pp. 69–117. ing), in the universal mechanistic physics that he envisaged. Descartes, R. (1637). Discourse on Method, Optics, Geometry and Works in which he develops mechanist approaches to physi- Meteorology. Translated by Paul J. Olscamp, 1965. Indianapo- lis: Bobbs-Merrill. ology and psychology include the early Treatise on Man Gibson, J. J. (1950). The Perception of the Visual World. Boston: (published only after his death); the Dioptrics (published, Houghton Mifflin. together with the Geometry and Meteors, with the wide- Gibson, J. J. (1966). The Senses Considered as Perceptual Systems. 
ranging, partly autobiographical work, Discourse on the Boston: Houghton Mifflin. Method of Rightly Conducting One’s Reason and Seeking Gillam, B. J., S. Blackburn, and K. Nakayama. (1999). Unpaired Truth in the Sciences, 1637); parts of the compendious Prin- background stereopsis: Metrical encoding of depth and slant ciples of Philosophy (1644); and the late Passions of the without matching contours. To appear in Vision Research. Soul (1649). Gogel, W. C. (1963). The visual perception of size and distance. Basic to Descartes’s approach to the understanding of Vision Research 3: 101–120. animal (including human) behavior is the notion that one Howard, I. P., and B. J. Rogers. (1995). Binocular Vision and Ste- reopsis. New York: Oxford University Press. should push mechanistic-materialist explanations as far as Kanisza, G. (1979). Organization in Vision. New York: Praeger. one can. From early in his career he proclaimed that all the 230 Detectors behavior of nonhuman animals (“brutes”) can be explained In the Meditations Descartes characterizes both sensa- in mechanistic terms. In the Discourse on the Method (Part tions and our intellectual apprehensions (such as our repre- V) he defended this position by arguing that the behavior of sentation of God) as ideas. The Cartesian theory of ideas brutes uniformly fails two tests that he considers to be cru- had great impact on subsequent philosophy. It involves cial to establishing the presence of some principle other complex notions about representation and misrepresentation than the strictly mechanistic. The first is the ability to that continue to attract the interest of philosophers and respond adaptively to a variety of challenging circum- scholars today (cf. MENTAL REPRESENTATION). stances in appropriate ways. 
A bird, for instance, might There is substantial—probably conclusive—evidence in seem to show more “skills” in building a nest that we can Descartes’s post-Meditations writings that he intended to summon; but so does a clock show more “skill” than we limit strictly mental phenomena to those states that are command in measuring time. Yet neither the bird nor the accessible to an individual’s conscious awareness. But this clock is able to respond to novel circumstances in the austere aspect of his mind-body dualism sits uneasily with inventive way characteristic of humans. Descartes’s other, some features of his impressive accounts of human visual more famous, test for the presence of a nonmechanistic perception in the first half of the Dioptrics; and of human principle is the ability to use language: to “express our emotions in the Passions of the Soul. In both works he not thoughts to others,” no matter what may be said in our only blends sensational with intellectual states in his presence. He acknowledges that brutes can utter cries and explanations of mental phenomena, but also invokes con- grunts that have some kind of communicative effect, but he siderations that are hard to apportion between the “con- stresses that these fall short, drastically, of the range and scious-mental” and “purely mechanistic-material” divide. versatility of human language. In the case of human beings, This is particularly true of his account of distance percep- however, behavioral adaptability and “true language” dem- tion in the Dioptrics, in which Descartes certainly seems onstrate the presence of a nonmechanistic principle, a con- to invoke a kind of COMPUTATION that cannot plausibly be scious rational soul. regarded as accessible to consciousness (cf. COMPUTA- Descartes’s Discourse conception of nonhuman animals TIONAL VISION). as “automata, or self-moving machines” is today sometimes See also NATIVISM, HISTORY OF; RATIONALISM VS. 
characterized as “mechanomorphism.” Widely rejected and EMPIRICISM even ridiculed in his lifetime, it remains a target, or stalking —Margaret D. Wilson horse, for contemporary advocates of animal intelligence, consciousness, and (in some species) perhaps language (cf. Further Readings COGNITIVE ETHOLOGY, PRIMATE LANGUAGE). In the Meditations (as anticipated by Part IV of the Dis- Chomsky, N. (1966). Cartesian Linguistics. New York: Harper and course) Descartes approaches issues of reason and con- Row. sciousness from a first-person rather than a behavioral Cottingham, J., Ed. (1992). The Cambridge Companion to Des- perspective. He argues, first, that all his apparent percep- cartes. Cambridge: Cambridge University Press. tions of a physical world are initially subject to doubt (in Descartes, R. (1984–85). Philosophical Writings. 3 vols. Trans. J. that they could in principle occur even if no physical world Cottingham, R. Stoothoff, D. Murdoch. Cambridge, England: existed). In fact, he is able to find reason to doubt even the Cambridge University Press. Des Cheyne, D. (1996). Physiologia: Natural Philosophy in Late simplest and most evident propositions. But second, even in Aristotelian and Cartesian Thought. Ithaca, NY: Cornell Uni- the face of such skepticism his own existence as a “thinking versity Press. thing” at least is indubitable. Later he argues that he is the Garber, D. (1992). Descartes’ Metaphysical Physics. Chicago: creature of a perfect God; hence his clearest and most dis- University of Chicago Press. tinct “perceptions” (mainly the deliverances of pure intel- Gaukroger, S. (1995). Descartes: An Intellectual Biography. lect) are beyond doubt. Finally he concludes that as a Oxford: Oxford University Press. thinking thing, he is a substance distinct from any body, Hoffman, P. (1996). Descartes on misrepresentation. Journal of the though (as he goes on to say) one at present closely joined History of Philosophy. 34: 357–381. 
with a body, with which he, as a mind, interacts (cf. SELF Smith, N. K. (1966). New Studies in the Philosophy of Descartes: and MIND-BODY PROBLEM). He further maintains that, given Descartes as Pioneer. London: Macmillan. Voss, S., Ed. (1993). Essays on the Philosophy and Science of the goodness of God, his normal conviction that his seeming René Descartes. New York: Oxford University Press. perceptions of bodies are in fact caused by external physical Wilson, M. D. (1995). Animal ideas. Proceedings of the American things must be correct. Philosophical Association 69: 7–25. Along the way, however, Descartes repeatedly under- Wilson, M. D. (1978). Descartes. London: Routledge and Kegan scores the point that bodies are not in fact as they appear Paul. in ordinary sense experience. For instance, their appear- Wolf-Devine, C. (1993). Descartes on Seeing. Carbondale: South- ances as colored are misleading, in that we tend to sup- ern Illinois University Press. pose that (say) green as we sensibly experience it is a real quality of some bodies; whereas in fact it is just a sensa- tion in our minds, resulting from the effect of external Detectors things (constituted of bits of matter in motion) on our ner- vous systems, and of the latter on us as immaterial mental substances. See FEATURE DETECTORS Discourse 231 1986; Roberts 1996). The information structure of a dis- Development course is far richer than its linguistic content alone. 
See COGNITIVE DEVELOPMENT; INFANT COGNITION; NATIVISM; NEURAL DEVELOPMENT

Developmental Language Disorders

See LANGUAGE IMPAIRMENT, DEVELOPMENTAL

Diachrony

See PARAMETER-SETTING APPROACHES TO ACQUISITION, CREOLIZATION, AND DIACHRONY

Discourse

Discourse is the ground of our experience of language and of linguistic meaning. It is in discourse that we learn language as children, and in discourse that we most adequately convey our thought. The individual utterances in a discourse are notoriously vague and full of potential AMBIGUITY. Yet in the context of the discourse in which they occur, vagueness and ambiguity are rarely a problem. That is, the overall discourse profoundly influences the interpretation of individual linguistic constituents within it, as witnessed by our discomfort with the ethics of taking what someone says out of context.

Discourse can be characterized in three principal ways. We may think of discourse as a type of event, in which human agents engage in a verbal exchange; in the limit case, the monologue, there is only one agent, but even then there is an intended audience, if only reflexive or imaginary. We may also think of discourse as the linguistic content of that exchange, an ordered string of words with their associated syntactic and prosodic structures. Or we may characterize discourse as the more complex structure of information that is presupposed and/or conveyed by the interlocutors during the course of the discourse event in view of the explicit linguistic content of the exchange. The information structure of a discourse may be characterized as an ordered set containing several distinguished kinds of information, for example: a set of discourse participants (the interlocutors); the linguistic content of the discourse, with each utterance indexed by the speaker; the information presupposed or proffered by speakers during the discourse via the linguistic content of their utterances; an association of the proffered information with various topics or questions under discussion, the topics and subtopics hierarchically organized by virtue of their relations to the (often only inferred) goals and intentions of the various speakers (the intentional structure of the discourse); a set of entities discussed (the domain of discourse); a changing set of the entities and topics which the interlocutors focus on during their discussion, organized as a function of time (the attentional structure of discourse); and other kinds of information and structures on information, as well (Lewis 1979; Grosz and Sidner 1986). Only the full range of contextual information for a given discourse, including a grasp of the interlocutors’ inferred intentions, the intended rhetorical relations among their utterances, and other nonlinguistically given information that they share, can resolve all the potential ambiguities in the linguistic strings uttered, clarify the often inexplicit connections between utterances, and lead us to grasp what the speaker(s) intend to convey.

These three ways of characterizing discourse (as an event revolving around verbal exchange, as the linguistic content of that exchange, and as the structure on the information involved) are not mutually exclusive; there is no verbal exchange without linguistic content, and the latter can be taken as one aspect of the abstract information structure of the exchange. However, most of the work on discourse in artificial intelligence, linguistics, philosophy of language, PSYCHOLINGUISTICS, sociology, and anthropology can be classified according to which of the three aspects it focuses on. For example, sociologists, sociolinguists, and ethnographers interested in conversational analysis (Sacks, Schegloff, and Jefferson 1978) focus on the discourse event itself and its social character, including the way that interlocutors organize their participation in such events in an orderly, apparently conventional manner, varying somewhat from culture to culture. Speakers take turns, the opportunity for taking a turn being cued by a number of conventional means (including set phrases, intonation, and pauses), and negotiate the beginning and end of a discourse as well as the shift from topic to topic within it. In sociologically informed anthropological linguistics (Duranti 1997), discourse events are taken to play a crucial role in the creation, reproduction, and legitimation of a community’s social alliances and cleavages. Those working in the tradition of discourse analysis (see van Dijk 1985; Carlson 1983) focus instead on the linguistic content of the verbal exchange, the text, some arguing that it is generated by the syntactic rules of a text grammar. But probably the majority of theorists who work on discourse today would agree that discourse is not a linguistic structure generated by a grammar, but instead is structured by nonlinguistic, logical, and intentional factors, aspects of what we have called the information structure of discourse.

A number of prima facie unrelated pragmatic phenomena depend on the information structure of discourse, suggesting that this aspect of discourse can provide the basis for a unified theory of the role of pragmatic factors in linguistic interpretation. Dynamic theories of semantic interpretation take the meaning of an utterance to be a function from contexts (the context of utterance) to contexts (the context of utterance updated with the information proffered in the utterance). One can view the three basic types of speech acts (assertions, questions, and orders) as functions that update the information structure of a discourse in different ways. Assertions update the information proffered in the discourse (see DYNAMIC SEMANTICS; Stalnaker 1979); a question sets up a new (sub)topic for discussion, hence affecting the intentional structure of the discourse (Ginzburg 1996; Roberts 1996); an order, if accepted, commits the person ordered to behave in accordance with the order, and this commitment is part of the information indirectly conveyed by the discourse.

Many secondary subtypes of these three basic types of speech acts have been proposed in the literature, including, for example, predictions, confirmations, concessives, and promises; rhetorical questions as well as probes for information; requests, permissions, advisories, and commands; and so forth (see Searle 1969). Work in artificial intelligence on plan inference (see Cohen, Morgan, and Pollack 1990) has argued that the secondary speech act type of an utterance can be derived from its basic speech act type and proffered content, its inferred role in the intentional structure of the discourse, and the inferred domain goals of the interlocutor at the time of utterance (i.e., her general goals at that time, not necessarily just those expressed in discourse).

In his important work on meaning in conversation, H. Paul GRICE argued that much of the MEANING conveyed in discourse is not directly proffered via the linguistic content of the utterances in the discourse, but instead follows from what is said in view of the intentions of the interlocutors and a set of guidelines, the conversational maxims, which characterize the rational behavior of agents in a communicative situation (see IMPLICATURE and RELEVANCE). Although the crucial role of interlocutors’ intentions is sometimes overlooked in older work on implicature, contemporary work (e.g., McCafferty 1987; Welker 1994) pays considerable attention to the role of that facet of the information structure of discourse that we have called its intentional structure in explaining how implicatures are generated. Similarly, work on ANAPHORA in discourse, especially from a computational point of view (e.g., Grosz and Sidner 1986), emphasizes the role both of intentional structure and of attentional structure in constraining the set of possible antecedents for a given pronoun or other anaphoric element across discourse; see also the related work on centering (Walker, Joshi, and Prince 1998). And the intentional structure of discourse may be seen to reflect strategies of inquiry which correspond to classical rhetorical structures (Mann and Thompson 1987), connecting work on the role of such structures in interpretation to the general notion of information structure.

Finally, the information structure of discourse is reflected in a number of ways in the structure of linguistic constituents: sentences and sentence fragments. These reflections involve the phenomena variously referred to as topic and comment, theme and rheme, link, and focus, among others. Some have argued that (some subset of) these notions play a role as primitive notions in syntactic structure (e.g., Sgall, Hajičová, and Panevová 1986; Vallduví 1992), while others have argued that they are, instead, only functional characterizations of the accidental role of syntactic constituents in particular discourse contexts. However, it is clear that particular sentential constructions (e.g., topicalization in English) and prosodic features (e.g., prosodic prominence) may be specially associated with these functions via associated conventional presuppositions about the information structure of the discourse in which they occur. Such linguistic structures, along with anaphora and ellipsis, are designed to increase discourse coherence and to help interlocutors keep track of the common ground and other features of the information structure of discourse.

See also FOCUS; PRAGMATICS; PRESUPPOSITION; PROSODY AND INTONATION; STRESS, LINGUISTIC

Craige Roberts

References

Carlson, L. (1983). Dialogue Games: An Approach to Discourse Analysis. Dordrecht: Reidel.
Cohen, P., J. Morgan, and M. Pollack. (1990). Intentions in Communication. Cambridge, MA: Bradford Books/MIT Press.
Duranti, A. (1997). Linguistic Anthropology. New York: Cambridge University Press.
Ginzburg, J. (1996). The semantics of interrogatives. In S. Lappin, Ed., The Handbook of Contemporary Semantic Theory, pp. 385–422. Oxford: Blackwell.
Grosz, B., and C. Sidner. (1986). Attention, intentions, and the structure of discourse. Computational Linguistics 12: 175–204.
Lewis, D. (1979). Score-keeping in a language game. In R. Bäuerle, U. Egli, and A. von Stechow, Eds., Semantics from a Different Point of View. Berlin: Springer.
Mann, W. C., and S. A. Thompson. (1987). Rhetorical Structure Theory: A Theory of Text Organization. Information Sciences Institute, University of Southern California.
McCafferty, A. (1987). Reasoning about Implicature. Ph.D. diss. University of Pittsburgh.
Roberts, C. (1996). Information structure: Towards an integrated theory of formal pragmatics. In Y.-H. Yoon and A. Kathol, Eds., OSU Working Papers in Linguistics, vol. 49: Papers in Semantics. Ohio State University Department of Linguistics.
Sacks, H., E. A. Schegloff, and G. Jefferson. (1978). A simplest systematics for the organization of turn-taking in conversation. In J. Schenkein, Ed., Studies in the Organization of Conversational Interaction. New York: Academic Press, pp. 7–55.
Searle, J. R. (1969). Speech Acts. Cambridge: Cambridge University Press.
Sgall, P., E. Hajičová, and J. Panevová. (1986). The Meaning of the Sentence in Its Semantic and Pragmatic Aspects. Dordrecht: Reidel.
Stalnaker, R. (1979). Assertion. In P. Cole, Ed., Syntax and Semantics. New York: Academic Press.
Vallduví, E. (1992). The Informational Component. New York: Garland.
van Dijk, T. (1985). Handbook of Discourse Analysis, vol. 3: Discourse and Dialogue. London: Academic Press.
Walker, M., A. Joshi, and E. Prince, Eds. (1998). Centering Theory in Discourse. Oxford: Oxford University Press.
Welker, K. (1994). Plans in the Common Ground: Toward a Generative Account of Implicature. Ph.D. diss. Ohio State University.

Further Readings

Brown, G., and G. Yule. (1983). Discourse Analysis. Cambridge: Cambridge University Press.
Büring, D. (1994). Topic. In Peter Bosch and Rob van der Sandt, Eds., Focus and Natural Language Processing. Heidelberg: IBM, pp. 271–280.
Goffman, E. (1981). Forms of Talk. Oxford: Basil Blackwell.
Halliday, M. A. K., and R. Hasan. (1976). Cohesion in English. London: Longman.
Kamp, H., and U. Reyle. (1993). From Discourse to Logic: An Introduction to Model-Theoretic Semantics of Natural Language, Formal Logic and Discourse Representation Theory. Dordrecht: Kluwer.
Polanyi, L., and R. J. H. Scha. (1983). The syntax of discourse. Text 3: 261–270.
Schiffrin, D. (1987). Discourse Markers. Cambridge: Cambridge University Press.
Searle, J. R., F. Kiefer, and M. Bierwisch, Eds. (1980). Speech Act Theory and Pragmatics. Dordrecht: Reidel.

Dissonance

Festinger’s (1957) theory of cognitive dissonance is, by far, the most prominent of several social-psychological theories based on the premise that people are motivated to seek consistency among their beliefs, attitudes, and actions (Abelson et al. 1968). It asserts that people find inconsistency, or “dissonance,” among their cognitions to be emotionally aversive and seek to eliminate or reduce any inconsistency.

According to Festinger, cognitive dissonance is a tension state that arises whenever an individual simultaneously holds two or more cognitions that are mutually inconsistent. In this model, any two cognitions, considered by themselves, stand in one of three relations to one another: dissonant (contradictory or inconsistent), consonant (consistent), or irrelevant. The total amount of dissonance for a given person in any particular situation is defined as the ratio of dissonant relations to total relevant relations, with each relation weighted for its importance to that person:

Total Dissonance = (Dissonant Relations) / (Dissonant + Consonant Relations)

When cognitive dissonance arises, the individual is motivated to reduce the amount of dissonance. This can be done in many ways: by decreasing the number and/or the importance of dissonant relations, or by increasing the number and/or the importance of consonant relations. Precisely how dissonance is reduced in any particular situation depends on the resistance to change of the various cognitions involved. The resistance to change of any cognition in a particular context depends, in turn, on the extent to which a change would produce new dissonant relations, the degree to which the cognition is firmly anchored in reality, and the difficulty of changing those aspects of reality.

Consider a prototypic case: a cigarette smoker, circa 1960, encountering the first medical reports linking smoking to lung cancer, emphysema, and heart disease. His two cognitions, “I smoke,” and, “Smoking causes serious diseases,” are dissonant because both cognitions are of substantial importance to him and are inconsistent. To reduce this dissonance, he could quit smoking or, if this proved too difficult, could cut back on cigarettes or switch to a brand with lower tar and nicotine. Similarly, he could question the significance of “merely statistical” evidence regarding smoking and disease processes, downplay the relevance or importance of such evidence to his personal situation, avoid subsequent medical reports on the topic, and/or exaggerate the pleasures and positive consequences of smoking (e.g., how it helps him relax or control his weight).

Festinger (1957) used dissonance theory to account for a wide array of psychological phenomena, ranging from the transmission of rumors following disasters to the rationalization of everyday decisions, the consequences of counterattitudinal advocacy, selectivity in information search and interpretation, and responses to the disconfirmation of central beliefs. Of the many new research directions produced by the theory, three paradigms proved most influential.

The first major paradigm involved the attitudinal consequences of making a decision (Brehm 1956). Any choice among mutually exclusive options is postulated to produce dissonance, because any nonoverlapping bad features of the chosen alternative(s) or good features of the rejected alternative(s) are dissonant with the choice itself. To reduce this postdecisional dissonance, the individual is likely to exaggerate the advantages of the option(s) selected and to disparage the advantages of the option(s) rejected. Through this postdecisional reevaluation and “spreading apart” of the alternatives, individuals come to rationalize their decisions. Such effects appear most strongly when the decision is both difficult and irrevocable, and have proved to be highly replicable across a variety of decision contexts (Festinger 1964; Wicklund and Brehm 1976).

A second popular paradigm concerned the selectivity of information-seeking following a decision (Ehrlich et al. 1957). If the decision is difficult to undo (and not all of the attendant dissonance has already been reduced through reevaluation of the alternatives), the individual is motivated both to avoid subsequent information that seems likely to be dissonant with that decision and to seek out subsequent information that seems likely to support that decision. Empirical evidence concerning this aspect of the theory, however, has been much more mixed; and the precise conditions under which such selective exposure effects occur remain unclear (Freedman and Sears 1965; Frey 1986).

Finally, the third and most influential paradigm examined the effects of “forced compliance,” in which an individual is induced to engage in some counterattitudinal action with minimal, “psychologically insufficient,” external coercion or incentive (e.g., Aronson and Carlsmith 1963; Festinger and Carlsmith 1959). In this case, dissonance derives from a conflict between the person’s action and attitudes. To the extent that an overt action is harder to change than a personal opinion, attitudes are changed to conform more closely to behavior. The less the external pressure used to induce the behavior, the more such subsequent justification of one’s actions occurs, because any external pressures provide added consonant relations. Such “insufficient justification” effects have been observed, under free-choice conditions, in a wide variety of situations, particularly when the person’s counterattitudinal behavior has aversive consequences for which he/she feels personally responsible (Cooper and Fazio 1984; Harmon-Jones and Mills forthcoming; Lepper 1983).

More recent research on cognitive dissonance has emphasized three additional issues. Some authors have focused on the role of physiological arousal (e.g., Cooper and Fazio 1984) and psychological discomfort (e.g., Elliot and Devine 1994) in the production and reduction of cognitive dissonance, showing the importance of the motivational factors that distinguish dissonance theory from self-perception theory (Bem 1967, 1972) and other nonmotivational alternative explanations. Others have emphasized the importance of the self-concept in cognitive dissonance, arguing that dissonance effects may depend on threats to one’s self-concept and may be alleviated by procedures that affirm the SELF (e.g., Steele 1988; Thibodeau and Aronson 1992).

Most recently, computational models of dissonance reduction have sought to quantify dissonance more precisely and have simulated many of the subtleties of psychological findings (e.g., Read and Miller 1994; Shultz and Lepper 1996). These models use artificial NEURAL NETWORKS that treat dissonance reduction as a gradual process of satisfying constraints imposed on the relationships among beliefs by a motive for cognitive consistency. Their success suggests that dissonance, rather than being exotic and unique, may have much in common with other psychological phenomena (e.g., memory retrieval or analogical reasoning) that can also be understood in constraint-satisfaction terms.

The general success of dissonance theory, and the particular power of the “reevaluation of alternatives” and “insufficient justification” paradigms, seems to derive, in large part, from the breadth of the theory and from the ways that apparently “rational” consistency-seeking can, under certain conditions, produce unexpectedly “irrational” changes in actions and attitudes.

See also ATTRIBUTION THEORY; DECISION MAKING; MOTIVATION; MOTIVATION AND CULTURE; SOCIAL COGNITION

Mark R. Lepper and Thomas R. Shultz

References

Abelson, R. P., E. Aronson, W. J. McGuire, T. M. Newcomb, M. J. Rosenberg, and P. H. Tannenbaum, Eds. (1968). Theories of Cognitive Consistency: A Sourcebook. Chicago: Rand McNally.
Aronson, E. (1969). The theory of cognitive dissonance: A current perspective. In L. Berkowitz, Ed., Advances in Experimental Social Psychology, vol. 4. New York: Academic Press, pp. 1–34.
Aronson, E., and J. M. Carlsmith. (1963). Effect of severity of threat on the devaluation of forbidden behavior. Journal of Abnormal and Social Psychology 66: 584–588.
Bem, D. J. (1967). Self-perception: An alternative interpretation of cognitive dissonance phenomena. Psychological Review 74: 183–200.
Bem, D. J. (1972). Self-perception theory. In L. Berkowitz, Ed., Advances in Experimental Social Psychology, vol. 6. New York: Academic Press, pp. 1–62.
Brehm, J. W. (1956). Post-decision changes in the desirability of choice alternatives. Journal of Abnormal and Social Psychology 52: 384–389.
Cooper, J., and R. H. Fazio. (1984). A new look at dissonance theory. In L. Berkowitz, Ed., Advances in Experimental Social Psychology, vol. 17. New York: Academic Press, pp. 229–266.
Ehrlich, D., I. Guttman, P. Schoenbach, and J. Mills. (1957). Post-decision exposure to relevant information. Journal of Abnormal and Social Psychology 54: 98–102.
Elliot, A. J., and P. G. Devine. (1994). On the motivational nature of cognitive dissonance: Dissonance as psychological discomfort. Journal of Personality and Social Psychology 67: 382–394.
Festinger, L. (1957). A Theory of Cognitive Dissonance. Evanston, IL: Row, Peterson.
Festinger, L., Ed. (1964). Conflict, Decision, and Dissonance. Stanford, CA: Stanford University Press.
Festinger, L., and J. M. Carlsmith. (1959). Cognitive consequences of forced compliance. Journal of Abnormal and Social Psychology 58: 203–210.
Freedman, J. L., and D. O. Sears. (1965). Selective exposure. In L. Berkowitz, Ed., Advances in Experimental Social Psychology, vol. 2. New York: Academic Press, pp. 57–97.
Frey, D. (1986). Recent research on selective exposure to information. In L. Berkowitz, Ed., Advances in Experimental Social Psychology, vol. 19. New York: Academic Press, pp. 41–80.
Harmon-Jones, E., and J. Mills, Eds. (Forthcoming). Dissonance theory: Twenty-five years later. Washington, DC: American Psychological Association.
Lepper, M. R. (1983). Social-control processes and the internalization of values: An attributional perspective. In E. T. Higgins, D. N. Ruble, and W. W. Hartup, Eds., Social Cognition and Social Development. New York: Cambridge University Press, pp. 294–330.
Read, S. J., and L. C. Miller. (1994). Dissonance and balance in belief systems: The promise of parallel constraint satisfaction processes and connectionist modeling approaches. In R. C. Schank and E. Langer, Eds., Beliefs, Reasoning, and Decision Making: Psycho-logic in Honor of Bob Abelson. Hillsdale, NJ: Erlbaum, pp. 209–235.
Shultz, T. R., and M. R. Lepper. (1996). Cognitive dissonance reduction as constraint satisfaction. Psychological Review 103: 219–240.
Steele, C. M. (1988). The psychology of self-affirmation: Sustaining the integrity of the self. In L. Berkowitz, Ed., Advances in Experimental Social Psychology, vol. 21. New York: Academic Press, pp. 261–302.
Thibodeau, R., and E. Aronson. (1992). Taking a closer look: Reasserting the role of the self-concept in dissonance theory. Personality and Social Psychology Bulletin 18: 591–602.
Wicklund, R. A., and J. Brehm. (1976). Perspectives on Cognitive Dissonance. Hillsdale, NJ: Erlbaum.

Distinctive Features

Every speech sound shares some articulatory and acoustic properties with other speech sounds. For example, the consonant [n] shares nasality with [m], complete oral closure with the set [pbmtdkg], and an elevated tongue-tip with the set [tdsz].

Most contemporary theories of PHONOLOGY posit a universal set of distinctive features to encode these shared properties in the representation of the speech sounds themselves. The hypothesis is that speech sounds are represented mentally by their values for binary distinctive features, and that a single set of about twenty such features suffices for all spoken languages. Thus, the distinctive features, rather than the sounds built from them, are the primitives of phonological description. The sound we write as [n] is actually a bundle of distinctive feature values, such as [+nasal], [–continuant] (complete oral closure), and [+coronal] (elevated tongue-tip).

Three principal arguments can be presented in support of this hypothesis:

1. The union of the sound systems of all spoken languages is a smaller set than the physical capabilities of the human vocal and auditory systems would lead one to expect. The notion “possible speech sound” is defined by higher-level cognitive requirements (the distinctive features) and not lower-level physiological considerations.
2. Distinctive features help to explain the structure of sound systems. For example, many languages have no sounds from the set [bdg], but if a language has one of them, it is likely to have all of them. These sounds are all [+voice] (referring to the presence of vocal fold vibration); having the full [bdg] set together in a language maximizes the cross-classificatory effect of that distinctive feature.
3. PHONOLOGICAL RULES AND PROCESSES depend on the classes of sounds defined by distinctive feature values, and so the notion “possible phonological process” is, in part, determined by the universal feature theory. The English plural suffix is a typical example. This suffix agrees in the value of [voice] with the sound at the end of the noun: [–voice] in caps, chiefs, cats, tacks versus [+voice] in labs, shelves, pads, bags. This suffix is pronounced with a vowel if the noun ends in a [+strident] consonant, characterized by turbulent airflow and consequent [s]-like hissing noise: passes, roses, lashes, garages. Classes like these ([+voice], [–voice], and [+strident]) are frequently encountered in the phonological processes of the world’s languages. In contrast, logically possible but featurally arbitrary classes like [pbsk] are rarely or never needed to describe phonological processes.

These considerations not only support the claim that there must be some set of universal distinctive features; in their particulars, they also serve as the principal basis for determining what is the correct set of distinctive features. Primarily, arguments in support of a feature theory turn on how well it explains the observed structure of sound systems and of well-attested phonological processes. Secondarily, the correct feature theory should support a plausible interface between phonology on one hand and the PHONETICS of ARTICULATION and SPEECH PERCEPTION on the other. This prioritization of phonological evidence over phonetic is appropriate because a theory of distinctive features is, above all, a claim about the mind and not about the mouth or the ear.

The idea that speech sounds can be classified in phonologically relevant ways goes back to antiquity, but the concept of a universal classification is a product of the twentieth century. It emerges from the work of the prewar Prague School theorists, principally N. S. Trubetzkoy and Roman JAKOBSON, who sought to explain the nature of possible phonological contrasts in sound systems. The first fully elaborated theory of distinctive features appeared with the publication in 1952 of Jakobson, Fant, and Halle’s Preliminaries to Speech Analysis. The Preliminaries features are defined in acoustic terms; that is, they are descriptions of the spectral properties of speech sounds. This model was largely superseded in 1968 by the distinctive feature system of Chomsky and Halle’s The Sound Pattern of English (SPE). Nearly all of the SPE features are defined in articulatory terms; that is, they are descriptions of vocal tract configurations during the production of speech sounds. Despite these differences of definition, the empirical consequences of the SPE model do not differ dramatically from the Preliminaries model.

There has been no single broad synthesis of feature theory since SPE, but there have been many significant developments in specific areas. The most important is the emergence of autosegmental or nonlinear phonology, with its fundamental thesis that distinctive features are, like TONES, independent objects not necessarily tied to any particular speech sound. In the South American language Terena, the feature [+nasal] is, by itself, the first person prefix; for example, [owoku] “house” becomes “my house” by attaching [+nasal] to the initial [owo] sequence. This freeing of distinctive features from individual speech sounds has yielded new insights into the nature of the most common phonological process, assimilation (where one sound takes on features from a nearby sound).

A further evolution of the autosegmental view is the theory of feature geometry, which asserts that the distinctive features are hierarchically organized into functionally related classes. The features that characterize states of the larynx, for instance, appear to have a considerable degree of functional cohesion in phonological systems. This leads to the positing of a kind of metafeature [Laryngeal], which has within its scope [voice] and other features.

Along other lines, an improved understanding of feature theory has been achieved through the study of particular types of features (such as those pertaining to the larynx or to degree of oral constriction) and of particular groups of speech sounds (such as the various [l]- and [r]-like sounds of the world’s languages). Much has also been achieved by considering alternatives to binary features, in the direction of both single-valued features (marked only by their presence) and ternary or higher-order features (which are particularly useful for characterizing some natural scales, like degree of oral constriction or tongue height).

Finally, research on SIGN LANGUAGES has shown that they too have representations composed of distinctive features. Thus, while the distinctive features of spoken languages are modality-specific, the existence of a featural level of representation apparently is not.

See also AUDITION; INNATENESS OF LANGUAGE; PHONOLOGY, ACQUISITION OF; PHONOLOGY, NEURAL BASIS OF

John McCarthy

References

Chomsky, N., and M. Halle. (1968). The Sound Pattern of English. New York: Harper and Row. (Reprinted MIT Press, 1991.)
Clements, G. N. (1985). The geometry of phonological features. Phonology Yearbook 2: 225–252.
Jakobson, R., C. G. M. Fant, and M. Halle. (1952). Preliminaries to Speech Analysis. Cambridge, MA: MIT Press.
Keating, P. (1987). A survey of phonological features. UCLA Working Papers in Phonetics 66: 124–150.
Ladefoged, P., and I. Maddieson. (1996). The Sounds of the World’s Languages. Oxford: Blackwell.
McCarthy, J. J. (1988). Feature geometry and dependency: A review. Phonetica 45: 84–108.

Further Readings

Browman, C., and L. Goldstein. (1992). Articulatory Phonology: An overview. Phonetica 49: 155–180.
Clements, G. N., and E. Hume. (1995). The internal organization of speech sounds. In J. Goldsmith, Ed., The Handbook of Phonological Theory. Oxford: Blackwell, pp. 245–306.
Gnanadesikan, A. (1997). Phonology with Ternary Scales. Ph.D. diss. University of Massachusetts, Amherst.
Goldsmith, J. A. (1990). Autosegmental and Metrical Phonology. Oxford: Blackwell.
Halle, M. (1983). On distinctive features and their articulatory implementation. Natural Language and Linguistic Theory 1: 91–105.
Hulst, H. v. d. (1989). Atoms of segmental structure: Components, gestures, and dependency. Phonology 6: 253–284.
Lombardi, L. (1994). Laryngeal Features and Laryngeal Neutralization. New York: Garland.
Padgett, J. (1995). Stricture in Feature Geometry. Stanford: CSLI Publications.
Sandler, W., Ed. (1993). Phonology: special issue on sign language phonology. Phonology 10: 165–306.
Schane, S. A. (1984). The fundamentals of particle phonology. Phonology Yearbook 1: 129–155.
Walsh, D. L. (1997). The Phonology of Liquids. Ph.D. diss. University of Massachusetts, Amherst.
Williamson, K. (1977). Multivalued features for consonants. Language 53: 843–871.

Distributed AI

See DISTRIBUTED VS. LOCAL REPRESENTATION; MULTIAGENT SYSTEMS

Distributed vs. Local Representation

A central problem for cognitive science is to understand how agents represent the information that enables them to behave in sophisticated ways. One long-standing concern is whether representation is localized or distributed (roughly, “spread out”). Two centuries ago Franz Josef Gall claimed that particular kinds of knowledge are stored in specific, discrete brain regions, whereas Pierre Flourens argued that all knowledge is spread across the entire cortex (Flourens 1824; Gall and Spurzheim 1809/1967). This debate has continued in various guises through to the present day (e.g., Farah 1994). Meanwhile, the concept of distribution has found mathematical elaboration in fields such as optics and psychology, and the rise of connectionist models has generated interest in a range of related technical and philosophical issues.

In the most basic sense, a distributed representation is one that is somehow “spread out” over some more-than-minimal extent of the resources available for representing. Unfortunately, however, this area is a semantic mess; the terms local and distributed are used in many different ways, often vaguely or ambiguously. Figure 1 sketches the most common meanings.

Figure 1. Seven ways to represent the word “cat,” illustrating varieties of local and distributed representation.

Suppose that we have some quantity of resources available for representing items, and that these resources are naturally divisible into minimal chunks or aspects. Connectionist neural processing units are obvious examples, but the discussion here is pitched at a very abstract level, and the term “unit” in what follows might just as well refer to bits in a digital computer memory, single index cards, synaptic interconnections, etc.

• Strictly Local: The item (in this case, the word “cat”) is represented by appropriately configuring a single dedicated unit. The state of the other units is irrelevant.
• Distributed (basic notion): The word is represented by a distinctive configuration pattern over some subset or “pool” of the available resources (see Hinton, McClelland, and Rumelhart 1986). A different word would be represented by an alternative pattern over that pool or another pool. Each unit in the pool participates in representing the word; the state of units outside the pool is irrelevant. In a sparse (dense) distributed representation, a small (large) proportion of units in the pool are configured in a non-default or “active” state (Kanerva 1988).
• Local: The limiting case of a sparse distributed representation is one in which only a single unit in the pool is active. These representations are often also referred to as “local” (e.g., Thorpe 1995). The key difference from strictly local representations is that here it matters what state the other units in the pool are in, viz., they must not be active.
• Microfeatures: Sometimes individual units are used to represent “microfeatures” of the domain in strictly local fashion. The pattern representing a given macro-level item is then determined by these microfeatural correspondences. In the example in Figure 1, individual units represent the presence of a letter at a certain spot in the word; the word “cat” is represented just in case the active units are the ones for c in the first spot, a in the second spot, and t in the third spot.
• Coarse Coding: In these schemes the (micro or macro) features of the domain represented by individual units are relatively broad, and overlapping.

The reader seeking a detailed illustration of these ideas may care to examine the well-known “verb-ending” paper of Rumelhart and McClelland (1986). In that case, verb-base and past-tense forms are represented by sparse distributed patterns over pools of units. Individual units represent microfeatures (ordered triples of phonetic features) in strictly local fashion. Because these triples overlap, the scheme is also coarse.

In a famous critique of connectionist cognitive science, Fodor and Pylyshyn (1988) argued that connectionists must either implement “classical” architectures with their traditional symbolic representations or fail to explain the alleged “systematicity” of cognition. The standard connectionist response has been to insist that they can in fact explain systematicity without merely implementing classical architectures by using distributed representations encoding complex structures in a nonconcatenative fashion (e.g., Smolensky 1991).

Implicit in this connectionist response is the idea that distributed representations and standard symbolic representations are somehow deeply different in nature. For millennia, philosophers have attempted to develop a taxonomy of representations. At the highest level, they have usually distinguished just two major kinds: the generically linguistic or symbolic, and the generically imagistic or pictorial. Is distribution just an accidental property of these more basic kinds, or do distributed representations form a third fundamental category?
Answers to questions like these obviously depend on exactly what we mean by “distributed.” The standard • Superimposition Two or more items are simultaneously approach, as exemplified in the preceding discussion, has represented by one and the same distributed pattern (Mur- been to define various notions of distribution in terms of dock 1979). For example, it is standard in feedforward structures of correspondence between the represented items connectionist networks for one and the same set of synap- and the representational resources (e.g., van Gelder 1992). tic weights to represent many associations between input This approach may be misguided; the essence of this alter- and output. native category of representation might be some other prop- • Equipotentiality In some cases, an item is represented by erty entirely. For example, Haugeland (1991) has suggested a pattern over a pool of units, and the pattern over any that whether a representation is distributed or not turns on subpool (up to some resolution limit) also suffices to rep- the nature of the knowledge it encodes. resent the item. Thus every part or aspect of the item is It has been argued that some of the most intransigent represented in superimposed fashion over the whole pool. problems confronting orthodox artificial intelligence are The standard example is the optical hologram (Leith and rooted in its commitment to representing knowledge by Uptanieks 1965); see also Plate’s “holographic reduced” means of digital symbol structures (Dreyfus 1992). If this is representations (Plate 1993). right, there must be some other form of knowledge repre- With these various distinctions on board, we can return sentation underlying human capacities. If distributed repre- to the central question: is human knowledge represented in sentation is indeed a fundamentally different form of distributed form? 
This question has been approached at a representation, it may be suited to playing this role (Hauge- number of levels, ranging from detailed neurophysiology to land 1978). pure philosophy of mind. Thus, neuroscientists have See also COGNITIVE ARCHITECTURE; COGNITIVE MODEL- debated whether the patterns of neural firing responsible for ING, CONNECTIONIST; COGNITIVE MODELING, SYMBOLIC; representing some external event are a matter of single cells CONNECTIONISM, PHILOSOPHICAL ISSUES; MENTAL REPRE- (Barlow 1972) or patterns of activity distributed over many SENTATION; NEURAL NETWORKS cells; if the latter, whether the patterns are sparse, dense, or —Tim van Gelder coarse-coded (e.g., Földiák and Young 1995). At a higher level, they have debated whether knowledge is distributed over large areas of the brain, perhaps in equipotential fash- References ion (LASHLEY 1929/1963), or whether at least some kinds of knowledge are restricted to tightly circumscribed regions Barlow, H. B. (1972). Single units and sensation. Perception 1: (Fodor 1983). 371–394. Dreyfus, H. L. (1992). What Computers Still Can’t Do: A Critique These issues have also been pursued in the context of of Artificial Reason. Cambridge MA: The MIT Press. computer-based cognitive modeling. Connectionists have Farah, M. (1994). Neuropsychological inference with an interac- paid considerable attention to the relative merits of distrib- tive brain. Behavioral and Brain Sciences 17: 43–61. uted versus local encoding in their networks. Advantages Flourens, P. (1824). Recherches Expérimentales sur les Propriétés of distribution are generally held to include greater repre- et les Fonctions du Systeme Nerveux. Paris: Grevot. sentational capacity, content addressibility, automatic gen- Fodor, J. A. (1983). The Modularity of Mind. Cambridge MA: eralization, fault tolerance, and biological plausibility. Bradford/MIT Press. Disadvantages include slow learning, catastrophic interfer- Fodor, J. A., and Z. Pylyshyn. (1988). 
Connectionism and cogni- ence (French 1992), and binding problems. tive architecture: A critical analysis. Cognition 28: 3–71. 238 Domain Specificity see Bates, Bretherton, and Snyder 1988). Other candidate Földiák, P., and M. P. Young. (1995). Sparse coding in the primate cortex. In M. A. Arbib, Ed., The Handbook of Brain Theory domains include (but are not limited to) number processing, and Neural Networks. Cambridge, MA: MIT Press, pp. 895– face perception, and spatial reasoning. The view that thought 898. is domain-specific contrasts with a long-held position that French, R. (1992). Semi-distributed representations and cata- humans are endowed with a general set of reasoning abilities strophic forgetting in connectionist networks. Connection Sci- (e.g., memory, attention, inference) that they apply to any ence 4: 365–377. cognitive task, regardless of specific content. For example, Gall, F. J., and J. G. Spurzheim. (1809/1967). Recherches sur le Jean PIAGET’s (1983) theory of cognitive development is a Systeme Nerveux. Amsterdam: Bonset. domain-general theory, according to which a child’s thought Haugeland, J. (1978). The nature and plausibility of cognitivism. at a given age can be characterized in terms of a single cog- Behavioral and Brain Sciences 1: 215–226. nitive level. In contrast, evidence for domain-specificity Haugeland, J. (1991). Representational genera. In W. Ramsey, S. P. Stich, and D. E. Rumelhart, Eds., Philosophy and Connection- comes from multiple sources, including variability in cogni- ist Theory. Hillsdale, NJ: Lawrence Erlbaum Associates, pp. tive level across domains within a given individual at a given 61–89. point in time (e.g., Gelman and Baillargeon 1983), neurop- Hinton, G. E., J. L. McClelland, and D. E. Rumelhart. (1986). Dis- sychological dissociations between domains (e.g., Baron- tributed representations. In D. E. Rumelhart and J. L. 
McClel- Cohen 1995), innate cognitive capacities in infants (Spelke land, Eds., Parallel Distributed Processing: Explorations in the 1994), evolutionary arguments (Cosmides and Tooby 1994), Microstructure of Cognition. Cambridge, MA: MIT Press, pp. ethological studies of animal learning (e.g., Marler 1991), 77–109. coherent folk theories (Gopnik and Wellman 1994), and Kanerva, P. (1988). Sparse Distributed Memory. Cambridge, MA: domain-specific performance in areas of expertise (Chase MIT Press. and Simon 1973). Lashley, K. S. (1929/1963). Brain Mechanisms and Intelligence: A Quantitative Study of Injuries to the Brain. New York: Dover. Domain-specificity is not a single, unified theory of the Leith, E. N., and J. Uptanieks. (1965). Photography by laser. Scien- mind. There are at least three distinct approaches to cogni- tific American 212(6): 24–35. tion that assume domain-specificity. These approaches in- Murdock, B. B. (1979). Convolution and correlation in perception clude modules, theories, and expertise (see Hirschfeld and and memory. In L. G. Nilsson, Ed., Perspectives on Memory Gelman 1994; Wellman and Gelman 1997). Research, pp. 609–626. Hillsdale, NJ: Lawrence Erlbaum The most powerful domain-specific approach is modu- Associates. larity theory, according to which the mind consists of “sep- Plate, T. A. (1993). Holographic recurrent networks. In C. L. Giles, arate systems [i.e., the language faculty, visual system, S. J. Hanson, and J. D. Cowan, Eds., Advances in Neural Pro- facial recognition module, etc.] with their own properties” cessing Systems 5 (NIPS92). San Mateo, CA: Morgan Kauf- (Chomsky 1988: 161). Proposals regarding modularity mann. Rumelhart, D. E., and J. L. McClelland. (1986). On learning the have varied in at least two respects: whether modularity is past tenses of English verbs. In J. L. McClelland, D. E. 
Rumel- restricted to perceptual processes or affects reasoning pro- hart, and The PDP Research Group, Eds., Parallel Distributed cesses as well, and whether modularity is innate or con- Processing: Explorations in the Microstructure of Cognition. structed. Modularity need not imply evolved innate vol. 2: Psychological and Biological Models. Cambridge, MA: modules (see Karmiloff-Smith 1992) but for most modular MIT Press, pp. 216–268. proponents it does. Nonetheless, all modularity views Smolensky, P. (1991). Connectionism, constituency, and the lan- assume domain-specificity. Chomsky’s focus was on lan- guage of thought. In B. Lower and G. Rey, Eds., Jerry Fodor guage, and more specifically SYNTAX or universal gram- and his Critics. Oxford: Blackwell. mar. Evidence for the status of syntax as a module was its Thorpe, S. (1995). Localized versus distributed representations. In innate, biologically driven character (evident in all and only M. A. Arbib, Ed., Handbook of Brain Theory and Neural Net- works. Cambridge, MA: MIT Press, pp. 549–552. humans), its neurological localization and breakdown (the van Gelder, T. J. (1990). Compositionality: A connectionist varia- selective impairment of syntactic competence in some tion on a classical theme. Cognitive Science 14: 355–384. forms of brain damage), its rapid acquisition in the face of van Gelder, T. J. (1991). What is the ‘D’ in ‘PDP’? An overview of meager environmental data (abstract syntactic categories the concept of distribution. In S. Stich, D. Rumelhart, and W. are readily acquired by young children), and the presence Ramsey, Eds., Philosophy and Connectionist Theory. Hillsdale, of critical periods and maturational timetables (see Pinker NJ: Lawrence Erlbaum Associates, pp. 33–59. 1994). van Gelder, T. J. (1992). Defining “distributed representation.” Fodor (1983) extended the logic of modules to cognitive Connection Science 4: 175–191. abilities more broadly. 
He distinguished between central log- ical processes and perceptual systems, arguing for modularity Domain Specificity of the latter. In Fodor’s analysis, modules are innately speci- fied systems that take in sensory inputs and yield necessary Cognitive abilities are domain-specific to the extent that the representations of them. The visual system as characterized mode of reasoning, structure of knowledge, and mechanisms by MARR (1982) provides a prototypical example: a system for acquiring knowledge differ in important ways across dis- that takes visual inputs and generates 2.5-dimensional repre- tinct content areas. For example, many researchers have con- sentations of objects and space. Like the visual system, by cluded that the ways in which language is learned and Fodor’s analysis, modules are innately specified, their pro- represented are distinct from the ways in which other cogni- cessing is mandatory and encapsulated, and (unlike central tive skills are learned and represented (Chomsky 1988; but knowledge and beliefs) their representational outputs are Domain Specificity 239 insensitive to revision via experience. Experience provides ways. They make different assumptions concerning what is specific inputs to modules, which yield mandatory represen- innate, the role of input, mechanisms of development, inter- tations of inputs. Certain experiential inputs may be neces- individual variability in performance, and what constitutes a sary to trigger working of the module in the first place, but the domain. For example, modular theories propose that mecha- processes by which the module arrives at its representations nisms of developmental change are biological constraints, are mandatory rather than revisable. 
theory theories propose that the relevant mechanisms are Extending Fodor, several writers have argued that certain causal-explanatory understandings, and expertise theories conceptual processes, not just perceptual ones, are modular propose that such mechanisms are information-processing (Karmiloff-Smith 1992; Sperber 1994) or supported by sys- skills. Nonetheless, they converge on the proposal that cog- tems of cognitive modules (e.g., Baron-Cohen 1995; Leslie nitive abilities are specialized to handle specific types of 1994). In these claims each module works independently, information. For critiques of domain-specificity, see Bates, achieving its own special representations. Thus, for the most Bretherton, and Snyder (1988) and Elman et al. (1996). part cognitive modules are like Fodor’s perceptual ones, See also FOLK PSYCHOLOGY; INNATENESS OF LAN- except that “perceptual processes have, as input, informa- GUAGE; LINGUISTIC UNIVERSALS AND UNIVERSAL GRAM- tion provided by sensory receptors, and as output, a concep- MAR; MODULARITY OF MIND; NAIVE SOCIOLOGY; NATIVISM tual representation categorizing the object perceived . . . —Susan A. Gelman conceptual processes have conceptual representations both as input and as output” (Sperber 1994: 40). The claim that people ordinarily construct or possess folk References theories (as distinct from scientific theories) is a controver- Baron-Cohen, S. (1995). Mindblindness. Cambridge, MA: MIT sial one. However, everyday thought may be considered Press. theory-like in its resistance to counterevidence, ontological Bates, E., I. Bretherton, and L. Snyder. (1988). From First Words to commitments, attention to domain-specific causal princi- Grammar. New York: Cambridge University Press. ples, and coherence of beliefs (Carey 1985; Gopnik and Carey, S. (1985). Conceptual Change in Childhood. Cambridge, Wellman 1994). Like modules, folk theories are also MA: MIT Press. domain-specific. 
Folk theories make use of domain-specific Chase, W. G., and H. A. Simon. (1973). The mind’s eye in chess. ontologies (e.g., a folk theory of psychology concerns men- In W. G. Chase, Ed., Visual Information Processing. New York: tal entities such as beliefs and desires, whereas a folk theory Academic Press. of physics concerns physical entities such as objects and Chi, M. T. H. (1978). Knowledge structure and memory develop- substances). Folk theories also entail domain-specific causal ment. In R. Siegler, Ed., Children’s Thinking: What Develops? Hillsdale, NJ: Erlbaum, pp. 73–96. explanations (e.g., the law of gravity is not applied to mental Chi, M. T., J. E. Hutchinson, and A. F. Robin. (1989). How infer- states). However, in contrast to modules, which are gener- ences about novel domain-related concepts can be constrained ally assumed to be innately constrained, biologically deter- by structured knowledge. Merrill-Palmer Quarterly 35: 27–62. mined, and invariant, theories are thought to undergo radical Chomsky, N. (1988). Language and Problems of Knowledge. Cam- restructuring over time, and to be informed by knowledge bridge, MA: MIT Press. and cultural beliefs. On this construal of domain-specificity, Cosmides, L., and J. Tooby. (1994). Origins of domain specificity: candidate domains include psychology (also known as the- The evolution of functional organization. In L. A. Hirschfeld ory of mind; Wellman 1990), physics (McCloskey, Wash- and S. A. Gelman, Eds., Mapping the Mind: Domain Specificity burn, and Felch 1983), and biology (Keil 1994); see in Cognition and Culture. New York: Cambridge University Wellman and Gelman (1997). Press. Elman, J. L., E. A. Bates, M. H. Johnson, A. Karmiloff-Smith, D. Domain-specificity is also apparent in the remarkable Parisi, and K. Plunkett. (1996). Rethinking Innateness: A Con- pockets of skill that people develop as a result of extensive nectionist Perspective on Development. Cambridge, MA: MIT experience. 
With enough practice at a task (e.g., playing Press. chess, gathering factual knowledge about dinosaurs), an Fodor, J. A. (1983). Modularity of Mind. Cambridge, MA: MIT individual can develop extraordinary abilities within that Press. task domain. For example, experts can achieve unusual feats Gelman, R., and R. Baillargeon. (1983). A review of some Piagetian of MEMORY, reorganize knowledge into complex hierarchi- concepts. In J. H. Flavell and E. M. Markman, Eds., Handbook cal systems, and develop complex networks of causally of Child Psychology. Vol. 3. New York: Wiley, pp. 167–230. related information (Chi, Hutchinson, and Robin 1989). Gopnik, A., and H. M. Wellman. (1994). The theory theory. In L. A. These abilities are sufficiently powerful that child experts Hirschfeld and S. A. Gelman, Eds., Mapping the Mind: Domain Specificity in Cognition and Culture. New York: Cambridge can even surpass novice adults, in contrast to the usual University Press. developmental finding of adults outperforming children Hirschfeld, L. A., and S. A. Gelman. (1994). Mapping the Mind: (e.g., Chi 1978). Importantly, EXPERTISE skills cannot be Domain Specificity in Cognition and Culture. New York: Cam- explained as individual differences in the general processing bridge University Press. talents of experts, because these achievements are limited to Karmiloff-Smith, A. (1992). Beyond Modularity. Cambridge, MA: the narrow task domain. For example, a chess expert dis- MIT Press. plays advanced memory for arrangements of pieces on a Keil, F. C. (1994). The birth and nurturance of concepts by domains: chessboard, but ordinary memory for digit strings. the origins of concepts of living things. In L. A. Hirschfeld and Modular, theory-theory, and expertise views of domain- S. A. Gelman, Eds., Mapping the Mind: Domain Specificity in specificity differ from one another in several fundamental Cognition and Culture. New York: Cambridge University Press. 
Leslie, A. M. (1994). ToMM, ToBy, and agency: Core architecture and domain specificity in cognition and culture. In L. A. Hirschfeld and S. A. Gelman, Eds., Mapping the Mind: Domain Specificity in Cognition and Culture. New York: Cambridge University Press.
Marler, P. (1991). The instinct to learn. In S. Carey and R. Gelman, Eds., The Epigenesis of Mind: Essays on Biology and Cognition. Hillsdale, NJ: Erlbaum.
Marr, D. (1982). Vision. New York: Freeman.
McCloskey, M., A. Washburn, and L. Felch. (1983). Intuitive physics: The straight-down belief and its origin. Journal of Experimental Psychology: Learning, Memory, and Cognition 9: 636–649.
Piaget, J. (1983). Piaget’s theory. In W. Kessen, Ed., Handbook of Child Psychology. Vol. 1. New York: Wiley.
Pinker, S. (1994). The Language Instinct. New York: Penguin Books.
Spelke, E. S. (1994). Initial knowledge: Six suggestions. Cognition 50: 431–445.
Sperber, D. (1994). The modularity of thought and the epidemiology of representations. In L. A. Hirschfeld and S. A. Gelman, Eds., Mapping the Mind: Domain Specificity in Cognition and Culture. New York: Cambridge University Press.
Wellman, H. M. (1990). The Child’s Theory of Mind. Cambridge, MA: MIT Press.
Wellman, H. M., and S. A. Gelman. (1997). Knowledge acquisition in foundational domains. In D. Kuhn and R. S. Siegler, Eds., Handbook of Child Psychology. Vol. 2. New York: Wiley, pp. 523–573.

Further Readings

Atran, S. (1995). Causal constraints on categories and categorical constraints on biological reasoning across cultures. In D. Sperber, D. Premack, and A. J. Premack, Eds., Causal Cognition: A Multidisciplinary Debate. New York: Oxford University Press, pp. 205–233.
Barkow, J. H., L. Cosmides, and J. Tooby, Eds. (1992). The Adapted Mind: Evolutionary Psychology and the Generation of Culture. New York: Oxford University Press.
Caramazza, A., A. Hillis, E. C. Leek, and M. Miozzo. (1994). The organization of lexical knowledge in the brain: Evidence from category- and modality-specific deficits. In L. A. Hirschfeld and S. A. Gelman, Eds., Mapping the Mind: Domain Specificity in Cognition and Culture. New York: Cambridge University Press.
Chase, W. G., and K. A. Ericsson. (1981). Skilled memory. In J. R. Anderson, Ed., Cognitive Skills and Their Acquisition. Hillsdale, NJ: Erlbaum.
Cosmides, L. (1989). The logic of social exchange: Has natural selection shaped how humans reason? Studies with the Wason selection task. Cognition 31: 187–276.
Ericsson, K. A., Ed. (1996). The Road to Excellence: The Acquisition of Expert Performance in the Arts and Sciences, Sports, and Games. Mahwah, NJ: Erlbaum.
Ericsson, K. A., and W. G. Chase. (1982). Exceptional memory. American Scientist 70: 607–615.
Gopnik, A., and A. N. Meltzoff. (1997). Words, Thoughts, and Theories. Cambridge, MA: MIT Press.
Gottlieb, G. (1991). Experiential canalization of behavioral developments: Results. Developmental Psychology 27: 35–39.
Hermer, L., and E. Spelke. (1996). Modularity and development: The case of spatial reorientation. Cognition 61: 195–232.
Hirschfeld, L. A. (1996). Race in the Making. Cambridge, MA: MIT Press.
Hirschfeld, L. A., and S. A. Gelman. (1994). Toward a topography of mind: An introduction to domain specificity. In L. A. Hirschfeld and S. A. Gelman, Eds., Mapping the Mind: Domain Specificity in Cognition and Culture. New York: Cambridge University Press.
Kaiser, M. K., M. McCloskey, and D. R. Proffitt. (1986). Development of intuitive theories of motion: Curvilinear motion in the absence of external forces. Developmental Psychology 22: 67–71.
Karmiloff-Smith, A., and B. Inhelder. (1975). If you want to get ahead, get a theory. Cognition 3: 195–211.
Marini, Z., and R. Case. (1989). Parallels in the development of preschoolers’ knowledge about their physical and social worlds. Merrill-Palmer Quarterly 35: 63–86.
Murphy, G. L., and D. L. Medin. (1985). The role of theories in conceptual coherence. Psychological Review 92: 289–316.
Sadock, J. M. (1991). Autolexical Syntax. Chicago: University of Chicago Press.
Smith, L. B. (1995). Self-organizing processes in learning to learn words: Development is not induction. In C. A. Nelson, Ed., Basic and Applied Perspectives on Learning, Cognition, and Development. Vol. 28. Mahwah, NJ: Erlbaum.
Sternberg, R. J. (1989). Domain-generality versus domain-specificity: The life and impending death of a false dichotomy. Merrill-Palmer Quarterly 35: 115–130.
Thelen, E., and L. B. Smith. (1994). A Dynamic Systems Approach to the Development of Cognition and Action. Cambridge, MA: MIT Press.
Tomasello, M. (1995). Language: Not an instinct. Cognitive Development 10: 131–156.
Turiel, E. (1989). Domain-specific social judgments and domain ambiguities. Merrill-Palmer Quarterly 35: 89–114.

Dominance in Animal Social Groups

Social dominance refers to situations in which an individual or a group controls or dictates others’ behavior primarily in competitive situations. Generally, an individual or group is said to be dominant when “a prediction is being made about the course of future interactions or the outcome of competitive situations” (Rowell 1974: 133). Criteria for assessing and assigning dominance relationships can vary from one situation to another, even for studies of conspecifics (members of the same species), and the burden is on researchers to show that their methods are suitable for the situation at hand (Bekoff 1977; Chase 1980; Lehner 1996). It is difficult to summarize available data succinctly, but generally it has been found that dominant individuals, when compared to subordinate individuals, often have more freedom of movement, have priority of access to food, gain higher-quality resting spots, enjoy favorable grooming relationships, occupy more protected parts of a group, obtain higher-quality mates, command and regulate the attention of other group members, and show greater resistance to stress and disease. Despite assertions that suggest otherwise, it is not clear how robust the relationship is between an individual’s dominance status and its lifetime reproductive success (for comparative data see Dewsbury 1982; McFarland 1982; Clutton-Brock 1988; Alcock 1993; Berger and Cunningham 1994; Altmann et al. 1996; Drickamer, Vessey, and Meikle 1996; Byers 1997; Frank 1997; Pusey, Williams, and Goodall 1997). There also can be costs associated with dominance, such that dominant individuals suffer because of stresses associated with the possibility of being overthrown by more subordinate individuals or because, while they are defending their mates, subordinates can sneak in and copulate with them (Wilson 1975; Hogstad 1987).

In practice, the concept of social dominance has proven to be ubiquitous but slippery (Rowell 1974; Bernstein 1981). Some researchers have questioned whether dominance relationships are actually recognized by the animals themselves or are constructed by the human observers. Some also question whether dominance hierarchies widely exist in nature or are due to the stresses associated with living in captivity (where much research is performed; see, for example, Rowell 1974). Others also feel that a lack of correlation between dominance in different contexts (for example, the possession of food, the acquisition or retention of a mate or a resting place) or in different locations argues against its conceptual utility (but see Hinde 1978). Nonetheless, many who have casually observed or carefully studied various animals agree that social dominance exists in similar forms and serves many of the same functions in widely diverse taxa, ranging from invertebrates to vertebrates including humans, and that dominance relationships among individuals are powerful organizing principles for animal social systems and population dynamics.

Based on, and expanding from, the classical studies of Schjelderup-Ebbe (1922) on dominance hierarchies in chickens, three basic types of hierarchies are usually recognized: (i) linear hierarchies (pecking-orders), usually in groups of fewer than ten individuals in which all paired relationships among individuals are transitive, such that if individual A dominates (>) individual B, and B > C, then A > C (wasps, bumblebees, chaffinches, turkeys, magpies, cows, ponies, coyotes, various nonhuman primates); (ii) nonlinear hierarchies in which there is at least one nontransitive relationship; and (iii) despotisms in which one individual (the alpha) in a group dominates all other individuals among whom dominance relationships are indistinguishable. Many papers concerned with historical aspects of social dominance in animals are reprinted in Schein (1975).

Although there has been little empirical experimental research done on cognitive aspects of, for example, how dominance status is recognized and represented in animals’ minds, there are preliminary data that show that some animals have and use knowledge of other individuals’ social ranks in their social interactions, and that individuals seem to agree on their ranking of others (Cheney and Seyfarth 1990; de Waal 1996; Tomasello and Call 1997). For example, when adult female vervet monkeys compete for grooming partners in their social group, individuals appear to rank one another and to agree on the rankings of the most preferred grooming partners. The understanding of dominance relationships (having and using the social knowledge needed for making evaluations and decisions) might entail constructing ordinal relationships and transitivity concerning the relationships among individuals with whom one has and has not had personal experience (Cheney and Seyfarth 1990; Tomasello and Call 1997), but the phylogenetic distribution of these skills remains to be determined. Certainly, the formation of alliances and coalitions (who to recruit, how to solicit them, who to retaliate against, how and when to intervene, with whom to reciprocate; Tomasello and Call 1997) may involve having and using knowledge of others’ social ranks, observing the outcomes of encounters between other individuals, and making deductions using this knowledge in the absence of personal experience. Social knowledge of self and others (in the absence of personal experience) may also be important in reconciliation, but detailed comparative data are scant (de Waal 1988, 1989; Harcourt and de Waal 1992; Silk, Cheney, and Seyfarth 1996).

All in all, insight and foresight (planning) seem to be important skills that are shown by a variety of nonhuman primates and nonprimates in their social encounters, but the comparative database is too small to support any general conclusions about whether individuals really do use insight and planning in their social interactions with others. Thus, broadly comparative and detailed research on the cognitive aspects of social dominance is sorely needed. These efforts will also inform other areas in the cognitive arena, including general questions about whether individuals have a theory of mind, that is, whether they make attributions about the mental states of others and use this information in their own social encounters.

See also COGNITIVE ETHOLOGY; COOPERATION AND COMPETITION; ETHOLOGY; PRIMATE COGNITION; SOCIAL COGNITION; SOCIAL COGNITION IN ANIMALS

Marc Bekoff

References

Alcock, J. (1993). Animal Behavior: An Evolutionary Approach. 5th ed. Sunderland, MA: Sinauer Associates, Inc.
Altmann, J., S. C. Alberts, S. A. Haines, J. Dubach, P. Muruthi, T. Cooter, E. Geffen, D. J. Cheeseman, R. S. Mututua, S. N. Saiyalel, R. K. Wayne, R. C. Lacy, and M. W. Bruford. (1996). Behavior predicts genetic structure in a wild primate group. Proceedings of the National Academy of Sciences 93: 5797–5801.
Bekoff, M. (1977). Quantitative studies of three areas of classical ethology: Social dominance, behavioral taxonomy, and behavioral variability. In B. A. Hazlett, Ed., Quantitative Methods in the Study of Animal Behavior. New York: Academic Press, pp. 1–46.
Berger, J., and C. Cunningham. (1994). Bison: Mating and Conservation in Small Populations. New York: Columbia University Press.
Bernstein, I. S. (1981). Dominance: The baby and the bathwater. Behavioral and Brain Sciences 4: 419–458.
Byers, J. A. (1997). American Pronghorn: Social Adaptations and the Ghost of Predators Past. Chicago: University of Chicago Press.
Chase, I. D. (1980). Social process and hierarchy formation in small groups: A comparative perspective. American Sociological Review 45: 905–924.
Cheney, D. L., and R. M. Seyfarth. (1990). How Monkeys See the World: Inside the Mind of Another Species. Chicago: University of Chicago Press.
Clutton-Brock, J. H., Ed. (1988). Reproductive Success: Studies of Individual Variation in Contrasting Breeding Systems. Chicago: University of Chicago Press.
de Waal, F. (1988). The reconciled hierarchy. In M. R. A. Chance, Ed., Social Fabrics of the Mind. Hillsdale, NJ: Erlbaum, pp. 105–136.
de Waal, F. (1989). Dominance ‘style’ and primate social organization. In V. Standen and R. A. Foley, Eds., Comparative Socioecology: The Behavioural Ecology of Humans and Other Animals. Oxford: Blackwell Scientific Publications, pp. 243–263.
de Waal, F. (1996). Good Natured: The Origins of Right and Wrong in Humans and Other Animals. Cambridge, MA: Harvard University Press.
Dewsbury, D. A. (1982). Dominance rank, copulatory behavior, and differential reproduction. Quarterly Review of Biology 57: 135–159.
Drickamer, L. C., S. H. Vessey, and D. Meikle. (1996). Animal Behavior: Mechanisms, Ecology, and Evolution. Dubuque, IA: Wm. C. Brown Publishers.
Frank, L. (1997). Evolution of genital masculinization: Why do female hyaenas have such a large ‘penis’? Trends in Ecology and Evolution 12: 58–62.
Harcourt, A. H., and F. B. M. de Waal, Eds. (1992). Coalitions and Alliances in Humans and Other Animals. New York: Oxford University Press.
Hinde, R. A. (1978). Dominance and role: Two concepts with dual meanings. Journal of Social and Biological Structure 1: 27–38.
Hogstad, O. (1987). It is expensive to be dominant. Auk 104: 333–336.
Lehner, P. N. (1996). Handbook of Ethological Methods. 2nd ed. New York: Cambridge University Press.
McFarland, D., Ed. (1982). The Oxford Companion to Animal Behavior. New York: Oxford University Press.
Pusey, A., J. Williams, and J. Goodall. (1997). The influence of dominance rank on the reproductive success of female chimpanzees. Science 277: 828–831.
Rowell, T. (1974). The concept of social dominance. Behavioral Biology 11: 131–154.
Schein, M. W., Ed. (1975). Social Hierarchy and Dominance. Stroudsburg, PA: Dowden, Hutchinson and Ross.
Schjelderup-Ebbe, T. (1922). Contributions to the social psychology of the domestic chicken. Zeitschrift für Psychologie 88: 225–252. Translated in Schein (1975).
Silk, J. B., D. L. Cheney, and R. M. Seyfarth. (1996). The form and function of post-conflict interactions between female baboons. Animal Behaviour 52: 259–268.
Tomasello, M., and T. Call. (1997). Primate Cognition. New York: Oxford University Press.
Wilson, E. O. (1975). Sociobiology: The New Synthesis. Cambridge, MA: Harvard University Press.

Dreaming

Mental activity does not cease at the onset of sleep. Current scientific evidence suggests instead that it is virtually continuous throughout sleep but that its level of intensity and its formal characteristics change as the brain changes its state with the periodic recurrence of rapid eye movement (REM) and non-REM (NREM) sleep phases.

Until the discovery of REM sleep by Eugene Aserinsky and Nathaniel Kleitman in 1953, interest in the psychology

Figure 1. The Activation-Synthesis model. Systems and synaptic model. As a result of disinhibition caused by cessation of aminergic neuronal firing, brainstem reticular systems autoactivate. Their outputs have effects including depolarization of afferent terminals causing phasic presynaptic inhibition and blockade of external stimuli, especially during the bursts of REM, and postsynaptic hyperpolarization causing tonic inhibition of motorneurons that effectively counteracts concomitant motor commands so that somatic movement is blocked. Only the oculomotor commands are read out as eye movements because these motorneurons are not inhibited. The forebrain, activated by the reticular formation and also aminergically disinhibited, receives efferent copy or corollary discharge information about somatic motor and oculomotor commands from which it may synthesize such internally generated perceptions as visual imagery and the sensation of movement, both of which typify dream mentation. The forebrain may, in turn, generate its own motor commands that help to perpetuate the process via positive feedback to the reticular formation.

whelm the mind and cause awakening. The discovery of the association of dreaming with REM sleep allowed a quite different approach.
Emphasis suddenly shifted from the attempt of dreaming was restricted to speculative accounts of its dis- to analyze the content to an attempt to explain the formal tinctive phenomenology that were linked to schematic efforts aspects of the distinctive phenomenology in terms of under- to interpret dream content. The best known example of this lying brain activity. kind of theorizing is the psychoanalytic model of Sigmund This article gives a summary of how the cellular and FREUD, which held that dream bizarreness was the result of molecular changes in the brain which distinguish waking, the mind’s effort to disguise and censor unconscious wishes NREM and REM sleep can be used to account for the con- released in sleep that in their unaltered form would over- comitant shifts in mental state that result in the shift in con- Dreaming 243 the reciprocal interaction model of brain state control first sciousness from waking to dreaming. (See also SLEEP for advanced in 1975. relevant background information.) Using microelectrode recording techniques to sample Whether subjects are aroused from sleep in a laboratory individual cell activity during natural sleep and waking in setting or awaken spontaneously at home, they give reports animal models, it has been possible to show that the neuro- of preawakening mental experience that are quite different if modulatory systems of the brain stem behave very differ- their brain state is REM than if it is non-REM. REM-sleep ently in waking and REM sleep. These differences help to dream reports are seven times longer and are far more likely account for the distinctive psychological features of dream- to describe formed sensory perceptions and vivid visual ing, especially the bizarreness and recent memory loss. Dur- images than are the reports of NREM dreams, which tend to ing waking, cells of the noradrenergic locus coeruleus and be more thoughtlike and dull. 
REM sleep reports are also far the serotonergic raphe nuclei are tonically active, but in more likely to be animated, with descriptions of walking, REM they are shut off. This means that the activated brain running, playing sports, or even flying. Finally, the REM- of REM is aminergically demodulated so that it cannot pro- sleep dream scenarios are accompanied by strong emotions cess information in the same way as it does in waking. such as anxiety, elation, and anger, all of which bear a close Dream bizarreness and dream amnesia are both the result of relationship to details of the plot. this neuromodulatory defect. Compounding this difference, These formal features of dreaming correlate well with the pontine cholinergic neurones become reciprocally acti- changes in the activation level of the brain as measured by vated in REM, and their intense phasic activity conveys eye the degree of low-voltage, high-frequency power in the movement-related information to the visual sensory and sleep electroencephalogram, and they are negatively corre- motor areas of the brain (accounting for hallucinated dream lated with high voltage, slow EEG patterns. Because the vision and movement) and to the amygdala (accounting for high-voltage, slow-wave activity of NREM sleep is most the emotion of dreams). intense and prolonged in the first half of the night, reports The specification of these neurochemical differences from awakenings performed then are more likely to show enables a three-dimensional state space model to be con- differences from REM reports than are those from the sec- structed that integrates activation level (A), input-output ond half of the night. Brain activation is therefore an easily gating (I) with the brain modulatory factor (M). This hybrid understandable determinant of dream length and visual psychophysiological construct thus updates both activation intensity. 
Dreamlike mentation may also emerge at sleep synthesis and reciprocal interaction by representing the onset when the brain activation level is just beginning to energy level (A) information source (I) and processing fall. Sleep-onset dreaming is likely to be evanescent and mode (M) of the brain mind as a single point that continu- fragmentary, with less vivid imagery, less strong emotion, ously moves through the state space as a function of the val- and a less well developed story line than in REM-sleep ues of A, I, and M. dreaming. According to AIM, dreaming is most likely to occur Collaborating with the still high activation level to pro- when activation is high, when the information source shifts duce sleep-onset dreaming is the rapidly rising threshold to from external to internal, and when the neuromodulatory external sensory stimulation. This factor allows internal balance shifts from aminergic to cholinergic. Because these stimuli to dominate the brain. In REM sleep internal stimuli shifts may occur gradually or suddenly, it is not surprising also protect the brain from external sensory influence. If the that the correlation of physiology with psychology is also stimulus level is raised to sufficiently high levels, external statistical. REM is the most highly conducive to dreaming, information can be incorporated into dream plots, but the but it can also occur at sleep onset and NREM sleep, both of critical window for such incorporation is narrow and exter- which fulfill some of the necessary physiological condi- nal stimuli more commonly interrupt dreaming by causing tions. awakening. When dreams are interrupted in this way, recall Our natural skepticism about the relevance of animal of dreaming is markedly enhanced to levels as high as 95 model data for human psychophysiology has been partially percent if the subject is aroused from REM sleep during a dispelled by recent POSITRON EMISSION TOMOGRAPHY (PET) cluster of rapid eye movements. 
The strong correlation between dreaming and REM sleep studies of the human brain, which reveal significant regional has encouraged attempts to model the brain basis of dream- changes in activation level during REM sleep compared to ing at the cellular and molecular level. The activation- waking. The subjects of these studies all reported dreams synthesis hypothesis, first put forward in 1977, ascribed after awakening from REM-sleep in the scanner. First and dreaming to activation of the brain in REM sleep by a well- foremost is activation of the pontine brain stem, the pre- specified pontine brain stem mechanism. sumed organizer of the REM-sleep brain. Second is the Such distinctive aspects of dream mentation as vivid selective activation of the limbic forebrain and paralimbic visual hallucinations, a constant sense of movement, and cortex, the supposed mediator of dream emotion. Third is the strong emotion were ascribed to internal stimulation of selective inactivation of the dorsal prefrontal cortex, a brain visual, motor, and limbic regions of the upper brain by sig- region essential to self-reflective awareness and to execu- nals of brain stem origin. The bizarreness of dream cogni- tively guided thought, judgment, and action. Both of these tion, with its characteristic instability of time, place, and cognitive functions are markedly deficient in dreaming. person, was thought to be due to the chaotic nature of the Unfortunately, imaging techniques do not have the spa- autoactivation process and to the failure of short-term mem- tial or molecular resolution necessary to confirm the neuro- ory caused by the chemical changes in REM described by modulatory hypothesis of AIM. But an extensive body of 244 Dynamic Approaches to Cognition References and Further Readings 2A Aserinsky, E., and N. Kleitman. (1953). Regularly occurring peri- ods of eye motility and concomitant phenomena during sleep. Science 118: 273–274. Foulkes, D. (1985). 
Dreaming: A Cognitive-Psychological Analy- sis. Mahwah, NJ: Erlbaum. Freud, S. (1900). The Interpretation of Dreams. Trans. J. Strachey. New York: Basic Books. Hobson, J. A. (1988). The Dreaming Brain. New York: Basic Books. Hobson, J. A. (1990). Activation, input source, and modulation: A neurocognitive model of the state of the brain-mind. In R. Bootzin, J. Kihlstrom, and D. Schacter, Eds., Sleep and Cogni- tion. Washington, DC: American Psychological Association, pp. 25–40. Hobson, J. A. (1994). The Chemistry of Conscious States. Boston: Little Brown. Hobson, J. A., and R. W. McCarley. (1977). The brain as a dream- state generator: An activation-synthesis hypothesis of the dream process. Am. J. Psychiat. 134: 1335–1348. Hobson, J. A., E. Hoffman, R. Helfand, and D. Kostner. (1987). 2B Dream bizarreness and the activation-synthesis hypothesis. Hu- man Neurobiology 6: 157–164. Llinas, R., and D. Pare. (1991). Commentary on dreaming and wakefulness. Neuroscience 44: 521–535. Solms, M. (1997). The Neuropsychology of Dreams: A Clinico- Anatomical Study. Mahwah, NJ: Erlbaum. Dynamic Approaches to Cognition The dynamical approach to cognition is a confederation of research efforts bound together by the idea that natural cog- nition is a dynamical phenomenon and best understood in dynamical terms. This contrasts with the “law of qualitative structure” (Newell and Simon 1976) governing orthodox or Figure 2a. Three-dimensional state space defined by the values for “classical” cognitive science, which holds that cognition is a brain activation (A), input source and strength (I), and mode of form of digital COMPUTATION. processing (M). It is theoretically possible for the system to be at The idea of mind as dynamical can be traced as far back any point in the state space, and an infinite number of state as David HUME, and it permeates the work of psychologists conditions is conceivable. In practice the system is normally such as Lewin and Tolman. 
The contemporary dynamical constrained to a boomerang-like path from the back upper right in approach, however, is conveniently dated from the early cy- waking (high A, I, and M), through the center in NREM (intermediate A, I, and M) to the front lower right in REM sleep bernetics era (e.g., Ashby 1952). In subsequent decades dynamical work was carried out within programs as diverse Figure 2b. (A) Movement through the state space during the sleep as ECOLOGICAL PSYCHOLOGY, synergetics, morphodynam- cycle. (B) Segments of the state space associated with some normal, ics, and neural net research. In the 1980s, three factors— pathological, and artificial conditions of the brain. growing dissatisfaction with the classical approach, devel- opments in the pure mathematics of nonlinear dynamics, human psychopharmacological data is consonant with the and increasing availability of computer hardware and soft- basic assumptions of the model. Drugs that act as aminergic ware for simulation—contributed to a flowering of dynami- agonists (or reuptake blockers) first suppress REM and cal research, particularly in connectionist form (Smolensky REM-sleep dreaming. When they are later withdrawn, a 1988). By the 1990s, it was apparent that the dynamical ap- marked and unpleasant intensification of dreaming and even proach has sufficient power, scope, and cohesion to count as psychosis may occur. If those drugs also possess anticholin- a research paradigm in its own right (Port and van Gelder ergic actions, the effects on dreaming are even more pro- 1995). nounced. Finally, and most significantly, human REM-sleep In the prototypical case, the dynamicist focuses on some dreaming is potentiated by some of the same cholinergic particular aspect of cognition and proposes an abstract dy- agonist drugs that experimentally enhance REM sleep in namical system as a model of the processes involved. The animals. 
behavior of the model is investigated using dynamical sys- See also CONSCIOUSNESS; CONSCIOUSNESS, NEUROBIOL- tems theory, often aided by simulation on digital computers. OGY OF; LIMBIC SYSTEM; NEUROTRANSMITTERS A close match between the behavior of the model and em- pirical data on the target phenomenon confirms the hypothe- —J. Allan Hobson Dynamic Approaches to Cognition 245 sis that the target is itself dynamical in nature, and that it can sicists typically set such considerations aside (Clark 1997). be understood in the same dynamical terms. Dynamicists, by contrast, tend to see cognitive processes as Consider, for example, how we make decisions. One collective achievements of brains in bodies in contexts. possibility is that in our heads there are symbols represent- Their language—dynamics—can be used to describe change ing various options and the probabilities and values of in the environment, bodily movements, and neurobiological their outcomes; our brains then crank through an ALGO- processes (e.g., Bingham 1995; Wright and Liley 1996). RITHM for determining a choice (see RATIONAL DECISION- This enables them to offer integrated accounts of cognition MAKING). But this classical approach has difficulty as a dynamical phenomenon in a dynamical world. accounting for the empirical data, partly because it cannot In classical cognitive science, symbolic representations accommodate temporal issues and other relevant factors and their algorithmic manipulations are the basic building such as affect and context. Dynamical models treat the blocks. Dynamical models usually also incorporate repre- process of DECISION-MAKING as one in which numerical sentations, but reconceive them as dynamical entities (e.g., variables evolve interactively over time. Such models, it is system states, or trajectories shaped by attractor landscapes). 
claimed, can explain a wider range of data and do so more Representations tend to be seen as transient, context- accurately (see, e.g., Busemeyer and Townsend 1993; dependent stabilities in the midst of change, rather than as Leven and Levine 1996). static, context-free, permanent units. Interestingly, some A better understanding of dynamical work can be gained dynamicists claim to have developed wholly representation- by highlighting some of its many differences with classical free models, and they conjecture that representation will turn cognitive science. Most obviously, dynamicists take cogni- out to play much less of a role in cognition than has tradi- tive agents to be dynamical systems as opposed to digital tionally been supposed (e.g., Skarda 1986; Wheeler forth- conputers. A dynamical system for current purposes is a set coming). of quantitative variables changing continually, concurrently, The differences between the dynamical and classical and interdependently over quantitative time in accordance approaches should not be exaggerated. The dynamical with dynamical laws described by some set of equations. approach stands opposed to what John Haugeland has called Hand in hand with this first commitment goes the belief that “Good Old Fashioned AI” (Haugeland 1985). However, dynamics provides the right tools for understanding cogni- dynamical systems may well be performing computation in tive processes. Dynamics in this sense includes the tradi- some other sense (e.g., analog computation or “real” com- tional practice of dynamical modeling, in which scientists putation; Blum, Shub, and Smale 1989; Siegelmann and attempt to understand natural phenomena via abstract dy- Sontag 1994). Also, dynamical systems are generally effec- namical models; such modeling makes heavy use of calcu- tively computable. (Note that something can be computable lus and differential or difference equations. It also includes without being a digital computer.) 
Thus, there is consider- dynamical systems theory, a set of concepts, proofs, and able middle ground between pure GOFAI and an equally methods for understanding the behavior of systems in gen- extreme dynamicism (van Gelder 1998). eral and dynamical systems in particular. A central insight How does the dynamical approach relate to connection- of dynamical systems theory is that behavior can be under- ism? In a word, they overlap. Connectionist networks are stood geometrically, that is, as a matter of position and generally dynamical systems, and much of the best dynami- change of position in a space of possible overall states of the cal research is connectionist in form (e.g., Beer 1995). How- system. The behavior can then be described in terms of ever, the way many connectionists structure and interpret attractors, transients, stability, coupling, bifurcations, their systems is dominated by broadly computational precon- chaos, and so forth—features largely invisible from a clas- ceptions (e.g., Rosenberg and Sejnowski 1987). Conversely, sical perspective. many dynamical models of cognition are not connectionist Dynamicists and classicists also diverge over the general networks. Connectionism is best seen as straddling a more nature of cognition and cognitive agents. The pivotal issue fundamental opposition between dynamical and classical here is probably the role of time. Although all cognitive sci- cognitive science. entists understand cognition as something that happens over Chaotic systems are a special sort of dynamical system, time, dynamicists see cognition as being in time, that is, as and chaos theory is just one branch of dynamics. So far, an essentially temporal phenomenon. This is manifested in only a small proportion of work in dynamical cognitive sci- many ways. The time variable in dynamical models is not a ence has made any serious use of chaos theory. 
Therefore mere discrete order, but a quantitative, sometimes continu- the dynamical approach should not be identified with the ous approximation to the real time of natural events. Details use of chaos theory or related notions such as fractals. Still, of timing (durations, rates, synchronies, etc.) are taken to be chaotic dynamics surely represents a frontier of fascinating essential to cognition itself rather than incidental details. possibilities for cognitive science (Garson 1996). Cognition is seen not as having a sequential cyclic (sense- The dynamical approach stands or falls on its ability to think-act) structure, but rather as a matter of continuous and deliver the best models of particular aspects of cognition. In continual coevolution. The subtlety and complexity of cog- any given case its ability to do this is a matter for debate nition is found not at a time in elaborate static structures, among the relevant specialists. Currently, many aspects of but rather in time in the flux of change itself. cognition—e.g., story comprehension—are well beyond the Dynamicists also emphasize SITUATEDNESS/EMBEDDED- reach of dynamical treatment. Nevertheless, a provisional consensus seems to be emerging that some significant range NESS. Natural cognition is always environmentally embed- of cognitive phenomena will turn out to be dynamical, and ded, corporeally embodied, and neurally “embrained.” Clas- 246 Dynamic Programming that a dynamical perspective enriches our understanding of Haken, H., and M. Stadler, Eds. (1990). Synergetics of Cognition. Berlin: Springer. cognition more generally. Horgan, T. E., and J. Tienson. (1996). Connectionism and the Phi- See also COGNITIVE MODELING, CONNECTIONIST; COM- losophy of Psychology. Cambridge, MA: MIT Press. PUTATION AND THE BRAIN; COMPUTATIONAL THEORY OF Jaeger, H. (1996). Dynamische systeme in der kognitionswissen- MIND; CONNECTIONIST APPROACHES TO LANGUAGE; NEU- schaft. Kognitionswissenschaft 5: 151–174. 
RAL NETWORKS; RULES AND REPRESENTATIONS Kelso, J. A. S. (1995). Dynamic Patterns: The Self-Organization of Brain and Behavior. Cambridge, MA: MIT Press. —Tim van Gelder Port, R., and T. J. van Gelder. (1995). Mind as Motion: Explora- tions in the Dynamics of Cognition. Cambridge, MA: MIT References Press. Sulis, W., and A. Combs, Eds. (1996). Nonlinear Dynamics in Ashby, R. (1952). Design for a Brain. London: Chapman and Hall. Human Behavior. Singapore: World Scientific. Beer, R. D. (1995). A dynamical systems perspective on agent- Thelen, E., and L. B. Smith. (1993). A Dynamics Systems Ap- environment interaction. Artificial Intelligence 72: 173–215. proach to the Development of Cognition and Action. Cam- Bingham, G. (1995). Dynamics and the problem of event recogni- bridge, MA: MIT Press. tion. In R. Port and T. van Gelder, Eds., Mind as Motion: Vallacher, R., and A. Nowak, Eds. (1993). Dynamical Systems in Explorations in the Dynamics of Cognition. Cambridge, MA: Social Psychology. New York: Academic Press. MIT Press. Wheeler, M. (Forthcoming). The Next Step: Beyond Cartesianism Blum, L., M. Shub, and S. Smale. (1989). On a theory of computa- in the Science of Cognition. Cambridge MA: MIT Press. tion and complexity over the real numbers: NP completeness, recursive functions and universal machines. Bulletin of the Dynamic Programming American Mathematical Society 21: 1–49. Busemeyer, J. R., and J. T. Townsend. (1993). Decision field theory: A dynamic-cognitive approach to decision making in Some problems can be structured into a collection of small an uncertain environment. Psychological Review 100: 432– problems, each of which can be solved on the basis of the 459. solution of some of the others. The process of working a Clark, A. (1997). Being There: Putting Brain, Body and World solution back through the subproblems in order to reach a Together Again. Cambridge MA: MIT Press. final answer is called dynamic programming. This general Garson, J. (1996). 
Cognition poised at the edge of chaos: a com- algorithmic technique is applied in a wide variety of areas, plex alternative to a symbolic mind. Philosophical Psychology from optimizing airline schedules to allocating cell-phone 9: 301–321. bandwidth to justifying typeset text. Its most common and Haugeland, J. (1985). Artificial Intelligence: The Very Idea. Cam- bridge MA: MIT Press. relevant use, however, is for PLANNING optimal paths Leven, S. J., and D. S. Levine. (1996). Multiattribute decision through state-space graphs, in order, for example, to find the making in context: A dynamic neural network methodology. best routes between cities in a map. Cognitive Science 20: 271–299. In the simplest case, consider a directed, weighted graph Newell, A., and H. Simon. (1976). Computer science as empirical < S, A, T, L>, where S is the set of nodes or “states” of the enquiry: Symbols and search. Communications of the Associa- graph, and A is a set of arcs or “actions” that may be taken tion for Computing Machinery 19: 113–126. from each state. The state that is reached by taking action a Port, R., and T. J. van Gelder. (1995). Mind as Motion: Explora- in state s is described as T(s,a); the positive length of the a tions in the Dynamics of Cognition. Cambridge, MA: MIT arc from state s is written L(s,a). Let g ∈ S be a desired goal Press. state. Given such a structure, we might want to find the Rosenberg, C. R., and T. J. Sejnowski. (1987). Parallel networks that learn to pronounce English text. Complex Systems 1. shortest path from a particular state to the goal state, or even Siegelmann, H. T., and E. D. Sontag. (1994). Analog computation to find the shortest paths from each of the states to the goal. via neural networks. Theoretical Computer Science 131: 331– In order to make it easy to follow shortest paths, we will 360. use dynamic programming to compute a distance function, Skarda, C. A. (1986). 
Explaining behavior: Bringing the brain D(s), that gives the distance from each state to the goal state. back in. Inquiry 29: 187–202. The ALGORITHM is as follows: Smolensky, P. (1988). On the proper treatment of connectionism. Behavioral and Brain Sciences 11: 1–74. D(s): = large van Gelder, T. J. (1998). The dynamical hypothesis in cognitive D(g): = 0 science. Behavioral and Brain Sciences 21: 1–14. Loop |S| times Wheeler, M. (Forthcoming). The Next Step: Beyond Cartesianism Loop for s in S in the Science of Cognition. Cambridge MA: MIT Press. D(s): = mina ∈ AL(s,a) + D(T(s,a)) Wright, J. J., and D. T. J. Liley. (1996). Dynamics of the brain at end loop global and microscopic scales: Neural networks and the EEG. end loop Behavioral and Brain Sciences 19. We start by initializing D(s) = large to be an overestimate of Further Readings the distance between s and g (except in the case of D(g), for which it is exact). Now, we want to improve iteratively the Giunti, M. (1997). Computation, Dynamics, and Cognition. New estimates of D(s). The inner loop updates the value for each York: Oxford University Press. state s to be the minimum over the outgoing arcs of L(s,a) + Gregson, R. A. M. (1988). Nonlinear Psychophysical Dynamics. D(T(s,a)); the first term is the known distance of the first arc Hillsdale, NJ: Erlbaum. Dynamic Semantics 247 and the second term is the estimated distance from the are described in excellent recent texts by Puterman (1994) resulting state to the goal. The outer loop is executed as and Bertsekas (1995). many times as there are states. See also COMPUTATION; HIDDEN MARKOV MODELS The character of this algorithm is as follows: Initially, —Leslie Pack Kaelbling only D(g) is correct. After the first iteration, D(s) is correct for all states whose shortest path to the goal is one step long. After |S| iterations, it is correct for all states. Note that if L References was uniformly 1, then all the states that are i steps from the Bellman, R. (1957). 
Dynamic Programming. Princeton, NJ: Prince- goal would have correct D values after the ith iteration; ton University Press. however, it may be possible for some state s to be one step Bertsekas, D. P. (1995). Dynamic Programming and Optimal Con- from g with a very long arc, but have a much shorter path trol, vols. 1–2. Belmont, MA: Athena Scientific. with more steps, in which case the D value after the iteration Howard, R. A. (1960). Dynamic Programming and Markov Pro- would still be an overestimate. cesses. Cambridge, MA: The MIT Press. Once D has been computed, then the optimal path can be Puterman, M. L. (1994). Markov Decision Processes. New York: described by, at any state s, choosing the action a that mini- John Wiley and Sons. mizes D(a,s). Rather than just a single plan, or trajectory of states, we actually have a policy, mapping every state to its Dynamic Semantics optimal action. A generalization of the shortest-paths problem is the problem of finding optimal policies for Markov decision The term dynamic interpretation refers to a number of ap- processes (MDPs). An MDP is a tuple < S, A, T, R>, where S proaches in formal semantics of natural language that arose and A are state and action sets, as before; T(s,a) is a stochas- in the 1980s and that distinguish themselves from the pre- tic state-transition function, mapping a state and action into ceding paradigm by viewing interpretation as an inherently a probability distribution over next states (we will write dynamic concept. The phrase dynamic semantics is used to T(s,a,s´) as the probability of landing in state s´ as a result of denote a specific implementation of this idea, which locates taking action a in state s); and R(s,a) is a reward function, the dynamic aspect in the concept of linguistic meaning describing the expected immediate utility resulting from proper. taking action a in state s. 
In the simplest case, we seek a pol- The dominant view on meaning from the origins of logi- icy π that will gain the maximum expected total reward over cally oriented semantics at the beginning of the twentieth some finite number of steps of execution, k. century until well into the 1980s is aptly summarized in the In the shortest-paths problem, we computed a distance slogan “Meaning equals truth conditions.” This formulates a function D that allowed us to derive an optimal policy static view on what MEANING is: it characterizes the mean- cheaply. In MDPs, we seek a value function V k(s), which is ing relation between sentences and the world as a descrip- the expected utility (sum of rewards) of being in state s and tive relation, which is static in the sense that, although the executing the optimal policy for k steps. This value function meaning relation itself may change over time, it does not can be derived using dynamic programming, first solving bring about a change itself. The slogan focuses on sen- the problem for the situation when there are t steps remain- tences, but derivatively the same holds for subsentential ing, then using that solution to solve the problem for the sit- expressions: their meanings consist in the contribution they uation when there are t + 1 steps remaining. If there are no make to the truth conditions of sentences, a contribution that steps remaining, then clearly V0(s) = 0 for all s. If we know is usually formalized in terms of a static relation of refer- V t–1, then we can express V(t) as ence. Interpretation is the recovery of the meaning of an utterance, and is essentially sentence-based. This static view ∑ s′ ∈ ST ( s, a, s′ )V on meaning and interpretation derives from the development t t–1 V ( s ) = max a ∈ A R ( s, a ) + ( s′ ) of formal LOGIC, and lies at the basis of the framework of Montague grammar, the first attempt to apply systematically formal semantics to natural language. 
The t-step value of state s is the maximum over all actions (we get to choose the best one) of the immediate value of the action plus the expected (t – 1)-step value of the next state. Once V has been computed, then the optimal action for state s with t steps to go is the action that was responsible for the maximum value of $V^{t}(s)$.

Solving MDPs is a kind of planning problem, because it is assumed that a model of the world, in the form of the T and R functions, is known. When the world model is not known, the solution of MDPs becomes the problem of REINFORCEMENT LEARNING, which can be thought of as stochastic dynamic programming.

The theory of dynamic programming, especially as applied to MDPs, was developed by Bellman (1957) and Howard (1960). More recent extensions and developments

Although dominant, the static view did not go unchallenged. The development of speech act theory (Austin, Searle) and work on PRESUPPOSITION (Stalnaker) and IMPLICATURE (GRICE) stressed the dynamic nature of interpretation. However, at first this just led to a division of labor between SEMANTICS and PRAGMATICS, the latter being viewed as something that works on top of the results of the former. This situation began to change in the beginning of the 1980s when people started to realize that certain empirical problems could be solved only by viewing meaning as an integrated notion that accounts for the dynamic aspects of interpretation right from the start and that is essentially concerned with DISCOURSE (or texts), and not with sentences.

A simple but illustrative example is provided by cross-sentential ANAPHORA. In a discourse such as “A man walked into the bar. He was wearing a black velvet hat,” the pronoun “he” is naturally interpreted as bound by the indefinite noun phrase “a man.” If interpretation proceeds on a sentence-by-sentence basis, this cannot be accounted for. And “delayed” interpretation, that is, linking quantified noun phrases and pronouns only when the discourse is finished, makes empirically wrong predictions in other cases, such as: “One man was sitting in the bar. He was wearing a black velvet hat.” Such examples rather suggest that interpretation has to be viewed as a dynamic process, which takes place incrementally as a discourse or text proceeds.

Further development of this idea received both an internal and an external stimulus. The main external influence came from natural language research within the context of artificial intelligence, which favored a definitely procedural view and was oriented toward units larger than sentences. The interpretation of utterances is modeled as the execution of procedures that change the state of a system as it proceeds. It took some time before this idea caught on, mainly because it seemed hard to reconcile with the core goal of formal semantics, viz., to account for logical relationships. However, the emergence of formal models within the AI paradigm, in particular the development of NONMONOTONIC LOGICS, provided the necessary link. Also, work on the semantics of programming languages turned out to be concerned with a conceptual machinery that could be applied successfully to natural language.

In the beginning of the 1980s the dynamic view on interpretation was formulated explicitly in discourse representation theory (Kamp 1981; see also Kamp and Reyle 1993) and file change semantics (Heim 1982). The work of Kamp and Heim constitutes an extension and transformation of the framework of Montague grammar. In his original paper Kamp explicitly describes his theory as an attempt to wed the static approach of the logical tradition to the procedural view of the AI paradigm. Within different settings similar ideas developed, for example within the theory of semantic syntax (Seuren 1985), and that of game theoretical semantics (Hintikka 1983).

Discourse representation theory is a dynamic theory of interpretation, not of meaning. The dynamics is located in the process of building up representational structures, so-called discourse representations. These structures are initiated by incoming utterances and added to or modified by subsequent utterances. The structures themselves are interpreted in a static way by evaluating them with respect to a suitable model. For example, anaphoric relations across sentence boundaries are analyzed as follows. A sentence containing a referential expression (such as a proper name, or a quantified term; see QUANTIFIERS) introduces a so-called discourse referent along with restrictions on its interpretation. A subsequent sentence containing an anaphoric expression (such as a pronoun) can “pick up” this referent if certain descriptive and structural conditions are met, and thus be linked to the antecedent referential expression. The semantics of the discourse representation then takes care of the coreference.

Dynamic semantics (Groenendijk and Stokhof 1991; Groenendijk, Stokhof, and Veltman 1996) takes the idea of dynamic interpretation one step further and locates the dynamics in the concept of meaning itself. The basic starting point of dynamic semantics can be formulated in a slogan: “Meaning is context-change potential.” In other words, the meaning of a sentence is the change that an utterance of it brings about. And the meanings of subsentential expressions consist in their contribution to the context-change potential of the sentences in which they occur. Unlike discourse representation theory, it tries to do away with semantic representations but assigns various expressions, such as the existential quantifier associated with indefinite noun phrases, a dynamic meaning, which allows it to extend its binding force beyond its ordinary syntactic scope.

The slogan “Meaning is context-change potential” is general in at least two respects: it does not tell us what it is that is changed, and it does not say how the change is brought about. The latter question is answered by giving analyses of concrete linguistic structures. As to the former issue, it is commonly assumed that one of the primary functions of language use is that of information exchange and that, hence, information is what is changed by an utterance. Primary focus is the information state of the hearer, but in dialogical situations that of the speaker also has to be taken into account. Depending on the empirical domain, information concerns different kinds of entities. If the subject is anaphoric relations, information is about entities which are introduced and their properties; for temporal expressions one needs information about events and their location on a time axis; in the case of default reasoning expectation patterns become relevant. In other cases (question-answer dialogues, presuppositions) “higher order” information of the speech participants about each other is also at stake.

A change in the notion of meaning brings along a change in other semantic concepts, such as entailment. In static semantics truth plays a key role in defining meaning and entailment. In dynamic semantics it becomes a limit case. The central notion here is that of support: roughly, an information state s supports a sentence Φ iff an utterance of Φ does not bring about a change in s. Entailment can then be defined as follows (alternative definitions are possible as well): Φ₁, . . . , Φₙ entails Ψ iff for every state s it holds that updating s with Φ₁, . . . , Φₙ consecutively leads to a state that supports Ψ. (Cf. van Benthem 1996 for discussion of various alternatives.)

Dynamic semantics of natural language can be seen as part of a larger enterprise: the study of how information in general is structured and exchanged. Such a study brings together results from diverse fields such as computer science, cognitive psychology, logic, linguistics, and artificial intelligence. Language is one particular means to structure and exchange information, along with others such as visual representations, databases, and so on. The dynamic viewpoint has considerable merit here, and, conversely, draws on results that have been developed with an eye to other applications.

See also CONTEXT AND POINT OF VIEW; DYNAMIC APPROACHES TO COGNITION; LOGICAL FORM IN LINGUISTICS; POSSIBLE WORLDS SEMANTICS; REFERENCE, THEORIES OF

—Martin Stokhof and Jeroen Groenendijk

References

Benthem, J. F. A. K. van. (1996). Exploring Logical Dynamics. Stanford: CSLI.
Groenendijk, J. A. G., and M. J. B. Stokhof. (1991). Dynamic Predicate Logic. Linguistics and Philosophy 14: 39–100.
Groenendijk, J. A. G., M. J. B. Stokhof, and F. J. M. M. Veltman. (1996). Coreference and modality. In S. Lappin, Ed., Handbook of Contemporary Semantic Theory. Oxford: Blackwell, pp. 179–213.
Heim, I. (1982). The Semantics of Definite and Indefinite Noun Phrases. Ph.D. diss., University of Massachusetts. (Published in 1989 by Garland, New York.)
Hintikka, J. (1983). The Game of Language. Dordrecht: Reidel.
Kamp, J. A. W. (1981). A theory of truth and semantic representation. In J. A. G. Groenendijk, T. M. V. Janssen, and M. J. B. Stokhof, Eds., Formal Methods in the Study of Language. Amsterdam: Mathematical Centre, pp. 277–322.
Kamp, J. A. W., and U. Reyle. (1993). From Discourse to Logic. Dordrecht: Kluwer.
Seuren, P. A. M. (1985). Discourse Semantics. Oxford: Blackwell.

Further Readings

Beaver, D. (1997). Presupposition. In J. F. A. K. van Benthem and A. T. M. ter Meulen, Eds., Handbook of Logic and Linguistics. Amsterdam: Elsevier, pp. 939–1008.
Benthem, J. F. A. K. van, R. M. Muskens, and A. Visser. (1997). Dynamics. In J. F. A. K. van Benthem and A. T. M. ter Meulen, Eds., Handbook of Logic and Linguistics. Amsterdam: Elsevier, pp. 587–648.
Blutner, R. (1993). Dynamic generalized quantifiers and existential sentences in natural languages. Journal of Semantics 10: 33–64.
Chierchia, G. (1995). Dynamics of Meaning. Chicago: University of Chicago Press.
Dekker, P. (1993). Existential disclosure. Linguistics and Philosophy 16: 561–588.
Groeneveld, W. (1994). Dynamic semantics and circular propositions. Journal of Philosophical Logic 23: 267–306.
Krifka, M. (1993). Focus and presupposition in dynamic interpretation. Journal of Semantics 10: 269–300.
Veltman, F. J. M. M. (1996). Defaults in update semantics. Journal of Philosophical Logic 25: 221–261.
Vermeulen, C. J. M. (1994). Incremental semantics for propositional texts. Notre Dame Journal of Formal Logic 35: 243–271.
Zeevat, H. J. (1994). Presupposition and accommodation in update semantics. Journal of Semantics 12: 379–412.

Dynamical Systems

See ARTIFICIAL LIFE; DYNAMIC APPROACHES TO COGNITION; SELF-ORGANIZING SYSTEMS

Dyslexia

Dyslexia is a developmental disorder of READING that is based on abnormal brain development. The brain changes exist from before birth and persist throughout life, although they do not usually manifest themselves clinically until the early school years, and many sufferers of this disorder compensate significantly by the time they reach adult life. The etiology of dyslexia remains unknown, but it is clear that both genetic and environmental factors play a role in its clinical manifestations.

The term dyslexia is used in the United States to refer to a developmental disorder of reading, whereas in the United Kingdom acquired disorders of reading may also be called dyslexias. Whereas dyslexia appears as an entry in ICD-9-CM for Neurologists (Neurology 1994) to represent either developmental or acquired disorders of reading, DSM-IV (Association 1994) does not have an entry for dyslexia altogether and instead has one for reading disorder. In this article only the developmental form is considered.

Some researchers have been unhappy with the term dyslexia and prefer to use developmental reading disorder instead, even though the term dyslexia means reading disorder. In DSM-IV the definition for reading disorder includes reading achievement (accuracy, speed, and/or comprehension as measured by individually administered standardized tests) that falls substantially below that expected given the individual’s chronological age, measured intelligence, and age-appropriate education. Sensory-perceptual, cognitive, psychiatric, or neurological problems may coexist with the reading disorder, but should not be sufficient to explain the reading underachievement.

Other researchers (see Shaywitz et al. 1992) do not consider that intelligence should be a factor in the diagnosis, and prefer to include all individuals with reading difficulties, even those who are frankly mentally retarded. Still others insist that the reading disorder should be the consequence of disturbances of language function (see Vellutino 1987), specifically phonological processing (see PHONOLOGY), and that a reading disorder resulting from other mechanisms not be included.

It is generally accepted that the reading disorder is often accompanied by problems with writing, arithmetic, verbal memory, and subtle motor dysfunction. Sometimes there is also coexistence of anomalous HEMISPHERIC SPECIALIZATION, ATTENTION deficits, and emotional and personality disorders. Subtle problems with oral language are also commonly seen, but the presence of more severe disturbances in oral communication gives rise to the diagnosis of developmental language impairment, albeit with associated disturbances in reading and writing.

Dyslexia is the most commonly recognized form of learning disorder. The prevalence of dyslexia is accepted by most to be in the order of 4–5 percent of the school-age population. Depending on how the condition is defined, the prevalence figures range between 1 percent and 35 percent. Although problems may persist into adulthood, no clear figures about clinical prevalence in adult age groups exist. Most studies have shown a male prevalence in excess of that for females in the range of three to four males to one female. Some of this discrepancy may relate to reporting bias (the argument is that dyslexic girls are better behaved and go unnoticed). However, even when this is taken into consideration, a significant male preponderance still remains. As with other complex behaviors, normal and abnormal, that have their origin in a genetic background with added environmental influences, the prevalence of dyslexia depends in part on its definition, severity, and sociocultural attitudes.

The most common form of dyslexia is associated with deficits in phonological processing (Morais, Luytens, and Alegria 1984; Shankweiler et al. 1995), but other varieties, some based on disturbances affecting the visual system, have also been identified (Lovegrove 1991; Stein 1994). A difficulty with processing rapidly changing stimuli, affecting at least visual and auditory functions, has also been implicated, which blurs the distinction between perceptual and cognitive mechanisms in the etiology of dyslexia (Merzenich et al. 1996a, 1996b; Tallal et al. 1995).
The temporal processing hypothesis states that sensory-perceptual temporal processing deficits impede the development of normal phonological representations in the brain, which in turn produce the reading disorder. The main idea is that reading requires knowledge of the normal sound structure of the language before appropriate sound-sight associations can be made. Sensory-perceptual temporal processing deficits may lead to difficulties representing some language sounds that require rapid processing, which in turn leads to an incomplete or ambiguous sound repertoire and consequently difficulty with reading. The main objection to this theory is based on observations such as the presence of normal reading in many deaf people, and the counterargument is that the reading disorder rather reflects a deficit in semiconscious parsing (metalinguistic) of the sound stream into phonemes, as a prerequisite for mapping the parsed elements onto visual words.

Neurophysiological and psychophysical studies have shown abnormalities in visual perception and eye movements in dyslexics (Cornelissen et al. 1991). These findings are consistent with dysfunction of the magnocellular pathway of the visual system, which among other functions deals with rapid temporal processing (Greatrex and Drasdo 1995; Livingstone et al. 1991). Psychophysical evidence indicated that language-impaired children exhibit slow processing in the auditory system, too (Tallal and Piercy 1975). More recently studies employing functional MAGNETIC RESONANCE IMAGING showed that the area involved in motion perception, MT, does not activate normally in dyslexics, an area that forms part of the magnocellular system (Eden et al. 1996). In sum, therefore, there is increasing evidence to suggest that rapid processing is impaired in dyslexics, which may help account for the phonological disorder and hence the reading disorder. The ongoing debate has to do with the question of whether the type and degree of temporal processing perceptual difficulty seen in dyslexics is sufficient for explaining their language problems (for instance, see Paulesu et al. 1996).

Dyslexia is associated with anatomic changes in the brain. The normal human brain often shows asymmetry in the planum temporale, a region concerned with language function. Normal brains that are not asymmetric in the planum temporale show two large planums, rather than two small ones or two medium-sized ones. Dyslexic brains fail to show the standard asymmetric or symmetric pattern in the planum temporale (Galaburda 1993), presumably indicating a disturbance in the development of hemispheric specialization for language (see Annett, Eglinton, and Smythe 1996).

There are also subtle changes in the lamination of the CEREBRAL CORTEX, which are focal in nature and affect the left cerebral hemisphere more than the right in most cases (Galaburda 1994). They consist mostly of displaced nests of neurons and glia in the frontal, parietal, and temporal cortex. These anomalies, called ectopias (see figure 1), reflect disturbances in neuronal migration to the cerebral cortex during fetal brain development. The fundamental cause of the migration disturbance, though suspected to be genetic, is not known (for a review of recent work on the genetics of dyslexia, see Pennington 1995).

Figure 1. Example of an ectopia found in the brain of a dyslexic. In the upper part of the photomicrograph there is an extrusion of neurons and glia into the molecular layer (uppermost layer of the cortex). This is one example of a neuronal migration anomaly.

Associated with the ectopias in the cerebral cortex are changes in the sizes of neurons of some thalamic sensory nuclei, including the visually linked lateral geniculate nucleus (LGN) (Livingstone et al. 1991) and the auditory medial geniculate nucleus (MGN; Galaburda, Menard, and Rosen 1994). These are structures close to the input channels for visual and auditory experience and are not involved in cognitive functions but rather in sensory perceptual functions. In the LGN, the neurons comprising the magnocellular layers are smaller in dyslexic than in control brains, and in the MGN there is a shift toward an excess of small neurons and a paucity of large neurons in the left hemisphere.

Ectopias have been induced in newborn rats, and the displaced neurons exhibit abnormal connections with neurons in the THALAMUS as well as with other cortical areas in the ipsilateral and contralateral cerebral hemispheres (Rosen and Galaburda 1996). This provides a possible conduit for the propagation of changes from the ectopias to the thalamus and/or vice versa. Additional research has shown that induction of cortical malformations related to ectopias leads to secondary changes in the thalamus, namely the appearance of excessive numbers of small neurons and a paucity of large neurons (Herman et al. 1997). The animals with the induced malformations also exhibit slow temporal processing involving rapidly changing sounds. There are sex differences in these findings, such that induction of cortical malformations produces both behavioral changes and changes in thalamic neuronal sizes only in treated males. Females demonstrate the anatomic changes in the cortex, but no changes in the thalamus and no abnormal slowing in auditory processing. Moreover, administration of testosterone to pregnant rat mothers in the perinatal period produces masculinization of the female offspring complete with thalamic neuronal changes (Rosen, Herman, and Galaburda 1997).

In summary, animal models for the brain changes seen in association with developmental dyslexia indicate that abnormal cortical development can lead to abnormal development of the thalamus, and that it is likely that brain areas that deal with cognitive tasks and brain areas that deal with sensory-perceptual tasks are both affected in dyslexia. Moreover, the research indicates that multiple modalities, as well as multiple stages of processing, are involved, which may limit the ability of the developing brain to compensate. On the other hand, because of the relative discreteness of the neural connections even during development, not all cortical and thalamic areas are affected, setting up the possibility for a relatively delimited form of learning disorder.

See also APHASIA; GESCHWIND; LANGUAGE IMPAIRMENT, DEVELOPMENTAL; MODELING NEUROPSYCHOLOGICAL DEFICITS; VISUAL WORD RECOGNITION; WRITING SYSTEMS

—Albert M. Galaburda

References

Annett, M., E. Eglinton, and P. Smythe. (1996). Types of dyslexia and the shift to dextrality. J. Child Psychol. Psychiat. 37: 167–180.
Association, A. P. (1994). Diagnostic and Statistical Manual of Mental Disorders—DSM-IV. 4th ed. Washington, DC: American Psychiatric Association.
Cornelissen, P., L. Bradley, S. Fowler, and J. Stein. (1991). What children see affects how they read. Dev. Med. Child Neurol. 33: 755–762.
Eden, G. F., J. W. Vanmeter, J. M. Rumsey, J. M. Maisog, R. P. Woods, and T. A. Zeffiro. (1996). Abnormal processing of visual motion in dyslexia revealed by functional brain imaging. Nature 382: 66–69.
Galaburda, A. (1993). Neuroanatomic basis of developmental dyslexia. Behavioral Neurology 11: 161–173.
Galaburda, A. (1994). Developmental dyslexia and animal studies: At the interface between cognition and neurology. Cognition 50: 133–149.
Galaburda, A. M., M. T. Menard, and G. D. Rosen. (1994). Evidence for aberrant auditory anatomy in developmental dyslexia. Proc. Natl. Acad. Sci. USA 91: 8010–8013.
Greatrex, J. C., and N. Drasdo. (1995). The magnocellular deficit hypothesis in dyslexia: A review of the reported evidence. Ophthalmic Physiol. Opt. 15: 501–506.
Herman, A. E., A. M. Galaburda, H. R. Fitch, A. R. Carter, and G. D. Rosen. (1997). Cerebral microgyria, thalamic cell size and auditory temporal processing in male and female rats. Cerebral Cortex 7: 453–464.
Livingstone, M., G. Rosen, F. Drislane, and A. Galaburda. (1991). Physiological and anatomical evidence for a magnocellular defect in developmental dyslexia. Proc. Natl. Acad. Sci. 88: 7943–7947.
Lovegrove, W. J. (1991). Is the question of the role of visual deficits as a cause of reading disabilities a closed one? Comments on Hulme. Cognitive Neuropsychology 8: 435–441.
Merzenich, M. M., W. M. Jenkins, P. Johnston, C. Schreiner, S. L. Miller, and P. Tallal. (1996a). Temporal processing deficits of language-learning impaired children ameliorated by training. Science 271: 77–80.
Merzenich, M. M., W. M. Jenkins, P. Johnston, C. Schreiner, S. L. Miller, and P. Tallal. (1996b). Temporal processing deficits of language-learning impaired children ameliorated by training. Science 271: 77–81.
Morais, J., M. Luytens, and J. Alegria. (1984). Segmentation abilities of dyslexics and normal readers. Percep. Motor Skills 58: 221–222.
Neurology, T. A. A. o. (1994). ICD-9-CM for Neurologists. 3rd ed. Minneapolis: The American Academy of Neurology.
Paulesu, E., U. Frith, M. Snowling, A. Gallagher, J. Morton, R. S. J. Frackowiak, and C. D. Frith. (1996). Is developmental dyslexia a disconnection syndrome? Evidence from PET scanning. Brain 119: 143–157.
Pennington, B. F. (1995). Genetics of learning disabilities. J. Child Neurol. 10: S69–S77.
Rosen, G. D., and A. M. Galaburda. (1996). Efferent and afferent connectivity of induced neocortical microgyria. Soc. Neurosci. Abstr. 22: 485.
Rosen, G. D., A. E. Herman, and A. M. Galaburda. (1997). MGN neuronal size distribution following induced neocortical malformations: The effect of perinatal gonadal steroids. Soc. Neurosci. Abstr. 23: 626.
Shankweiler, D., S. Crain, L. Katz, A. E. Fowler, A. M. Liberman, S. A. Brady, R. Thornton, E. Lundquist, L. Dreyer, J. M. Fletcher, K. K. Stuebing, S. E. Shaywitz, and B. A. Shaywitz. (1995). Cognitive profiles of reading-disabled children: Comparison of language skills in morphology, phonology, and syntax. Psychological Science 6: 149–156.
Shaywitz, B., J. Fletcher, J. Holahan, and S. Shaywitz. (1992). Discrepancy compared to low achievement definitions of reading disability: Results from the Connecticut longitudinal study. Journal of Learning Disabilities 25: 639–648.
Stein, J. F. (1994). Developmental dyslexia, neural timing and hemispheric lateralisation. Int. J. Psychophysiol. 18: 241–249.
Tallal, P., S. Miller, R. H. Fitch, J. F. Stein, K. McAnally, A. J. Richardson, A. J. Fawcett, C. Jacobson, and R. I. Nicholson. (1995). Dyslexia update. The Irish Journal of Psychology 16: 194–268.
Tallal, P., and M. Piercy. (1975). Developmental aphasia: The perception of brief vowels and extended stop consonants. Neuropsychologia 13: 69–74.
Vellutino, F. R. (1987). Dyslexia. Sci. Amer. 256: 34–41.

Ebbinghaus, Hermann

Hermann Ebbinghaus (1850–1909) was the first psychologist to apply experimental methods to the study of human MEMORY. His groundbreaking book summarizing his experimental work, Über das Gedächtnis, was published in 1885. The English translation appeared in 1913 as Memory: A Contribution to Experimental Psychology and is still in print and well worth reading today.

Ebbinghaus was born in Barmen, Germany, studied at the University of Bonn, and began his pioneering research on memory in Berlin in 1878. His work is notable for its many original features. In addition to performing the first experiments on memory, he provided an authoritative review of probability and statistics, an elegant command of experimental design, a mathematical model of the forgetting function, an enlightened discussion of problems of experimenter bias and demand characteristics in research, and a set of experimental results that has stood the test of time. All the experiments reported by Ebbinghaus have been replicated.

No one knows how he created his ingenious methods, although historians have speculated that his purchase of a copy of Fechner’s book (in English) (1860/1966) and his reading about psychophysical methods may have been the source of his own clever methodology (see PSYCHOPHYSICS). Ebbinghaus solved the three problems faced by all cognitive/experimental psychologists in their work: to convert unobservable mental processes into observable behavior; to measure the behavior reliably; and to show how the behavior is systematically affected by relevant factors and conditions.

Ebbinghaus solved these problems by creating long lists of nonsense syllables (ZOK, VAM, etc.) to be memorized. He hoped that using these materials would permit him to study formation of new associations with relatively homogeneous materials. He learned the lists by reciting them in time to a metronome and measuring the amount of time or the number of repetitions taken until he could recite a list perfectly. He discovered quickly that the longer the list, the more repetitions were required to effect a perfect recitation. Although this was hardly a surprising finding, Ebbinghaus plotted the exact relation between the length of the series and the amount of time (or number of repetitions) to recall it once perfectly, a measure known as trials to criterion. He then had to determine how to measure retention of the series at some later point in time. Ebbinghaus’s clever idea was to have himself relearn the list to the same criterion (of one perfect recitation); he could then obtain the savings (in time or repetitions) in relearning the series and use it as his measure of list retention. The greater the savings (the fewer trials to relearn the series), the greater is retention; conversely, if the same number of trials is needed to relearn the series as was originally required to learn it, then its forgetting was complete.

The beauty of Ebbinghaus’s relearning and savings method is that measures of retention could be obtained even when recall of the list items was absent. This is one reason Ebbinghaus preferred his objective savings technique over what he called introspective techniques, such as recall or recognition. In a sense, ten years before FREUD proposed his ideas of unconscious mentation, Ebbinghaus had already devised a method whereby they could be studied. Even if someone failed to bring information to mind consciously, the unconscious residue could be examined through his relearning and savings technique.

Ebbinghaus made many discoveries with his new methods. He obtained a relatively precise relation between number of repetitions and forgetting: For every three repetitions of a list, he saved one repetition in relearning it a week later. He also discovered the logarithmic nature of the forgetting function; great forgetting occurred soon after learning, with the rate of forgetting slowing over time. In addition, he fitted an equation to the forgetting function. He also discovered the advantage of spaced repetitions of lists to massed repetition, when he found that “38 repetitions, distributed in a certain way over the three preceding days, has just as favorable an effect as 68 repetitions made on the day just previous” (page 89).

Ebbinghaus asked the question of whether associations were only formed directly, between adjacent nonsense syllables, or whether in addition remote associations were formed between syllables that were not adjacent. Using the symbols A, B, C, D, E, F, G, and H to represent syllables in a list to be learned, he asked whether there are only associations between A and B, B and C, and so on, or whether there are also associations (albeit presumably weaker ones) between A and C, A and D, and so on. Ebbinghaus developed a clever transfer of training design to answer the question. He derived lists for relearning that had associations of varying remoteness, which can be symbolized as ACEG . . . BDFH (for one degree of remoteness) or ADG . . . BEH . . . CF for two degrees of remoteness, and so on. He discovered that he did show savings in relearning these derived lists relative to control lists (that had no associations), and he concluded that the savings were the result of remote associations. In reviewing Ebbinghaus’s work, William JAMES (1890) noted that “Dr. Ebbinghaus’s attempt is as successful as it is original, in bringing two views, which seem at first sight inaccessible to proof, to a direct and practical test, and giving the victory to one of them” (page 677). The derived list experiments might be the first case of competitive hypothesis testing between two theories in experimental psychology.

Ebbinghaus was the only subject in all of his experiments, and this fact might give rise to doubt about the results. But he was a meticulous scientist, employing LOGIC, controls, and precise techniques far ahead of his time. All his results have stood the test of time. His particular methods of studying memory were rather quickly supplanted by other techniques (the introspective techniques of recall and recognition that he had wished to avoid), but his great achievements live on. He was the pioneer in showing how complex and unconscious mental processes could be studied through objective means by careful, systematic observation. As such, he helped pave the way for modern cognitive/experimental psychology.

See also BARTLETT; EPISODIC VS. SEMANTIC MEMORY; IMPLICIT VS. EXPLICIT MEMORY; INTROSPECTION

—Henry L. Roediger

References

Ebbinghaus, H. (1964). Memory: A Contribution to Experimental Psychology. Trans. H. A. Ruger and C. E. Bussenius. New York: Dover. Original work published 1885.
Fechner, G. (1860/1966). Elements of Psychophysics. Vol. 1. H. E. Adler, D. H. Howes, and E. G. Boring, Eds. and Trans. New York: Holt, Rinehart, and Winston.
James, W. (1890). Principles of Psychology. New York: Holt.
Postman, L. (1968). Hermann Ebbinghaus. American Psychologist 23: 149–157.
Roediger, H. L. (1985). Remembering Ebbinghaus. Contemporary Psychology 30: 519–523.
Tulving, E. (1992). Ebbinghaus, Hermann. In L. R. Squire, Ed., Encyclopedia of Learning and Memory. New York: Macmillan.

Echolocation

Echolocation, a term first coined by Donald Griffin in 1944, refers to the use of sound reflections to localize objects and orient in the environment (Griffin 1958). Echolocating animals transmit acoustic signals and process information contained in the reflected signals, permitting the detection, localization and identification of objects. The use of echolocation has been documented in bats (e.g., Griffin 1958), marine mammals (e.g., Norris et al. 1961; Au 1993), some species of nocturnal birds (e.g., Griffin 1953) and to a limited extent in blind or blindfolded humans (e.g., Rice 1967). Only in bats and dolphins have specialized perceptual and neural processes for echolocation been detailed.

Acoustic signals for echolocation in bats and marine mammals are primarily in the ultrasonic range, above 20 kHz and the upper limit of human hearing. The short wavelengths of these ultrasound signals permit reflections from small objects in the environment. All bat species of the suborder Microchiroptera produce echolocation calls, either through the open mouth or through a nose-leaf, depending on the species. The signal types used by different bat species

Once an animal detects a sonar target, it must localize the object in three-dimensional space. In bats, the horizontal location of the target influences the features of the echo at the two ears, and these interaural cues permit calculation of a target’s azimuthal position in space (Shimozowa et al. 1974). Laboratory studies of target tracking along the horizontal axis in bats suggest an accuracy of approximately 1 deg (Masters et al. 1985). The vertical location of a target results in a distinctive travel path of the echo into the bat’s external ear, producing spectral changes in the returning sound that can be used to code target elevation (Grinnell and Grinnell 1965). Accuracy of vertical localization in bats is approximately 3 deg (Lawrence and Simmons 1982). The third dimension, target distance, depends on the time delay between the outgoing sound and returning echo (Hartridge 1945; Simmons 1973). Psychophysical studies of distance discrimination in FM bats report thresholds of about 1 cm, corresponding to a difference in echo arrival time of approximately 60 microseconds. Experiments that require the bat to detect a change in the distance (echo delay) of a jittering target report thresholds of less than 0.1 mm, corresponding to a temporal jitter in echo arrival time of less than 1 microsecond. Successful interception of insect prey by bats requires accuracy of only 1–2 cm (summarized in Moss and Schnitzler 1995). In marine mammals, psychophysical data show that the dolphin can discriminate a target range difference of approximately 1 cm, performance similar to that of the echolocating bat (Murchison 1980).

Many bats that use CF-FM signals are specialized to detect and process frequency and amplitude modulations in the returning echoes that are produced by fluttering insect prey.
The CF components of these signals are relatively long vary widely, but all contain some frequency modulated in duration (up to 100 ms), sufficient to encode target move- (FM) components, which are well suited to carry informa- ment from a fluttering insect over one or more wingbeat tion about the arrival time of target echoes. Constant fre- cycles. The CF-FM greater horseshoe bat, Rhinolophus fer- quency (CF) signal components are sometimes combined rumequinum, can discriminate frequency modulations in the with FM components, and these signals are well suited to returning echo of approximately 30 Hz (less than 0.5% of carry information about target movement through Doppler the bat’s 83 kHz CF signal component), and can discrimi- shifts in the returning echoes. There is evidence that species nate fluttering insect species with different echo signatures using both FM and CF signals show individual variations in (von der Emde and Schnitzler 1990). Several bat species signal structure that could facilitate identification of self- that use CF-FM signals for echolocation exhibit Doppler produced echoes (see Suga et al. 1987; Masters, Jacobs, and shift compensation behavior: the bat adjusts the frequency Simmons 1991). One species of echolocating bat of the sub- of its sonar transmission to offset a Doppler shift in the order Megachiropetera, Rosettus aegyptiacus, produces returning echo, the magnitude of which depends on the bat’s clicklike sounds with the tongue for echolocation (Novick flight velocity (Schnitzler and Henson 1980). Doppler shift 1958). The most widely studied echolocating marine mam- compensation allows the bat to isolate small amplitude and mal, the bottlenose dolphin (Tursiops truncatus), emits brief frequency modulations in sonar echoes that are produced by clicks, typically less than 50 µs in duration, with spectral fluttering insects. 
energy from 20 kHz to over 100 kHz, depending on the High-level perception by sonar has been examined in acoustic environment in which the sounds are produced (Au some bat species. Early work by Griffin et al. (1965) demon- 1993). strated that FM-bats can discriminate between mealworms In echolocating animals, detection of a sonar target and disks tossed into the air. Both mealworms and disks pre- depends on the strength of the returning echo (Griffin 1958). sented changing surface areas as they tumbled through the Large sonar targets reflecting strong echoes are detected at air, and this study suggested that FM-bats use complex echo greater distances than small sonar targets (Kick 1982; Au features to discriminate target shape. The acoustic basis for 1993). Psychophysical studies of echo detection in bats and target shape discrimination by FM-bats has been considered dolphins indicate a strong dependence of performance on in detail by Simmons and Chen (1989); however, researchers the acoustic environment. Forward and backward masking, have not yet determined whether FM bat species develop background noise level, and reverberation can all influence three-dimensional representations of objects using sonar (see sonar target detection (Au 1993; Moss and Schnitzler 1995). Moss and Schnitzler 1995). Three-dimensional recognition 254 Echolocation of fluttering insects has been reported in the greater horse- Dear, S. P., J. Fritz, T. Haresign, M. Ferragamo, and J. A. Sim- mons. (1993). Tonotopic and functional organization in the shoe bat, a species that uses a CF-FM signal for echolocation auditory cortex of the big brown bat, Eptesicus fuscus. Journal (von der Emde and Schnitzler 1990). of Neurophysiology 70: 1988–2009. Successful echolocation depends on specializations in Emde, G. v. d., and H-V. Schnitzler. (1990). Classification of in- the auditory receiver to detect and process echoes of the sects by echolocating greater horseshoe bats. 
Journal of Com- transmitted sonar signals. Central to the conclusive demon- parative Physiology A 167: 423–430. stration of echolocation in bats was data on hearing sensitiv- Griffin, D. R. (1944). Echolocation in blind men, bats and radar. ity in the ultrasonic range of the biological sonar signals Science 100: 589–590. (Griffin 1958), and subsequent research has detailed many Griffin, D. R. (1953). Acoustic orientation in the oilbird, Steator- interesting specializations for the processing of sonar ech- nis. Proc. Nat. Acad. Sci. USA 39: 884–893. oes in the auditory receiver of bats. In dolphins, studies of Griffin, D. R. (1958). Listening in the Dark. New Haven: Yale Uni- versity Press. the central auditory system have been limited, but early Griffin, D. R., J. H Friend, and F. A. Webster. (1965). Target dis- work clearly documents high frequency hearing in the ultra- crimination by the echolocation of bats. Journal of Experimen- sonic range of echolocation calls (e.g., Bullock et al. 1968). tal Zoology 158: 155–168. In some CF-FM bat species, there are specializations in Grinnell, A. D., and V. S. Grinnell. (1965). Neural correlates of the peripheral and central auditory systems for processing vertical localization by echolocating bats. Journal of Physiol- echoes in the frequency range of the CF sonar component. ogy 181: 830–851. The greater horseshoe bat, for example, adjusts the fre- Hartridge, H. (1945). Acoustic control in the flight of bats. Nature quency of its sonar emissions to receive echoes at a refer- 156: 490–494. ence frequency of approximately 83 kHz. The auditory Kick, S. A. (1982). Target-detection by the echolocating bat, Eptesi- system of this species shows a large proportion of neurons cus fuscus. Journal of Comparative Physiology A 145: 431–435. Kössl, M., and M. Vater. (1995). Cochlear structure and function devoted to processing this reference frequency, and this in bats. In R. R. Fay and A. N. 
Popper, Eds., Springer Hand- expanded representation of 83 kHz can be traced to mechan- book of Auditory Research. Hearing by Bats. Berlin: Springer- ical specializations of this bat’s cochlea (Kössl and Vater Verlag. 1995). Lawrence, B. D., and J. A. Simmons. (1982). Echolocation in bats: There are other specializations in the bat central auditory The external ear and perception of the vertical positions of tar- system for echo processing that may play a role in the per- gets. Science 218: 481–483. ception of target distance. In bat species that utilize CF-FM Masters, W. M., A. J. M. Moffat, and J. R. Simmons. (1985). Sonar signals and those that utilize FM sonar components alone, tracking of horizontally moving targets by the big brown bat, there are neurons in the midbrain, THALAMUS and cortex Eptesicus fuscus. Science 228: 1331–1333 that respond selectively to pairs of FM sounds separated by Masters, W. M., S. C. Jacobs, and J. A. Simmons. (1991). The structure of echolocation sounds used by the big brown bat a delay (e.g., Yan and Suga 1996). The pairs of FM sounds Eptesicus fuscus: Some consequences for echo processing. simulate the bat’s sonar transmissions and returning echoes, Journal of the Acoustical Society of America 89: 1402–1413. and the time delay separating the two corresponds to a par- Moss, C. F., and H.-U. Schnitzler. (1995). Behavioral studies of ticular target distance. The pulse-echo delay evoking the auditory information processing. In R. R. Fay and A. N. Pop- largest facilitated response, referred to as the best delay per, Eds., Springer Handbook of Auditory Research. Hearing (BD), is in some CF-FM bat species topographically orga- by Bats. Berlin: Springer-Verlag, pp. 87–145. nized (Suga and O’Neill 1979). Neural BD’s fall into a bio- Murchison, A. E. (1980). Maximum detetion range and range reso- logically relevant range of 2–40 ms, corresponding to target lution in echolocating bottlenose porpoise (Tursiops tuncatus). 
distances of approximately 34 to 690 cm. Such topography In R. G. Busnel and J. F. Fish, Eds., Animal Sonar Systems. has not been demonstrated in FM-bat species (e.g., Dear et New York: Plenum Press, pp. 43–70. Norris, K. S., J. W. Prescott, P. V. Asa-Dorian, and P. Perkins. al. 1993). (1961). An experimental demonstration of echolocation behav- Many specializations in behavior and central auditory ior in the porpoise, Tursiops truncatus (Montagu). Biological processing appear in echolocating animals; however, Bulletin 120: 163–176. research findings suggest that echolocation builds on the Novick, A. (1958). Orientation in palaeotropical bats. II. Megachi- neural and perceptual systems that evolved for hearing in roptera. Journal of Experimental Zoology 137: 443–462. less specialized animals. Rice, C. (1967). Human echo perception. Science 155: 656–664. See also ANIMAL COMMUNICATION; AUDITION; AUDI- Schnitzler, H.-U., and W. Henson Jr. (1980). Performance of air- TORY PHYSIOLOGY; PSYCHOPHYSICS; SPATIAL PERCEPTION; borne animal sonar systems: 1. Microchiroptera. In R. G. Bus- nel and J. F. Fish, Eds., Animal sonar systems New York: WHAT-IT’S-LIKE Plenum Press, pp. 109–181. —Cynthia F. Moss Shimozawa, T., N. Suga, P. Hendler, and S. Schuetze. (1974). Directional sensitivity of echolocation system in bats producing frequency-modulated signals. Journal of Experimental Biology References 60: 53–69. Au, W. L. (1993). The Sonar of Dolphins. New York: Springer. Simmons, J. A. (1973). The resolution of target range by echolo- Bullock, T. H., A. D. Grinnell, E. Ikenzono, K. Kameda, Y. Kat- cating bats. Journal of the Acoustical Society of America 54: suki, M. Nomoto, O. Sato, N. Suga, and K. Yanagisawa. 157–173. (1968). Electrophysiological studies of central auditory mecha- Simmons, J. A., and L. Chen. (1989). The acoustic basis for target nisms in cetaceans. Zeitschrift für Vergleichende Physiologie discrimination by FM echolocating bats. Journal of the Acous- 59: 117–316. 
tical Society of America 86: 1333–1350. Ecological Psychology 255 and especially AFFORDANCES, possibilities for effective Suga, N., H. Niwa, I. Taniguchi, and D. Margoliash. (1987). The personalized auditory cortex of the mustached bat: Adaptation action. These things are perceivable because they are speci- for echolocation. Journal of Neurophysiology 58: 643–654. fied by information available to appropriately tuned percep- Suga, N., and W. E. O’Neill. (1979). Neural axis representing tar- tual systems. The task of ecological psychology is to get range in the auditory cortex of the mustache bat. Science analyze that information, and to understand how animals 206: 351–353. regulate their encounters with the environment by taking Yan, J., and N. Suga. (1996). The midbrain creates and the thala- advantage of it. mus sharpens echo-delay tuning for the cortical representation The ecological analysis of vision begins not with the ret- of target-distance information in the mustached bat. Hearing inal image but with the optic array. At any point of observa- Research 93: 102–110. tion to which an eye might come there is already an optical structure, formed by ambient light reflected to the point Further Readings from all directions. Even a static array of such points is rich Busnel, R. G., and J. F. Fish. (1980). Animal Sonar Systems. New in information, but the transformations generated by York: Plenum Press. observer motion are richer still: they specify the layout of Dror, I. E., M. Zagaeski, and C. F. Moss. (1995). Three-dimen- the environment and the perceiver’s path of motion sional target recognition via sonar: A neural network model. uniquely. The visual system that evolved to take advantage Neural Networks 8: 143–154. of this information consists of a hierarchy of active organs: a Fay, R. R., and A. N. Popper. (1995). Springer Handbook of Audi- pair of movable eyes, each with its lens and chamber and tory Research. Hearing by Bats. Berlin: Springer. 
retina, stabilized in their orbits by the ocular muscles and set Nachtigall, P. E., and P. W. B. Moore. (1988). Animal Sonar: Pro- in a mobile head on a moving body (Gibson 1966). Note cesses and Performance. New York: Plenum Press. that the brain does not appear in this definition: the special- Pollak, G. D., and J. H. Casseday. (1989). The Neural Basis of Echolocation in Bats. Berlin: Springer. ized neural mechanisms essential to vision have been of rel- Rice, C. E., S. H. Feinstein, and R. J. Schusterman. (1965). Echo- atively little interest to ecological psychologists. Vision detection ability of the blind: Size and distance factors. Journal would be impossible without the brain, but it would be of Experimental Psychology 70: 246–251. equally impossible without the optic array and the mobile Suga, N. (1988). What does single unit analysis in the auditory organs of vision. cortex tell us about information processing in the auditory sys- Much of the early research in ecological psychology tem? In P. Rakic and W. Singer, Eds., Neurobiology of Neocor- focused on movement-produced information, which had tex. New York: Wiley. been largely neglected in other approaches to perception. Supa, M., M. Cotzin, and K. M. Dallenbach. “Facial vision,” the Kinetic occlusion, for example, occurs when nearby objects perception of obstacles by the blind. American Journal of Psy- hide (or reveal) others beyond them as a result of observer or chology 57: 133–183. object motion. In occlusion, visible elements of surface tex- ture are systematically deleted at one side of the occluding Ecological Psychology edge but not at the other. The result is a compelling impres- sion of relative depth as well as a perceptual form of object The term ecological has been used to characterize several permanence: one sees that the occluded object is going out theoretical positions in psychology, but only one—that of of sight without going out of existence (see also DEPTH PER- James J. 
GIBSON (1966, 1979) and his successors—is clearly CEPTION). Movement-produced information (including relevant to cognitive science. (The others are Barker’s deformations of shading, highlights, etc.) also plays a signif- descriptions of social behavior settings and Bronfenbren- icant role in the perception of object shape (Norman, Todd, ner’s analysis of the many contexts that influence the devel- and Phillips 1995). oping child; see ECOLOGICAL VALIDITY.) Gibson’s views— Another form of kinetic structure is optic flow, the char- and their subsequent development by other theorists—are acteristic streaming of the array produced by observer the focus of the International Society for Ecological Psy- motion. Such flows have powerful effects on posture and chology. The Society’s journal Ecological Psychology has can create vivid illusions of self-motion. Optic flow also been published since 1989; the ecologically oriented Inter- enables perceivers to determine their heading (i.e., the direc- national Conference on Event Perception and Action has tion in which they are moving), but the details of this pro- met every two years since 1981. cess are presently controversial (Cutting 1996; Warren The ecological approach rejects the cognitivist assump- 1995). For more on optic flow, see MID-LEVEL VISION. tion of the poverty of the stimulus, that is, that perceivers Looming is the rapid magnification of a sector of the must rely on MENTAL REPRESENTATIONS because they have array that occurs when an object approaches the eye or the access only to fragmentary and fallible sense-data. It also eye approaches a surface. Looming specifies impending rejects many of the conventional variables that are usually collision, and animals of many different species—humans, regarded as the objects of perception: (absolute) distance, monkeys, chickens, crabs—respond appropriately. The (absolute) size, (two-dimensional) form, etc. 
What people time remaining before collision (assuming unchanged and animals actually perceive includes the layout of the velocity) is optically specified by a variable called tau (Lee environment (the arrangement of objects and surfaces, rela- 1980). (If X is the visual angle subtended by the approach- ing object or any of its parts, the tau-function is τ (X) = X/ tive to one another and to the ground), the shapes of objects, the self (the perceiver’s own situation in and motion through [dX/dt].) A considerable body of evidence suggests that the layout), events (various types of movement and change), humans and other animals use tau-related information in 256 Ecological Psychology the control of action (e.g., Lee 1993), although the issue is muscles spanning several different joints, and capable of not closed. contracting independently of each other, can become func- Given their focus on information structures rather than tionally linked so as to perform as a single task-specific stimuli, ecological psychologists have been especially unit” (Turvey 1990: 940). Turvey and his collaborators have interested in amodal invariants available to more than one developed this concept in a series of studies of coordinated perceptual system. It is easy, for example, to match the movements. shapes of seen objects with shapes felt with the hand Other perceptual systems have been studied as well: as a (Gibson 1962)—easy not just for humans but for chim- first step toward an ecological analysis of hearing, Gaver panzees (Davenport, Rogers, and Russell 1973). Runeson (1993) has recently outlined a descriptive framework for the and Frykholm (1981) have shown that one can judge the sounds of everyday events. 
Fowler (1986) has advanced an weight of a box just as well by watching someone else lift ecological approach to SPEECH PERCEPTION, which can be it as by lifting it oneself, even if one’s view of the lifter is regarded as a special case of the perception of events (spe- just a “point-light” display. Even infants can pick up cifically, the movements of the articulatory organs). Stoffre- many types of tactile-visual and audiovisual invariants, gen and Riccio (1988) have proposed an ecological analysis matching what they see to what they hear or feel. At any of the vestibular system and related phenomena such as age, the perceived unity of environmental events—a per- motion sickness. son seen and heard as she walks by, the breaking of a Since J. J. Gibson’s death in 1979, theory development in glass that falls on the floor—depends on our sensitivity to ecological psychology has taken two principal forms. On amodal invariants. the one hand is the development of increasingly sophisti- Ecological psychologists have been among the leaders in cated formal descriptions of environmental structure and the the study of infant perception and PERCEPTUAL DEVELOP- information that specifies it (e.g., Bingham 1995); more MENT. A case in point is the discovery, cited above, that generally, of animal/environment mutuality (Turvey and infants are sensitive to amodal invariants. Another example Shaw 1995). On the other hand are various attempts to concerns infant locomotion: reinterpreting her classical broaden the enterprise, using ecological concepts to address studies of the “visual cliff,” E. J. Gibson has shown that a range of classical psychological issues. In this vein are babies’ willingness to venture onto a surface depends on Eleanor Gibson’s (1994) elaboration of her theory of devel- their perception of its affordances. 
A sharp dropoff affords opment, Walker-Andrews’s (1997) account of the percep- falling but not crawling; an undulating waterbed affords tion of emotion, my own analysis of self-perception crawling but not walking (Gibson et al. 1987); sloping sur- (Neisser 1993), and the wide-ranging theoretical work of faces afford various modes of exploration and locomotion Edward Reed. Reed’s book The Necessity of Experience (Adolph, Eppler, and Gibson 1993). (1996b) is a philosophical and political critique of the J. J. Gibson’s (1966) concept of a perceptual system (as assumptions underlying standard cognitive science; its com- opposed to a sensory modality) is particularly useful in the panion volume Encountering the World (1996a) presents study of HAPTIC PERCEPTION and dynamic touch. (The older ecological analyses of many topics in psychology. “Cogni- term tactile perception suggests a more passive form of tion,” says Reed, “is neither copying nor constructing the experience.) The haptic system includes a rich complex of world. Cognition is, instead, the process that keeps us afferent and efferent nerves as well as the skin, underlying active, changing creatures in touch with an eventful, chang- tissues, muscles, digits, and joints. This system is capable of ing world” (1996a: 13). remarkable feats: one can, for example, determine a great See also AFFORDANCES; GIBSON, JAMES JEROME; PER- deal about the length, shape, and other properties of an CEPTUAL DEVELOPMENT (unseen) rigid rod simply by wielding it with one hand. —Ulric Neisser Michael Turvey (1996) and his associates have shown that the rotational/mechanical invariants on which this form of References perception is based can be summarized in the inertia tensor Iij, which represents the moments of inertia specific to a Adolph, K. E., M. A. Eppler, and E. J. Gibson. (1993). Crawling given object rotated around a fixed point. 
Because we versus walking infants’ perception of affordances for locomo- “wield” our own limbs in much the same sense, the inertia tion over sloping surfaces. Child Development 64: 1158–1174. tensor may provide a partial basis for self-perception as Bingham, G. P. (1995). Dynamics and the problem of visual event well. “Simply put, moving one’s limbs can be considered a recognition. In R. Port and T. Van Gelder, Eds., Mind as case of dynamic touch” (Pagano and Turvey 1995: 1081). Motion: Explorations in the Dynamics of Cognition. Cam- bridge, MA: MIT Press. Ecological psychologists have also made substantial Cutting, J. E. (1996). Wayfinding from multiple sources of local contributions to the study of motor control. Effective action information in retinal flow. Journal of Experimental Psychol- requires the coordination of many simultaneously moving ogy: Human Perception and Performance 22: 1299–1313. body parts, each with its own inertia and other physical Davenport, R. K., C. M. Rogers, and I. S. Russell. (1973). Cross- attributes. That coordination must be matched to the spe- modal perception in apes. Neuropsychologia 11: 21–28. cific affordances of the immediate environment, and hence Fowler, C. A. (1986). An event approach to the study of speech cannot be achieved by any centrally programmed pattern of perception from a direct-realist perspective. Journal of Phonet- impulses. This problem has been widely recognized (see ics 14: 3–28. MOTOR CONTROL); part of the solution may be the forma- Gaver, W. W. (1993). How do we hear in the world? Explorations tion of task-specific coordinative structures. “A group of in ecological acoustics. Ecological Psychology 5: 285–313. Ecological Validity 257 Gibson, E. J. (1994). Has psychology a future? Psychological Sci- Turvey, M. T., R. E. Shaw, E. S. Reed, and W. M. Mace. (1981). ence 5: 69–76. Ecological laws of perceiving and acting: In reply to Fodor and Gibson, E. J., G. Riccio, M. A. Schmuckler, T. A. Stoffregen, D. 
Pylyshyn (1981). Cognition 9: 237–304. Rosenberg, and J. Taormina. (1987). Detection of the travers- Warren, W. H. Jr., and R. E. Shaw, Eds. (1985). Persistence and ability of surfaces by crawling and walking infants. Journal of Change: Proceedings of the First International Conference on Experimental Psychology: Human Perception and Performance Event Perception. Hillsdale, NJ: Erlbaum. 13: 533–544. Gibson, J. J. (1966). The Senses Considered as Perceptual Systems. Ecological Validity Boston: Houghton Mifflin. Gibson, J. J. (1979). The Ecological Approach to Visual Percep- tion. Boston: Houghton Mifflin. The term ecological validity refers to the extent to which Lee, D. N. (1980). The optic flow field: The foundation of vision. Philosophical Transactions of the Royal Society of London B behavior indicative of cognitive functioning sampled in one 290: 169–179. environment can be taken as characteristic of an individual’s Lee, D. N. (1993). Body-environment coupling. In U. Neisser, Ed., cognitive processes in a range of other environments. Conse- The Perceived Self. New York: Cambridge University Press, pp. quently, it is a central concern of cognitive scientists who 43–67. seek to generalize their findings to questions about “how the Neisser, U., Ed. (1993). The Perceived Self: Ecological and Inter- mind works” on the basis of behavior exhibited in specially personal Sources of Self-Knowledge. New York: Cambridge designed experimental or diagnostic settings. This concern University Press. was provocatively expressed by Urie Bronfenbrenner (1979), Norman, J. F., J. T. Todd, and F. Phillips. (1995). The perception of who complained that too much of the study of child develop- surface orientation from multiple sources of information. Per- ment depended on the study of children in strange circum- ception and Psychophysics 57: 629–636. Pagano, C. C., and M. T. Turvey. (1995). 
The inertia tensor as a basis for the perception of limb orientation. Journal of Experimental Psychology: Human Perception and Performance 21: 1070–1087.
Reed, E. S. (1996a). Encountering the World: Toward an Ecological Psychology. New York: Oxford University Press.
Reed, E. S. (1996b). The Necessity of Experience. New Haven: Yale University Press.
Runeson, S., and G. Frykholm. (1981). Visual perception of lifted weight. Journal of Experimental Psychology: Human Perception and Performance 7: 733–740.
Stoffregen, T. A., and G. E. Riccio. (1988). An ecological theory of orientation and the vestibular system. Psychological Review 95: 3–14.
Turvey, M. T. (1990). Coordination. American Psychologist 45: 938–953.
Turvey, M. T. (1996). Dynamic touch. American Psychologist 51: 1134–1152.
Turvey, M. T., and R. E. Shaw. (1995). Toward an ecological physics and a physical psychology. In R. L. Solso and D. W. Massaro, Eds., The Science of the Mind: 2001 and Beyond. New York: Oxford University Press, pp. 144–169.
Walker-Andrews, A. S. (1997). Infants’ perception of expressive behaviors: Differentiation of multimodal information. Psychological Bulletin 121: 437–456.
Warren, W. H. (1995). Self-motion: Visual perception and visual control. In W. Epstein and S. Rogers, Eds., Perception of Space and Motion. New York: Academic Press, pp. 263–325.

Further Readings

Gibson, E. J. (1982). The concept of affordances in development: The renascence of functionalism. In W. A. Collins, Ed., The Concept of Development: Minnesota Symposium on Child Development vol. 15. Hillsdale, NJ: Erlbaum, pp. 55–81.
Lee, D. N., and C. von Hofsten. (1985). Dialogue on perception and action. In W. Warren and R. Shaw, Eds., Persistence and Change. Hillsdale, NJ: Erlbaum.
Reed, E. S. (1988). James J. Gibson and the Psychology of Perception. New Haven: Yale University Press.
Reed, E., and R. Jones. (1982). Reasons for Realism: Selected Essays of James J. Gibson. Hillsdale, NJ: Erlbaum.

stances for short periods of time, in contrast with the ecologies of their everyday lives.

Discussions of the problem of ecological validity first came to prominence in cognitive research in the United States owing to the work of Egon Brunswik and Kurt Lewin, two German scholars who emigrated to the United States in the 1930s. Other important sources of ideas about ecological validity include Roger Barker (1978), whose work on the influence of social setting on behavior retains its influence to the present day, and J. J. GIBSON (1979), who argued that the crucial questions in the study of perception are to be resolved not so much by an attention to the perceiver as by the description of how the environment in particular everyday life arrangements “affords” a person perceptual information; the issue was given further prominence in Ulric Neisser’s influential Cognition and Reality in 1976.

Brunswik (1943) proposed an ECOLOGICAL PSYCHOLOGY in which psychological observations would be made by sampling widely the environments within which particular “proximal” tasks are embedded. Brunswik’s overall goal was to prevent psychology from being restricted to artificially isolated proximal or peripheral circumstances that are not representative of the “larger patterns of life.” In order to avoid this problem, he suggested that situations, or tasks, rather than people, should be considered the basic units of psychological analysis. In addition, these situations or tasks must be “carefully drawn from the universe of the requirements a person happens to face in his commerce with the physical and social environment” (p. 263). To illustrate his approach, Brunswik studied size constancy by accompanying an individual who was interrupted frequently in the course of her normal daily activities and asked to estimate the size of some object she had just been looking at. This person’s size estimates correlated highly with physical size of the objects and not with their retinal image size. This result, Brunswik claimed, “possesses a certain generality with regard to normal life conditions” (p. 265).

Lewin proposed a “psychological ecology,” as a way of “discovering what part of the physical or social world will determine, during a given period, the ‘boundary zone’ of the life space” of an individual (1943: 309). By life space, Lewin meant “the person and the psychological environment as it exists for him” (p. 306). He argued that behavior at time t is a function of the situation at time t only, and hence we must find ways to determine the properties of the lifespace “at a given time.” This requirement amounts to what ethnographers refer to as “taking the subject’s point of view.” It seeks to unite the subjective and the objective.

If one agrees that understanding psychological processes in terms of the life space of the subject is important, following the logic of Lewin’s argument, Brunswik’s approach was inadequate. His experimental procedures did not allow him to observe someone fulfilling a well-specified task in a real-life environment; rather, it amounted to making experiments happen in a nonlaboratory environment. His procedures, in Lewin’s terminology, changed the subject’s life space to fit the requirements of his predefined set of observation conditions.

Ulric Neisser (1976) also pointed out marked discontinuities between the “spatial, temporal, and intermodal continuities of real objects and events” and the objects and events characteristic of laboratory-based research as a fundamental shortcoming of cognitive psychology, going so far as to suggest, “It is almost as if ecological invalidity were a deliberate feature of the experimental design” (1976: 34).

Urie Bronfenbrenner’s (1979, 1993) advocacy of ecologically valid research has greatly influenced the study of cognitive social development (Cole and Cole 1996). There are, he writes, three conditions that ecologically valid research must fulfill: (1) maintain the integrity of the real-life situations it is designed to investigate; (2) be faithful to the larger social and cultural contexts from which the subjects come; (3) be consistent with the participants’ definition of the situation, by which he meant that the experimental manipulations and outcomes must be shown to be “perceived by the participants in a manner consistent with the conceptual definitions explicit and implicit in the research design” (1979: 35).

Note that there is a crucial difference between Lewin, Neisser, and Bronfenbrenner’s interpretations of how to conduct ecologically valid research and the procedures proposed by Brunswik. Neisser, Bronfenbrenner, and others do not propose that we carry around our laboratory task and make it happen in a lot of settings. They propose that we discover and directly observe the ways that tasks occur (or don’t occur) in nonlaboratory settings. Moreover, in Bronfenbrenner’s version of this enterprise we must also discover the equivalent of Lewin’s “life space,” for example how the task and all it involves appear to the subject.

Two decades ago, the idea that such discovery procedures are possible was quite widespread among researchers who used experimental procedures and were cognizant of questions of ecological validity. Herbert Simon was echoing common opinion when he asserted that there is

a general experimental paradigm that can be used to test the commonality of cognitive processes over a wide range of task domains. The paradigm is simple. We find two tasks that have the same formal structure (e.g., they are both tasks of multi-dimensional judgment), one of which is drawn from a social situation and the other is not. If common processes are implicated in both tasks, then we should be able to produce in each task environment phenomena that give evidence of workings of the same basic cognitive mechanisms that appear in the other. (1976: 258)

However, a variety of contemporary research indicates that the requirements for establishing ecological validity place an enormous analytical burden on cognitive scientists (see Cole 1996: ch. 8–9 for an extended treatment of the associated issues). Once we move beyond the laboratory in search of representativeness, the ability to identify tasks is markedly weakened. Failure to define the parameters of the analyst’s task or failure to insure that the task-as-discovered is the subject’s task can vitiate the enterprise. This point was made clearly by Schwartz and Taylor (1978). Their particular interest was the representativeness of standardized achievement and IQ tests, but their specification of the issues involved has broad applicability in cognitive science. They queried, “Does the test elicit the same behavior as would the same tasks embedded in a real noncontrived situation? . . . Even to speak of the same task across contexts requires a model of the structure of the task. In the absence of such a model, one does not know where the equivalence lies (p. 54).”

As Valsiner and Benigni (1986) point out, standardized cognitive experimental procedures are meant to embody closed analytic systems (the point of the experimental/test procedures being to achieve precisely this closure). Consequently, attempting to establish task equivalence in order to generalize beyond the experimental circumstances amounts to imposing a closed system on a more open behavioral system. To the degree that behavior conforms to the prescripted analytic categories, one achieves ecological validity in Brunswik’s sense. Yet a variety of research (reviewed in Cole 1996) has shown that even psychological tests and other presumably “closed system” cognitive tasks are more permeable and negotiable than analysts ordinarily take account of. Insofar as the cognitive scientist’s closed system does not capture veridically the elements of the open system it is presumed to model, experimental results systematically misrepresent the life process from which they are derived. The issue of ecological validity then becomes a question of the violence done to the phenomenon of interest owing to the analytic procedures employed (Sbordone and Long 1996).

See also BEHAVIORISM; CULTURAL VARIATION

—Michael Cole

References

Barker, R. (1978). Ecological Psychology. Stanford, CA: Stanford University Press.
Bronfenbrenner, U. (1979). The Ecology of Human Development. Cambridge, MA: Harvard University Press.
Brunswik, E. (1943). Organismic achievement and environmental probability. The Psychological Review 50: 255–272.
Cole, M. (1996). Cultural Psychology: A Once and Future Discipline. Cambridge, MA: Harvard University Press.
Cole, M., and S. R. Cole. (1996). The Development of Children. 3rd ed. New York: Freeman.
Gibson, J. J. (1979). An Ecological Approach to Visual Perception. Boston: Houghton-Mifflin.
Lewin, K. (1943). Defining the “field at a given time.” Psychological Review 50: 292–310.
Sbordone, R. J., and C. J. Long, Eds. (1996). Ecological Validity of Neuropsychological Testing. Delray Beach, FL: GR Press/St. Lucie Press, Inc.
Schwartz, J. L., and E. F. Taylor. (1978). Valid assessment of complex behavior. The Torque approach. Quarterly Newsletter of the Laboratory of Comparative Human Cognition 2: 54–58.
Simon, H. A. (1976). Discussion: Cognition and social behavior. In J. S. Carroll and J. W. Payne, Eds., Cognition and Social Behavior. Hillsdale, NJ: Erlbaum.

Economics and Cognitive Science

Economics is concerned with the equilibria reached by large systems, such as markets and whole economies. The units that contribute to these collective outcomes are the individual participants in the economy. Consequently, assumptions about individual behavior play an important role in economic theorizing, which has relied predominantly on a priori considerations and on normative assumptions about individuals and institutions.

In economics, it is assumed that every option has a subjective “utility” for the individual, a well-established position in his or her preference ordering (von Neumann and Morgenstern 1947; see RATIONAL DECISION MAKING). Because preferences are clear and stable they are expected to be invariant across normatively equivalent assessment methods (procedure invariance), and across logically equivalent ways of describing the options (description invariance). In addition, people are assumed to be good Bayesians (see BAYESIAN LEARNING and BAYESIAN NETWORKS), who hold coherent (and dynamically consistent) preferences through time. Economics also makes a number of secondary assumptions: economic agents are optimal learners, whose selfish focus is on tangible assets (e.g., consumer goods rather than goodwill), who ignore sunk costs, and who do not let good opportunities go unexploited.

Coinciding with the advent of cognitive science, Simon (1957) brought into focus the severe strain that the hypothesis of rationality put on the computing abilities of economic agents, and proposed instead to consider agents whose rationality was bounded (see BOUNDED RATIONALITY). Over the last three decades, psychologists, decision theorists, and, more recently, experimental and behavioral economists have explored people’s economic decisions in some detail. These studies have emphasized the role of information processing in people’s decisions. The evidence suggests that people often rely on intuitive heuristics that lead to non-Bayesian judgment (see JUDGMENT HEURISTICS), and that probabilities have nonlinear impact on decision (Kahneman and TVERSKY 1979; Wu and Gonzalez 1996). Preferences, moreover, appear to be formed, not merely revealed, in the elicitation process, and their formation depends on the framing of the problem, the method of elicitation, and the valuations and attitudes that these trigger.

Contrary to the assumption of utility maximization, evidence suggests that the psychological carriers of value are gains and losses, rather than final wealth (see DECISION MAKING). Because of diminishing sensitivity to greater amounts, people exhibit risk aversion for gains and risk seeking for losses (except for very low probabilities, where these can reverse). Prospects can often be framed either as gains or as losses relative to some reference point, which can trigger opposing risk attitudes and can lead to discrepant preferences with respect to the same final outcomes (Tversky and Kahneman 1986). People are also loss averse: the loss of utility associated with giving up a good is greater than the utility associated with obtaining it (Tversky and Kahneman 1991). Loss aversion yields “endowment effects,” wherein the mere possession of a good can lead to higher valuation of it than if it were not in one’s possession (Kahneman, Knetsch, and Thaler 1990), and also creates a general reluctance to trade or to depart from the status quo, because the disadvantages of departing from it loom larger than the advantages of the alternatives (Knetsch 1989; Samuelson and Zeckhauser 1988). In further violation of standard value maximization, decisional conflict can lead to a greater tendency to search for alternatives when better options are available but the decision is hard than when relatively inferior options are present and the decision is easy (Tversky and Shafir 1992).

When a multiattribute option is evaluated, in consumer choice for example, each attribute must be weighted in accord with its contribution to the option’s attractiveness. The standard economic assumption is that such evaluation of options is stable and does not depend, for example, on the method of evaluation. Behavioral research, in contrast, has shown that the weight of an attribute is enhanced by its compatibility with a required response. Compatibility effects are well known in domains such as perception and motor performance. In line with compatibility, a gamble’s potential payoff is weighted more heavily in a pricing task (where both the price and the payoff are expressed in the same monetary units) than in choice. Consistent with this is the preference reversal phenomenon (Slovic and Lichtenstein 1983), wherein subjects choose a lottery that offers a greater chance to win over another that offers a higher payoff, but then price the latter higher than the former. This pattern has been observed in numerous experiments, including one involving professional gamblers in a Las Vegas casino (Lichtenstein and Slovic 1973), and another offering the equivalent of a month’s salary to respondents in the People’s Republic of China (Kachelmeier and Shehata 1992).

People’s representation of money also systematically departs from what is commonly assumed in economics. According to the fungibility assumption, which plays a central role in theories of consumption and savings such as the life-cycle or the permanent income hypotheses, “money has no labels”; all components of a person’s wealth can be collapsed into a single sum. Contrary to this assumption, people appear to compartmentalize wealth and spending into distinct budget categories, such as savings, rent, and entertainment, and into separate mental accounts, such as current income, assets, and future income (Thaler 1985, 1992). These mental accounting schemes lead to differential marginal propensities to consume (MPC) from one’s current income (where MPC is high), current assets (where MPC is intermediate), and future income (where MPC is low). Consumption functions thus end up being overly dependent on current income, and people find themselves willing to save and borrow (at a higher interest rate) at the same time (Ausubel 1991). In addition, people often fail to ignore sunk costs (Arkes and Blumer 1985), fail to consider opportunity costs (Camerer et al. 1997), and show money illusion, wherein the nominal worth of money interferes with a representation of its real worth (Shafir, Diamond, and Tversky 1997).

Economic agents are presumed to have a good sense of their tastes and to be consistent through time. People, however, often prove weak at predicting their future tastes or at learning from past experience (Kahneman 1994), and their intertemporal choices exhibit high discount rates for future as opposed to present outcomes, yielding dynamically inconsistent preferences (Loewenstein and Thaler 1992). In further contrast with standard economic assumptions, people show concern for fairness and cooperation, even when dealing with unknown others in limited encounters, where long-term strategy and reputation are irrelevant (see, e.g., Dawes and Thaler 1988; Kahneman, Knetsch and Thaler 1986; Rabin 1993).

The foregoing partial list of empirical observations and psychological principles does not approach a unified theory comparable to that proposed by economics. The empirical evidence suggests that Homo sapiens is significantly more difficult to model than Homo economicus. Some have argued that the descriptive adequacy of the economic assumptions is unimportant as long as the theory is able to predict observed behaviors. Friedman (1953), for example, has proposed the analogy of an expert billiards player who, without knowing the relevant rules of physics or geometry, is able to play as if he did. Nonetheless, as the preceding list suggests, the tension between economics and the cognitive sciences appears to reside in the actual predictions, not only in the assumptions. Others have argued that individual errors are less important when one is ultimately interested in explaining aggregate behavior. The observed discrepancies, however, are systematic and predictable, and if the majority errs in the same direction there is no reason to expect that the discrepancies should disappear in the aggregate (Akerlof and Yellen 1985). Cognitive scientists and experimental and behavioral economists are trying better to understand and model systematic departures from standard economic theory. The aim is to bring to economics a theory populated with psychologically more realistic agents.

See also COOPERATION AND COMPETITION; RATIONAL AGENCY; RATIONAL CHOICE THEORY

—Eldar Shafir

References

Akerlof, G. A., and J. Yellen. (1985). Can small deviations from rationality make significant differences to economic equilibria? American Economic Review 75(4): 708–720.
Arkes, H. R., and C. Blumer. (1985). The psychology of sunk cost. Organizational Behavior and Human Performance 35: 129–140.
Ausubel, L. M. (1991). The failure of competition in the credit card market. American Economic Review 81: 50–81.
Camerer, C., L. Babcock, G. Loewenstein, and R. Thaler. (1997). A target income theory of labor supply: Evidence from cab drivers. Quarterly Journal of Economics 112(2).
Dawes, R. M., and R. H. Thaler. (1988). Cooperation. Journal of Economic Perspectives 2: 187–197.
Friedman, M. (1953). The methodology of positive economics. In Essays in Positive Economics. Chicago: University of Chicago Press.
Kachelmeier, S. J., and M. Shehata. (1992). Examining risk preferences under high monetary incentives: Experimental evidence from the People’s Republic of China. American Economic Review 82: 1120–1141.
Kahneman, D. (1994). New challenges to the rationality assumption. Journal of Institutional and Theoretical Economics 150 (1): 18–36.
Kahneman, D., J. L. Knetsch, and R. H. Thaler. (1986). Fairness as a constraint on profit seeking: Entitlements in the market. American Economic Review 76(4): 728–741.
Kahneman, D., J. L. Knetsch, and R. H. Thaler. (1990). Experimental tests of the endowment effect and the Coase theorem. Journal of Political Economy 98(6): 1325–1348.
Kahneman, D., and A. Tversky. (1979). Prospect theory: An analysis of decision under risk. Econometrica 47: 263–291.
Knetsch, J. L. (1989). The endowment effect and evidence of nonreversible indifference curves. American Economic Review 79: 1277–1284.
Lichtenstein, S., and P. Slovic. (1973). Response-induced reversals of preference in gambling: An extended replication in Las Vegas. Journal of Experimental Psychology 101: 16–20.
Loewenstein, G., and R. H. Thaler. (1992). Intertemporal choice. In R. H. Thaler, Ed., The Winner’s Curse: Paradoxes and Anomalies of Economic Life. New York: Free Press.
Rabin, M. (1993). Incorporating fairness into game theory and economics. American Economic Review 83: 1281–1302.
Samuelson, W., and R. Zeckhauser. (1988). Status quo bias in decision making. Journal of Risk and Uncertainty 1: 7–59.
Shafir, E., P. Diamond, and A. Tversky. (1997). Money illusion. The Quarterly Journal of Economics 112 (2): 341–374.
Simon, H. A. (1957). Models of Man. New York: Wiley.
Slovic, P., and S. Lichtenstein. (1983). Preference reversals: A broader perspective. American Economic Review 73: 596–605.
Thaler, R. H. (1985). Mental accounting and consumer choice. Marketing Science 4: 199–214.
Tversky, A., and D. Kahneman. (1986). Rational choice and the framing of decisions. Journal of Business 59(4,2): 251–278.
Tversky, A., and D. Kahneman. (1991). Loss aversion in riskless choice: A reference dependent model. Quarterly Journal of Economics (November): 1039–1061.
Tversky, A., and E. Shafir. (1992). Choice under conflict: The dynamics of deferred decision. Psychological Science 3 (6): 358–361.
von Neumann, J., and O. Morgenstern. (1947). Theory of Games and Economic Behavior. 2nd ed. Princeton, NJ: Princeton University Press.
Wu, G., and R. Gonzalez. (1996). Curvature of the probability weighting function. Management Science 42(12): 1676–1690.

Further Readings

Benartzi, S., and R. Thaler. (1995). Myopic loss aversion and the equity premium puzzle. Quarterly Journal of Economics 110 (1): 73–92.
Camerer, C. F. (1995). Individual decision making. In J. H. Kagel and A. E. Roth, Eds., Handbook of Experimental Economics. Princeton, NJ: Princeton University Press.
Heath, C., and A. Tversky. (1990). Preference and belief: Ambiguity and competence in choice under uncertainty. Journal of Risk and Uncertainty 4(1): 5–28.
Johnson, E. J., J. Hershey, J. Meszaros, and H. Kunreuther. (1993). Framing, probability distortions, and insurance decisions. Journal of Risk and Uncertainty 7: 35–51.
Kagel, J. H., and A. E. Roth, Eds. (1995). Handbook of Experimental Economics. Princeton, NJ: Princeton University Press.
Loewenstein, G., and J. Elster, Eds. (1992). Choice over Time. New York: Russell Sage Foundation.
March, J. G. (1978). Bounded rationality, ambiguity and the engineering of choice. Bell Journal of Economics 9: 587–610.
Plott, C. R. (1987). Psychology and economics. In J. Eatwell, M. Milgate, and P. Newman, Eds., The New Palgrave: A Dictionary of Economics. New York: Norton.
Rabin, M. (1998). Psychology and economics. Journal of Economic Literature.
Simon, H. A. (1978). Rationality as process and as product of thought. Journal of the American Economic Association 68: 1–16.
Thaler, R. H. (1991). Quasi Rational Economics. New York: Russell Sage Foundation.
Thaler, R. H. (1992). The Winner’s Curse: Paradoxes and Anomalies of Economic Life. New York: Free Press.
Tversky, A., and D. Kahneman. (1992). Advances in prospect theory: Cumulative representation of uncertainty. Journal of Risk and Uncertainty 5: 297–323.
Tversky, A., S. Sattath, and P. Slovic. (1988). Contingent weighting in judgment and choice. Psychological Review 95(3): 371–384.
Tversky, A., P. Slovic, and D. Kahneman. (1990). The causes of preference reversal. American Economic Review 80: 204–217.
Tversky, A., and P. Wakker. (1995). Risk attitudes and decision weights. Econometrica 63(6): 1255–1280.

Education

In its broadest sense, education spans the ways in which cultures perpetuate and develop themselves, ranging from infant-parent communications to international bureaucracies and sweeping pedagogical or maturational movements (e.g., the constructivist movement attributed to PIAGET). As a discipline of cognitive science, education is a body of theoretical and applied research that draws on most of the other cognitive science disciplines, including psychology, philosophy, computer science, linguistics, neuroscience, and anthropology. Educational research overlaps with the central part of basic cognitive psychology that considers LEARNING. Such research may be idealized as primarily either descriptive or prescriptive in nature, although many research ventures have aspects of both.

Descriptively, educational research focuses on observing human learning. Specific areas of study include expert-novice approaches, CONCEPTUAL CHANGE and misconception research, skill learning, and METACOGNITION. Expert-novice research typically explicitly contrasts the extremes of a skill to infer an individual’s changes in processes and representations. Misconception research in domain-based education, such as NAIVE PHYSICS, NAIVE MATHEMATICS, writing, and computer programming, implicitly contrasts expert knowledge with that of nonexperts; a person’s current understanding may be thought of in terms of SCHEMATA, frames, scripts, MENTAL MODELS, or analogical or metaphorical representations. Child development research often involves studying misconceptions. These constructs are used for both explanatory and predictive purposes. Research in general skill learning includes psychometric analyses of high-level aptitudes (e.g., spatial cognition), and topics such as INDUCTION, DEDUCTIVE REASONING, abduction (hypothesis generation and evaluation), experimentation, critical or coherent reasoning, CAUSAL REASONING, comprehension, and PROBLEM SOLVING. Some of these skills are analyzed into more specific skills and malskills such as heuristics, organizing principles, bugs, and reasoning fallacies (cf. JUDGMENT HEURISTICS). Increasingly, metacognition research focuses on an individual’s learning style, reflections, motivation, and belief systems. Research on learning can often be readily applied predictively (i.e., a priori). For example, Case (1985) predicted specific cognitive performance in balance-beam problem solving within defined stages of development.

Prescriptive elements of education are quite diverse. Some liken such elements to the engineering, as opposed to the science, of learning. Products of prescriptive education include modest reading modules, scientific microworlds, literacy standards, and assessment-driven curricular systems (e.g., Reif and Heller 1982; Resnick and Resnick 1992). The advent of design experiments (Brown 1992; Collins 1992) represents a kind of uneasy compromise between the rigorous control of laboratory research and the potential of greater relevance from classroom interventions.

Educational proponents of situated cognition generally highlight the notion that individuals always learn and perform within rather narrow situations or contexts, but such proponents are often reticent to offer specific pedagogical recommendations. Situated cognition variably borrows pieces of activity theories, ECOLOGICAL VALIDITY, group interaction, hermeneutic philosophies, direct perception, BEHAVIORISM, distributed cognition, cognitive psychology, and social cognition. It generally focuses on naturalistic, apprentice-oriented, artifact-laden, work-based, and even culturally exotic settings. This focus is often represented as a criticism of traditional school-based learning—even though some situated studies are run in schools (which are arguably natural in our society). Situated cognition’s critics see it as an unstructured, unfalsifiable melange with near-infinite degrees of explanatory freedom and generally vague prescriptions. Recent disputes between the situated and mainstream camps seem to center on the questions, “What is a symbol?”, “How can we separate a learner from a social situation?”, and “Is transfer of training common or rare?” (e.g., Vera and Simon 1993, and commentaries). The disputes mirror many core issues from other cognitive science disciplines, as well as questions about the goals of social science.

Several cognitive theories have descriptive, predictive, and prescriptive applications to education. For instance, the ACT-based computational models of cognition (Anderson 1993) attempt to account for past data, predict learning outcomes, and serve as the basis for an extended family of intelligent tutoring systems (ITSs). These sorts of models might incorporate proposition-based semantic networks, “adaptive” or “learning” production systems, economic or rational analyses, and representations of individual students’ strengths and weaknesses. The contrasts among various computer-based categories of learning-enhancement systems have not been sharp (Wenger 1987). These categories include ITSs, computer-aided instruction, interactive learning environments, computer coaches, and guided discovery environments. Some distinctions among these categories include (a) whether a model of student knowledge or skill is employed, (b) whether a relatively generative knowledge base for a chosen domain is involved, (c) whether feedback comes via hand-coded (or compiled) buggy rules (and lookup tables) or via the interpreted semantics of a knowledge base, and (d) whether a novel, more effective representation is introduced for a traditional one. Superior ITSs demonstrate great effectiveness relative to many forms of standard instruction, but currently have limited interactional sophistication compared to human tutoring (Merrill et al. 1992). Specific ITSs often spawn the following question from both within and without cognitive science: “Where is the intelligence, or the semantics, in this system?” Distributed cognition systems also face this question, although many proponents are unconcerned about philosophical semantics-from-syntax queries. Constraint-based and connectionist models are not yet commonly employed in educational ventures (cf. Ranney, Schank, and Diehl 1995), which seems surprising, given the efforts focused on learning in parallel distributed processing models of cognition, BAYESIAN NETWORKS, artificial neural or fuzzy networks, and the like.

As with some ITSs, cognitive science approaches to education, in general, often focus on improving students’ knowledge representations or on providing more generative or transparent representations. Many such representational systems have evolved with computational technology, particularly as graphical user interfaces supplant text-based, command-line interactions. Clickable, object-oriented interfaces have become the norm, although the complexity of such features sometimes overwhelms and inhibits learners. Most recently, the Internet and World Wide Web have spawned many research ventures, for instance, involving collaborative learning environments that include the integration of technology and curricula. However, an ongoing danger to education is the proliferation of well-funded research projects developing potentially promising technologies that, relative to the vast majority of classrooms, (a) require intolerable levels of equipment upgrades or technical and systemic support, (b) are unpalatable to classroom teachers, and (c) simply do not “scale up” to populations of nontrivial size (cf. Cuban 1989).

See also COGNITIVE ARTIFACTS; COGNITIVE DEVELOPMENT; HUMAN-COMPUTER INTERACTION

—Michael Ranney and Todd Shimoda

References

Anderson, J. R. (1993). Rules of the Mind. Hillsdale, NJ: Erlbaum.
Brown, A. L. (1992). Design experiments: Theoretical and methodological challenges in creating complex interventions in classroom settings. The Journal of the Learning Sciences 2: 141–178.
Case, R. (1985). Intellectual Development. Orlando, FL: Academic Press.
Collins, A. (1992). Toward a design science of education. In E. Scanlon and T. O’Shea, Eds., Proceedings of the NATO Advanced Research Workshop On New Directions In Advanced Educational Technology. Berlin: Springer, pp. 15–22.
Cuban, L. (1989). Neoprogressive visions and organizational realities. Harvard Educational Review 59: 217–222.
Merrill, D. C., B. J. Reiser, M. Ranney, and J. G. Trafton. (1992). Effective tutoring techniques: A comparison of human tutors and intelligent tutoring systems. The Journal of the Learning Sciences 2: 277–305.
Ranney, M., P. Schank, and C. Diehl. (1995). Competence and performance in critical reasoning: Reducing the gap by using Convince Me. Psychology Teaching Review 4: 153–166.
Reif, F., and J. Heller. (1982). Knowledge structure and problem solving in physics. Educational Psychologist 17: 102–127.
Resnick, L. and D. Resnick. (1992). Assessing the thinking curriculum: New tools for educational reform. In B. Gifford and M. O’Connor, Eds., Cognitive Approaches to Assessment. Boston: Kluwer-Nijhoff.
Vera, A. H. and H. A. Simon. (1993). Situated action: A symbolic interpretation. Cognitive Science 17: 7–48.
Wenger, E. (1987). Artificial Intelligence and Tutoring Systems: Computational and Cognitive Approaches to the Communication of Knowledge. Los Altos, CA: Morgan Kaufman.

Electric Fields

See ELECTROPHYSIOLOGY, ELECTRIC AND MAGNETIC EVOKED FIELDS

Electrophysiology, Electric and Magnetic Evoked Fields

Electric and magnetic evoked fields are generated in the brain as a consequence of the synchronized activation of neuronal networks by external stimuli. These evoked fields may be associated with sensory, motor, or cognitive events, and hence are more generally termed event-related potentials (ERPs) and event-related magnetic fields (ERFs), respectively. Both ERPs and ERFs consist of precisely timed sequences of waves or components that may be recorded noninvasively from the surface of the head to provide information about spatio-temporal patterns of brain activity associated with a wide variety of cognitive processes (Heinze, Münte, and Mangun 1994; Rugg and Coles 1995).

Electric and magnetic field recordings provide complementary information about brain function with respect to other neuroimaging methods that register changes in regional brain metabolism or blood flow, such as POSITRON EMISSION TOMOGRAPHY (PET) and functional MAGNETIC RESONANCE IMAGING (fMRI). Although PET and fMRI provide a detailed anatomical mapping of active brain regions during cognitive performance, these methods cannot track the time course of neural events with the high precision of ERP and ERF recordings. Studies that combine ERP/ERF and PET/fMRI methodologies are needed to resolve both the spatial and temporal aspects of brain activity patterns that underlie cognition.

At the level of SINGLE-NEURON RECORDING, both ERPs and ERFs are generated primarily by the flow of ionic currents across nerve cell membranes during synaptic activity. ERPs arise from summed field potentials produced by synaptic currents passing into the extracellular fluids surrounding active neurons. In contrast, ERFs are produced by the concentrated intracellular flow of synaptic currents through elongated neuronal processes such as dendrites, which gives rise to concentric magnetic fields surrounding the cells. When a sufficiently large number of neurons having a similar anatomical position and orientation are synchronously activated, their summed fields may be strong enough to be detectable as ERPs or ERFs at the surface of the head. The detailed study of scalp-recorded ERPs became possible in the 1960s following the advent of digital signal-averaging computers, whereas analysis of ERFs required the further development in the 1980s of highly sensitive, multichannel magnetic field sensors (Regan 1989).

The anatomical locations of the neuronal populations that generate ERPs and ERFs may be estimated on the basis of their surface field configurations. This requires application of algorithms and models that take into account the geometry of the generator neurons and the physical properties of the biological tissues. Active neural networks may be localized in the brain more readily by means of ERF than by surface ERP recordings, because magnetic fields pass through the brain, skull, and scalp without distortion, whereas ERPs are attenuated by the resistivity of intervening tissues.

The processing of sensory information in different modalities is associated with characteristic sequences of surface-recorded ERP/ERF components (figure 1). In each modality, components at specific latencies represent evoked activity in subcortical sensory pathways and in primary and secondary receiving areas of the CEREBRAL CORTEX. Cortical components with latencies of 50–250 msec have been associated with perception of specific classes of stimuli (Allison et al. 1994) and with short-term sensory memory processes. Altered sensory experience (e.g., congenital deafness, blindness, or limb amputation) produces marked changes in ERP/ERF configurations that reflect the NEURAL PLASTICITY and functional reorganization of cortical sensory systems (Neville 1995).

Figure 1. The characteristic time-voltage waveform of the auditory ERP in response to a brief stimulus such as a click or a tone. To extract the ERP from the ongoing noise of the electroencephalogram, it is necessary to signal average the time-locked waves over many stimulus presentations. The individual waves or components of the ERP are triggered at specific time delays or latencies after the stimulus (note logarithmic time scale). The earliest waves (I–VI) are generated in the auditory brainstem pathways, while subsequent negative (N) and positive (P) waves are generated in different subregions of primary and secondary auditory cortex. (From Hillyard, S. A. (1993). Electrical and magnetic brain recordings: contributions to cognitive neuroscience. Current Opinion in Neurobiology 3: 217–224.)

Recordings of ERPs and ERFs have revealed both the timing and anatomical substrates of selective ATTENTION operations in the human brain (Näätänen 1992; Hillyard et al. 1995). In dichotic listening tasks, paying attention to sounds in one ear while ignoring sounds in the opposite ear produces a marked enhancement of short-latency ERP/ERF components to attended-ear sounds in auditory cortex. This selective modulation of attended versus unattended inputs during AUDITORY ATTENTION begins as early as 20–50 msec poststimulus, which provides strong support for theories of attention that postulate an “early selection” of stimuli prior to full perceptual analysis.

Both ERP and ERF data have been used suc-
In visual attention tasks, stimuli cessfully to reveal the timing of mental operations with a presented at attended locations in the visual field elicit high degree of precision (of the order of milliseconds) and enlarged ERP/ERF components in secondary (extrastriate) to localize brain regions that are active during sensory and cortical areas as early as 80–100 msec poststimulus. This perceptual processing, selective attention and discrimina- suggests that visual attention involves a sensory gain control tion, memory storage and retrieval, and language compre- or amplification mechanism that selectively modulates the hension (Hillyard 1993). flow of information through extrastriate cortex. Paying 264 Electrophysiology, Electric and Magnetic Evoked Fields attention to nonspatial features such as color or shape is Gazzaniga, Ed., The Cognitive Neurosciences. Cambridge, MA: MIT Press, pp. 665–681. manifested by longer latency components that index the Kutas, M., and C. K. Van Petten. (1994). Psycholinguistics electri- time course of feature analyses in different visual-cortical fied: Event-related brain potential investigations. In M. Gerns- areas. bacher, Ed., Handbook of Psycholinguistics. New York: ERPs and ERFs provide a converging source of data Academic Press, pp. 83–143. about the timing and organization of information processing McCarthy, G., A. C. Nobre, S. Bentin, and D. D. Spence. (1995). stages that intervene between a stimulus and a discrimina- Language-related field potentials in the anterior-medial tempo- tive response. Whereas short-latency components demarcate ral lobe: 1. Intracranial distribution and neural generators. Jour- the timing of early sensory feature analyses, longer latency nal of Neuroscience 15: 1080–1089. components (“N200” and “P300” waves) are closely cou- Näätänen, R. (1992). Attention and Brain Function. Hillsdale, NJ: pled with processes of perceptual discrimination, OBJECT Erlbaum. Neville, H. (1995). 
Developmental specificity in neurocognitive RECOGNITION, and classification. ERP components gener- development in humans. In M. S. Gazzaniga, Ed., The Cogni- ated in motor cortex index the timing of response selection tive Neurosciences. Cambridge, MA: MIT Press, pp. 219–234. and MOTOR CONTROL processes. Studies using these ERP Regan, D. (1989). Human Brain Electrophysiology. New York: measures have provided strong support for “cascade” theo- Elsevier. ries that posit a continuous flow of partially analyzed infor- Rugg, M. D. (1995). ERP studies of memory. In M. D. Rugg and mation between successive processing stages during M. G. H. Coles, Eds., Electrophysiology of Mind: Event- sensory-motor tasks (Coles et al. 1995). Related Brain Potentials and Cognition. Oxford: Oxford Uni- Long-latency ERPs have been linked with MEMORY versity Press, pp. 132–170. encoding, updating, and retrieval processes (Rugg 1995). Rugg, M. D., and M. G. H. Coles, Eds. (1995). Electrophysiology ERPs elicited during LEARNING can reliably predict accu- of Mind: Event-Related Brain Potentials and Cognition. Oxford: Oxford University Press. racy of recall or recognition on subsequent testing. Some components appear to index conscious recognition of previ- Further Readings ously learned items, whereas others are sensitive to contex- tual priming effects. These memory-related components Coles, M. G. H., E. Donchin, and S. W. Porges, Eds. (1986). Psy- have been recorded both from the scalp surface and from chophysiology: Systems, Processes, and Applications. Vol. 1, implanted electrodes in hippocampus and adjacent temporal Systems. New York: Guilford Press. lobe structures in neurosurgical patients. Donchin, E., Ed. (1984). Cognitive Psychophysiology. Hillsdale, ERP and ERF recordings are also being used effectively NJ: Erlbaum. to investigate the NEURAL BASIS OF LANGUAGE, including Gaillard, A. W. K., and W. Ritter, Eds. (1983). 
Tutorials in ERP phonetic, lexical, syntactic, and semantic levels of process- Research: Endogenous Components. Amsterdam: North- Holland. ing (Kutas and Van Petten 1994). Alterations in specific Hämäläinen, M., R. Hari, R. J. Ilmoniemi, J. Knuutila, and O. V. ERP/ERF components have been linked to syndromes of Lounasmaa. (1993). Magnetoencephalography: Theory, LANGUAGE IMPAIRMENT. A late negative ERP (“N400”) instrumentation, and applications to noninvasive studies of provides a graded, on-line measure of word expectancy and the working human brain. Reviews of Modern Physics 65: semantic priming during sentence comprehension. Studies 413–497. of N400 have contributed to understanding the organization Hillyard, S. A., L. Anllo-Vento, V. P. Clark, H. J. Heinze, S. J. of semantic networks in the brain (McCarthy et al. 1995). Luck, and G. R. Mangun. (1996). Neuroimaging approaches to See also ATTENTION IN THE HUMAN BRAIN; NEURAL the study of visual attention: A tutorial. In M. Coles, A. Kramer, and G. Logan, Eds., Converging Operations in the PLASTICITY Study of Visual Selective Attention. Washington, DC: American —Steven A. Hillyard Psychological Association, pp. 107–138. Hillyard, S., and T. W. Picton. (1987). Electrophysiology of cogni- tion. In F. Plum, Ed., Handbook of Physiology Section 1: The References Nervous System. Vol. 5, Higher Functions of the Brain. Bethesda: American Physiological Society, pp. 519–584. Allison, T., G. McCarthy, A. C. Nobre, A. Puce, and A. Belger. John, E. R., T. Harmony, L. Prichep, M. Valdés, and P. Valdés, Eds. (1994). Human extrastriate visual cortex and the perception of (1990). Machinery of the Mind. Boston: Birkhausen. faces, words, numbers, and colors. Cerebral Cortex 5: 544– Näätänen, R. (1992). Attention and Brain Function. Hillsdale, NJ: 554. Erlbaum. Coles, M. G. H., G. O. Henderikus, M. Smid, M. K. Scheffers, and Näätänen, R. (1995). The mismatch negativity: A powerful tool for L. J. Otten. (1995). 
Mental chronometry and the study of cognitive neuroscience. Ear and Hearing 16: 6–18. human information processing. In M. D. Rugg and M. G. H. Nunez, P. L., Ed. (1981). Electric Fields of the Brain. New York: Coles, Eds., Electrophysiology of Mind: Event-Related Brain Oxford University Press. Potentials and Cognition. Oxford: Oxford University Press, pp. Picton, T. W., O. G. Lins, and M. Scherg. (1994). The recording 86–131. and analysis of event-related potentials. In F. Boller and J. Graf- Heinze, H. J., T. F. Münte, and G. R. Mangun, Eds. (1994). Cogni- man, Eds., Handbook of Neuropsychology. Vol. 9, Event- tive Electrophysiology. Boston: Birkhauser. Related Potentials. Amsterdam: Elsevier, pp.429–499. Hillyard, S. A. (1993). Electrical and magnetic brain recordings: Scherg, M. (1990). Fundamentals of dipole source potential analy- Contributions to cognitive neuroscience. Current Opinion in sis. In F. Grandori, M. Hoke, and G. L. Romans, Eds., Auditory Neurobiology 3: 217–224. Evoked Magnetic Fields and Electric Potentials, Advances in Hillyard, S. A., G. R. Mangun, M. G. Woldorff, and S. J. Luck. Audiology. Basel: Karger, pp. 40–69. (1995). Neural systems mediating selective attention. In M. S. Eliminative Materialism 265 episodes. But if this is right, one eliminativist argument Eliminative Materialism continues, then either nonhuman animals and prelinguistic children do not have beliefs and thoughts, or they must Eliminative materialism, or “eliminativism” as it is some- think in some nonpublic LANGUAGE OF THOUGHT, and both times called, is the claim that one or another kind of mental of these options are absurd (P. S. Churchland 1980). Oppo- state invoked in commonsense psychology does not really nents of the argument fall into two camps. Some, following exist. 
Eliminativists suggest that the mental states in ques- Donald Davidson (1975), argue that children and nonhu- tion are similar to phlogiston or caloric fluid, or perhaps to man animals do not have beliefs or thoughts, whereas oth- the gods of ancient religions: they are the nonexistent posits ers, most notably Jerry Fodor (1975), argue that children of a seriously mistaken theory. The most widely discussed and higher animals do indeed think in a nonpublic “lan- version of eliminativism takes as its target the intentional guage of thought.” Another argument that relies on the Sell- states of commonsense psychology, states like beliefs, arsian account of beliefs and thoughts notes that thoughts and desires (P. M. Churchland 1981; Stich 1983; neuroscience has thus far failed to find syntactically struc- Christensen and Turner 1993). The existence of conscious tured, quasi-linguistic representations in the brain and pre- mental states such as pains and visual perceptions has also dicts that the future discovery of such quasi-linguistic states occasionally been challenged (P. S. Churchland 1983; Den- is unlikely (Van Gelder 1991). nett 1988; Rey 1983). Many authors have challenged the claim that common- Though advocates of eliminativism have offered a wide sense psychology is committed to a quasi-linguistic account variety of arguments, most of the arguments share a com- of intentional states (see, for example, Loar 1983; Stalnaker mon structure (Stich 1996). The first premise is that beliefs, 1984), and a number of arguments for the eliminativist’s thoughts, desires, or other mental states whose existence the second premise rely on less controversial claims about com- argument will challenge can be viewed as posits of a widely monsense psychology. 
One of these arguments (Ramsey, shared commonsense psychological theory, which is often Stich, and Garon 1990) maintains only that, according to called “folk psychology.” FOLK PSYCHOLOGY, this premise commonsense psychology, a belief is a contentful state that maintains, underlies our everyday discourse about mental can be causally involved in some cognitive episodes while it states and processes, and terms like “belief,” “thought,” and is causally inert in others. It is not the case that all of our “desire” can be viewed as theoretical terms in this common- beliefs are causally implicated in all of our inferences. How- sense theory. The second premise is that folk psychology is ever, there is a family of connectionist models of proposi- a seriously mistaken theory because some of the central tional memory in which information is encoded in a claims that it makes about the states and processes that give thoroughly holistic way. All of the information encoded in rise to behavior, or some of the crucial presuppositions of these models is causally implicated in every inference the these claims, are false or incoherent. This second premise model makes. Thus, it is claimed, there are no contentful has been defended in many ways, some of which will be states in these models which can be causally involved in considered in the following discussion. Both premises of the some cognitive episodes and causally inert in others. eliminativist argument are controversial. Indeed, debate Whether or not connectionist models of this sort will pro- about the plausibility of the second premise, and thus about vide the best psychological account of human propositional the tenability of commonsense psychology, has been one of memory is a hotly disputed question. But if they do, the the central themes in the philosophy of mind for several eliminativist argument maintains, then folk psychology will decades. 
From these two premises eliminativists typically turn out to be pretty seriously mistaken. A second argument draw a pair of conclusions. The weaker conclusion is that (due to Davies 1991) that relies on connectionist models the cognitive sciences that ultimately give us a correct begins with the claim that commonsense psychology is account of the workings of the human mind/brain will not committed to a kind of “conceptual modularity.” It requires refer to commonsense mental states like beliefs and desires; that there is “a single inner state which is active whenever a these states will not be part of the ontology of a mature cog- cognitive episode involving a given concept occurs and nitive science. The stronger conclusion is that these com- which can be uniquely associated with the concept con- monsense mental states simply do not exist. Most of the cerned” (Clark 1993). In many connectionist models, by discussion of eliminativism has focused on the plausibility contrast, concepts are represented in a context-sensitive of the premises, but several authors have argued that even if way. The representation of coffee in a cup is different from the premises are true, they do not give us good reason to the representation of coffee in a pot (Smolensky 1988). accept either conclusion (Lycan 1988; Stich 1996) Thus, there is no state of the model that is active whenever a Arguments in defense of the second premise typically cognitive episode involving a given concept occurs and that begin by making some claims about the sorts of states or can be uniquely associated with the concept concerned. If mechanisms that folk psychology invokes, and then arguing these models offer the best account of how human concepts that a mature cognitive science is unlikely to countenance are represented, then once again we have the conclusion that states or mechanisms of that sort. One family of arguments folk psychology has made a serious mistake. 
follows Wilfrid Sellars (1956) in maintaining that folk psy- Still another widely discussed family of arguments chology takes thoughts and other intentional states to be aimed at showing that folk psychology is a seriously mis- modeled on overt linguistic behavior. According to this taken theory focus on the fact that commonsense psychol- Sellarsian account, common sense assumes that beliefs are ogy takes beliefs, desires, and other intentional states to quasi-linguistic states and that thoughts are quasi-linguistic have semantic properties—truth or satisfaction conditions— 266 Eliminative Materialism and that commonsense psychological explanations seem to sciousness and Self-Regulation, vol. 3. New York: Plenum, pp. 1–39. attribute causal powers to intentional states that they have in Sellars, W. (1956). Empiricism and the philosophy of mind. In H. virtue of their semantic content. A number of reasons have Feigl and M. Scriven, Eds., The Foundations of Science and the been offered for thinking that this reliance on semantic con- Concepts of Psychology and Psychoanalysis: Minnesota Stud- tent will prove problematic. Some authors argue that seman- ies in the Philosophy of Science, vol. 1. Minneapolis: Univer- tic content is “wide”—it depends (in part) on factors outside sity of Minnesota Press, pp. 253–329. the head—and that this makes it unsuitable for the scientific Smolensky, P. (1988). On the proper treatment of connectionism. explanation of behavior (Stich 1978; Fodor 1987). Others Behavioral and Brain Sciences 11: 1–74. argue that semantic content is “holistic”—it depends on the Stalnaker, R. (1984). Inquiry. Cambridge, MA: Bradford Books/ entire set of beliefs that a person has—and that useful scien- MIT Press. tific generalizations cannot be couched in terms of such Stich, S. (1978). Autonomous psychology and the belief-desire thesis. The Monist 61: 573–591. holistic properties (Stich 1983). Still others argue that Stich, S. (1983). 
From Folk Psychology to Cognitive Science. Cam- semantic properties cannot be reduced to physical proper- bridge, MA: Bradford Books/MIT Press. ties, and that properties that cannot be reduced to physical Stich, S., and S. Laurence. (1994). Intentionality and naturalism. In properties cannot have causal powers. If this is right, then, Peter A. French and Theodore E. Uehling Jr., Eds., Midwest contrary to what folk psychology claims, semantic proper- Studies in Philosophy. Vol. 19, Naturalism. University of Notre ties are causally irrelevant (Van Gulick 1993). Finally, some Dame Press. Reprinted in Stich (1996). authors have urged that the deepest problem with common- Stich, S. (1996). Deconstructing the Mind. New York: Oxford Uni- sense psychology is that semantic properties cannot be “nat- versity Press. uralized”—there appears to be no place for them in our Van Gelder, T. (1991). What is the ‘D’ in ‘PDP’? A survey of the evolving, physicalistic view of the world (Fodor 1987; Stich concept of distribution. In W. Ramsey, S. Stich, and D. Rumel- hart, Eds., Philosophy and Connectionist Theory. Hillsdale, NJ: and Laurence 1994). Erlbaum, pp. 33–59. See also AUTONOMY OF PSYCHOLOGY; CONNECTIONISM; Van Gulick, R. (1993). Who’s in charge here? And who’s doing all PHILOSOPHICAL ISSUES; INDIVIDUALISM; MIND-BODY PROB- the work? In J. Heil and A. Mele, Eds., Mental Causation. LEM; MODULARITY OF MIND; PHYSICALISM Oxford: Clarendon Press, pp. 233–256. —Stephen Stich Further Readings References Baker, L. (1987). Saving Belief. Princeton: Princeton University Christensen, S., and D. Turner, Eds. (1993). Folk Psychology and Press. the Philosophy of Mind. Hillsdale, NJ: Erlbaum. Baker, L. (1995). Explaining Attitudes. Cambridge: Cambridge Churchland, P. M. (1981). Eliminative materialism and the propo- University Press. sitional attitudes. Journal of Philosophy 78: 67–90. Burge, T. (1986). Individualism and psychology. Philosophical Churchland, P. S. (1980). 
Language, thought and information pro- Review 95: 3–45. cessing. Nous 14: 147–170. Churchland, P. M. (1970). The logical character of action explana- Churchland, P. S. (1983). Consciousness: The transmutation of a tions. Philosophical Review 79: 214–236. concept. Pacific Philosophical Quarterly 64: 80–95. Churchland, P. M. (1989). Folk psychology and the explanation of Clark, A. (1993). Associative Engines. Cambridge, MA: Bradford human behavior. In P. M. Churchland, A Neurocomputational Books/MIT Press. Perspective. Cambridge, MA: MIT Press, pp. 111–127. Davidson, D. (1975). Thought and talk. In S. Guttenplan, Ed., Clark, A. (1989). Microcognition. Cambridge, MA: Bradford Mind and Language. Oxford: Oxford University Press. Books/MIT Press. Davies, M. (1991). Concepts, connectionism and the language of Clark, A. (1989/90). Connectionist minds. Proceedings of the Aris- thought. In W. Ramsey, S. Stich, and D. Rumelhart, Eds., Phi- totelian Society 90: 83–102. losophy and Connectionist Theory. Hillsdale, NJ: Erlbaum, pp. Clark, A. (1991). Radical ascent. Proceedings of the Aristotelian 229–257. Society Supplementary Volume 65: 211–227. Dennett, D. (1988). Quining qualia. In A. Marcel and E. Bisiach, Dretske, F. (1988). Explaining Behavior. Cambridge, MA: Brad- Eds., Consciousness in Contemporary Science. New York: ford Books/MIT Press. Oxford University Press. Egan, F. (1995). Folk psychology and cognitive architecture. Phi- Fodor, J. (1975). The Language of Thought. New York: Thomas Y. losophy of Science 62: 179–196. Crowell. Feyerabend, P. (1963). Materialism and the mind-body problem. Fodor, J. (1987). Psychosemantics. Cambridge, MA: Bradford Review of Metaphysics 17: 49–66. Books/MIT Press. Fodor, J. (1989). Making mind matter more. Philosophical Topics Loar, B. (1983). Must beliefs be sentences? In P. Asquith and T. 17: 59–80. Nickles, Eds., PSA 1982. Proceedings of the 1982 Biennial Horgan, T. (1982). Supervenience and microphysics. 
Pacific Philo- Meeting of the Philosophy of Science Association, vol. 2. East sophical Quarterly 63: 29–43. Lansing, MI: Philosophy of Science Association, pp. 627–643. Horgan, T. (1989). Mental quausation. Philosophical Perspectives Lycan, W. (1988). Judgement and Justification. Cambridge: Cam- 3: 47–76. bridge University Press. Horgan, T. (1993). From supervenience to superdupervenience: Ramsey, W., S. Stich, and J. Garon. (1990). Connectionism, elimi- Meeting the demands of a material world. Mind 102: 555– nativism and the future of folk psychology. Philosophical Per- 586. spectives 4: 499–533. Reprinted in Stich (1996). Horgan, T., and G. Graham. (1990). In defense of southern funda- Rey, G. (1983). A reason for doubting the existence of conscious- mentalism. Philosophical Studies 62: 107–134. Reprinted in ness. In R. Davidson, G. Schwartz, and D. Shapiro, Eds., Con- Christensen and Turner (1993), pp. 288–311. Emergentism 267 effects of two or more causes acting in the mechanical Horgan, T., and J. Woodward. (1985). Folk psychology is here to stay. Philosophical Review 94: 197–226. Reprinted in Chris- mode “homopathic effects,” and effects of two or more tensen and Turner (1993), pp. 144–166. causes acting in the chemical mode “heteropathic effects.” Jackson, F., and P. Pettit. (1990). In defense of folk psychology. Lewes called heteropathic effects emergents and homo- Philosophical Studies 59: 31–54. pathic ones resultants (McLaughlin 1992). Kim, J. (1989). Mechanism, purpose and explanatory exclusion. Mill’s work launched a tradition, British Emergentism, Philosophical Perspectives 3: 77–108. that flourished through the first third of the twentieth century O’Brien, G. (1991). Is connectionism common sense? Philosophi- (McLaughlin 1992). The main works in this tradition are cal Psychology 4: 165–178. Alexander Bain’s Logic (1843), George Henry Lewes’s O’Leary-Hawthorne, J. (1994). On the threat of elimination. 
Philo- Problems of Life and Mind (1875), Samuel Alexander’s sophical Studies 74: 325–346. Space, Time, and Deity (1920), Lloyd Morgan’s Emergent Rey, G. (1991). An explanatory budget for connectionism and eliminativism. In T. Horgan and J. Tienson, Eds., Connection- Evolution (1923), and C. D. Broad’s The Mind and Its Place ism and the Philosophy of Mind. Dordrecht, The Netherlands: in Nature (1925). There were also prominent American Kluwer Academic Publishers, pp. 219–240. emergentists: William JAMES, Arthur Lovejoy, and Roy Rorty, R. (1965). Mind-body identity, privacy, and categories. Wood Sellars; and in France, Henri Bergson developed a Review of Metaphysics 19: 24–54. brand of emergent evolution (Blitz 1992; Stephan 1992). In Rorty, R. (1970). In defense of eliminative materialism. Review of the 1920s in the former Soviet Union, the members of the Metaphysics 24: 112–121. Debron School, headed by A. M. Debron, spoke of the emer- Sterelny, K. (1990). The Representational Theory of Mind. Oxford: gence of new forms in nature and maintained that the mecha- Blackwell. nists “neglected the specific character of the definite levels or stages of the development of matter” (Kamenka 1972: Embeddedness 164). Alexander (1920) spoke of levels of qualities or proper- SeeSITUATED COGNITION AND LEARNING; SITUATEDNESS/ ties, maintaining that “the higher-level quality emerges from EMBEDDEDNESS the lower level of existence and has its roots therein, but it emerges therefrom, and it does not belong to that lower Embodiment level, but constitutes its possessor a new order of existent with its special laws of behavior. The existence of emergent qualities thus described is something to be noted, as some See MIND-BODY PROBLEM; SITUATEDNESS/EMBEDDEDNESS would say, under the compulsion of brute empirical fact, or, as I should prefer to say in less harsh terms, to be accepted Emergent Structuring with the ‘natural piety’ of the investigator. It admits no explanation” (1920: 46). 
Morgan (1923) connected the notions of emergence and evolution and argued for an evo- See EMERGENTISM; SELF-ORGANIZING SYSTEMS lutionary cosmology. Morgan maintained that through a process of evolution genuinely new qualities emerge that Emergentism generate new fundamental forces that effect “the go” of events in ways unanticipated by force-laws governing mat- George Henry Lewes coined the term emergence (Lewes ter at lower levels of complexity. 1875). He drew a distinction between emergents and result- Broad (1925) contrasted “the ideal of Pure Mechanism” ants, a distinction he learned from John Stuart Mill. In his with emergentism. Of the ideal of pure mechanism, he said: System of Logic (1843), Mill drew a distinction between “On a purely mechanical theory all the apparently different “two modes of the conjoint action of causes, the mechani- kinds of matter would be made of the same stuff. They cal and the chemical” (p. xviii). According to Mill, when would differ only in the number arrangement and move- two or more causes combine in the mechanical mode to ments of their constituent particles. And their apparently produce a certain effect, the effect is the sum of what would different kinds of behaviour would not be ultimately differ- have been the effects of each of the causes had it acted ent. For they would all be deducible from a single simple alone. Mill’s principal example of this is the effect of two principle of composition from the mutual influences of the or more forces acting jointly to produce a certain move- particles taken by pairs [he cites the Parallelogram Law]; ment: the movement is the vector sum of what would have and these mutual influences would all obey a single law been the effect of each force had it acted alone. 
According to Mill, two or more causes combine in the chemical mode to produce a certain effect if and only if they produce the effect, but not in the mechanical mode. Mill used the term chemical mode because chemical agents produce effects in a nonmechanical way. Consider a chemical process such as CH₄ + 2O₂ → CO₂ + 2H₂O (methane + oxygen produces carbon dioxide + water). The product of these reactants acting jointly is not in any sense the sum of what would have been the effects of each acting alone. Mill labeled the

which is quite independent of the configuration and surroundings in which the particles happen to find themselves” (1925: 45–46). He noted that “a set of gravitating particles, on the classical theory of gravitation, is an almost perfect example of the ideal of Pure Mechanism” (1925: 45). He pointed out that according to pure mechanism, “the external world has the greatest amount of unity which is conceivable. There is really only one science and the various ‘special sciences’ are just particular cases of it” (1925: 76). In contrast, on the emergentist view “we have to reconcile ourselves to much less unity in the external world and a much less intimate connexion between the various sciences. At best the external world and the various sciences that deal with it will form a hierarchy” (1925: 77). He noted that emergentism can “keep the view that there is only one fundamental kind of stuff” (1925: 77). However, if emergentism is true, then “we should have to recognize aggregates of various orders. And there would be two fundamentally different types of laws, which might be called ‘intra-ordinal’ and ‘trans-ordinal’ respectively. A trans-ordinal law would be one which connects the properties of adjacent orders. . . . An intra-ordinal law would be one which connects the properties of aggregates of the same order. A trans-ordinal law would be a statement of the irreducible fact that an aggregate composed of aggregates of the next lower order in such and such proportions and arrangements has such and such characteristic and non-deducible properties” (1925: 77–78). Broad maintained that transordinal laws are irreducible, emergent laws because they cannot be deduced from laws governing aggregates at lower levels and any compositional principle governing lower levels.

The British emergentists intended their notion of emergence to imply irreducibility. However, they presupposed a Newtonian conception of mechanistic reduction. Quantum mechanics broadened our conception of mechanistic reduction by providing holistic reductive explanations of chemical bonding that make no appeal to additive or even linear compositional principles. The quantum mechanical explanation of chemical bonding is a paradigm of a reductive explanation. Chemical phenomena are indeed emergent in the sense that the product of chemical reactants is a heteropathic effect of chemical agents; moreover, the chemical properties of atoms are not additive resultants of properties of electrons, and so chemical properties of atoms are emergent. However, the quantum mechanical explanation of chemical bonding teaches us that reductive explanations need not render the reduced property of a whole as an additive resultant of properties of its parts. Reductions need not invoke additive or even linear compositional principles. The quantum mechanical explanation of chemical bonding, and the ensuing successes of molecular biology (such as the discovery of the structure of DNA), led to the almost complete demise of the antireductionist, emergentist view of chemistry and biology (McLaughlin 1992).

Nonetheless, the British emergentists’ notion of an emergent property as a property of a whole that is not an additive resultant of, or even linear function of, properties of the parts of the whole continues to be fairly widely used (Kauffman 1993a, 1993b). The term emergent computation is used to refer to the computation of nonlinear functions (see the essays in Forrest 1991).

In philosophical circles, there have been some attempts to develop a notion of emergence that is loosely based on the British emergentist notion but actually implies ontological irreducibility (Klee 1984; Van Cleve 1990; Beckermann, Flohr, and Kim 1992; Kim 1992; McLaughlin 1992, 1997). These attempts invoke the notion of SUPERVENIENCE and make no appeal to nonadditivity or nonlinearity. On one view, bridge laws linking micro and macro properties are emergent laws if they are not semantically implied by initial microconditions and microlaws (McLaughlin 1992). One issue that remains a topic of intense debate is whether, in something like this sense of emergence, bridge laws linking conscious properties with physical properties are irreducible, emergent psychophysical laws, and conscious properties thereby irreducible, emergent properties (Popper and Eccles 1977; Sperry 1980; Van Cleve 1990; Chalmers 1996).

See also ANOMALOUS MONISM; CONSCIOUSNESS; PHYSICALISM; PSYCHOLOGICAL LAWS; REDUCTIONISM

Brian P. McLaughlin

References

Alexander, S. (1920). Space, Time, and Deity. 2 vols. London: Macmillan.
Bain, A. (1870). Logic, Books II and III. London.
Beckermann, A., H. Flohr, and J. Kim, Eds. (1992). Emergence or Reduction? Berlin: Walter de Gruyter.
Blitz, D. (1992). Emergent Evolution: Qualitative Novelty and the Levels of Reality. Dordrecht: Kluwer Academic Publishers.
Chalmers, D. (1996). The Conscious Mind: In Search of a Theory of Conscious Experience. New York: Oxford University Press.
Forrest, S., Ed. (1991). Emergent Computation. Cambridge, MA: MIT Press/Bradford Books.
Kamenka, E. (1972). Communism, philosophy under. In P. Edwards, Ed., Encyclopedia of Philosophy, vol. 2. 2nd ed. New York: Macmillan.
Kauffman, S. (1993a). The Origins of Order: Self-Organization and Selection in Evolution. New York: Oxford University Press.
Kauffman, S. (1993b). At Home in the Universe: The Search for the Laws of Self-Organization and Complexity. New York: Oxford University Press.
Lewes, G. H. (1875). Problems of Life and Mind, vol. 2. London: Kegan Paul, Trench, Trübner, and Co.
Lovejoy, A. O. (1926). The meanings of “emergence” and its modes. In E. S. Brightman, Ed., Proceedings of the Sixth International Congress of Philosophy. New York, pp. 20–33.
McLaughlin, B. P. (1992). The rise and fall of British emergentism. In A. Beckermann, H. Flohr, and J. Kim, Eds., Emergence or Reduction? Berlin: Walter de Gruyter.
McLaughlin, B. P. (1997). Emergence and supervenience. Intellectica 25: 25–43.
Mill, J. S. (1843). System of Logic. London: Longmans, Green, Reader, and Dyer. 8th ed., 1872.
Morgan, C. L. (1923). Emergent Evolution. London: Williams and Norgate.
Popper, K. R., and J. C. Eccles. (1977). The Self and Its Brain. New York: Springer.
Smart, J. J. C. (1981). Physicalism and emergence. Neuroscience 6: 1090–1113.
Sperry, R. W. (1980). Mind-brain interaction: Mentalism, yes; dualism, no. Neuroscience 5: 195–206.
Stephan, A. (1992). Emergence: A systematic view of its historical facets. In A. Beckermann, H. Flohr, and J. Kim, Eds., Emergence or Reduction? Berlin: Walter de Gruyter.
Van Cleve, J. (1990). Emergence vs. panpsychism: Magic or mind dust? In J. E. Tomberlin, Ed., Philosophical Perspectives, vol. 4. Atascadero, CA: Ridgeview, pp. 215–226.

Further Readings

Caston, V. (1997). Epiphenomenalisms, ancient and modern. Philosophical Review 106.
Hempel, C. G., and P. Oppenheim. (1948). Studies in the logic of explanation. Philosophy of Science 15: 135–175.
Henle, P. (1942). The status of emergence. Journal of Philosophy 39: 486–493.
Horgan, T. (1993). From supervenience to superdupervenience: Meeting the demands of a material world. Mind 102: 555–586.
Jones, D. (1972). Emergent properties, persons, and the mind-body problem. The Southern Journal of Philosophy 10: 423–433.
Kim, J. (1992). “Downward causation” in emergentism and nonreductive materialism. In A. Beckermann, H. Flohr, and J. Kim, Eds., Emergence or Reduction? Berlin: Walter de Gruyter.
Klee, R. (1984). Micro-determinism and the concepts of emergence. Philosophy of Science 51: 44–63.
Meehl, P. E., and W. Sellars. (1956). The concept of emergence. In H. Feigl and M. Scriven, Eds., The Foundations of Science and the Concepts of Psychology and Psychoanalysis. Minnesota Studies in the Philosophy of Science, vol. 1. Minneapolis: University of Minnesota Press, pp. 239–252.
Morris, C. R. (1926). The notion of emergence. Proceedings of the Aristotelian Society Suppl. 6: 49–55.
Nagel, E. (1961). The Structure of Science. New York: Harcourt, Brace and World.
Pap, A. (1951). The concept of absolute emergence. British Journal for the Philosophy of Science 2: 302–311.
Pepper, S. (1926). Emergence. Journal of Philosophy 23: 241–245.
Popper, K. (1979). Natural selection and the emergence of mind. Dialectica 32: 279–355.
Stace, W. T. (1939). Novelty, indeterminism, and emergence. Philosophical Review 48: 296–310.
Stephan, A. (1997). Armchair arguments against emergentism. Erkenntnis 46: 305–314.

Emotion and the Animal Brain

Emotion, long ignored within the field of neuroscience, has at the end of the twentieth century been experiencing a renaissance. Starting around mid-century, brain researchers began to rely on the LIMBIC SYSTEM concept as an explanation of where emotions come from (MacLean 1949), and subsequently paid scant attention to the adequacy of that account. Riding the wave of the cognitive revolution (Gardner 1987), brain researchers have instead concentrated on the neural basis of perception, MEMORY, ATTENTION, and other cognitive processes. However, starting in the 1980s, studies of a particular model of emotion, classical fear conditioning, began to suggest that the limbic system concept could not provide a meaningful explanation of the emotional brain (LeDoux 1996). The success of these studies in identifying the brain pathways involved in a particular kind of emotion has largely been responsible for the renewed interest in exploring more broadly the brain mechanisms of emotion, including a new wave of studies of EMOTION AND THE HUMAN BRAIN. This article briefly reviews the neural pathways involved in fear conditioning, and then considers how the organization of the fear pathways provides a neuroanatomical framework for understanding emotional processing, including emotional stimulus evaluation (appraisal), emotional response control, and emotional experience (feelings).

The brain circuits involved in fear CONDITIONING have been most thoroughly investigated for situations involving an auditory conditioned stimulus (CS) paired with footshock (see LeDoux 1996; Davis 1992; Kapp et al. 1992; McCabe et al. 1992; Fanselow 1994). In order for conditioning to take place and for learned responses to be evoked by the CS after conditioning, the CS has to be relayed through the auditory system to the amygdala. If the CS is relatively simple (a single tone), it can reach the amygdala either from the auditory thalamus or the auditory cortex. In more complex stimulus conditions that require discrimination or CATEGORIZATION, the auditory cortex becomes involved, though the exact nature of this involvement is poorly understood (see Jarrell et al. 1987; Armony et al. 1997).

CS information coming from either the auditory THALAMUS or the cortex arrives in the lateral nucleus of the amygdala and is then distributed to the central nucleus by way of internal amygdala connections that have been elucidated in some detail (Pitkanen et al. 1997). The central nucleus, in turn, is involved in the control of the expression of conditioned responses through its projections to a variety of areas in the brainstem. These behavioral (e.g., freezing, escape, fighting back), autonomic (e.g., blood pressure, heart rate, sweating), and hormonal (adrenaline and cortisol released from the adrenal gland) responses mediated by the central nucleus are involuntary and occur more or less automatically in the presence of danger (though they are modulated somewhat by the situation).

Other brain areas implicated in fear conditioning are the HIPPOCAMPUS and prefrontal cortex. The hippocampus is important in conditioning to contextual stimuli, such as the situation in which an emotional event occurs. Its role is that of a high-level sensory/cognitive structure that integrates the situation into a spatial or conceptual “context” rather than that of an emotional processor per se (Kim and Fanselow 1992; Phillips and LeDoux 1992; LeDoux 1996). The medial area of the prefrontal cortex is important for extinction, the process by which the CS stops eliciting emotional reactions when its association with the shock is weakened (Morgan and LeDoux 1995). Fear/anxiety disorders, where fear persists abnormally, may involve alterations in the function of this region (LeDoux 1996).

The fear pathways can be summarized very succinctly. They involve the transmission of information about external stimuli to the amygdala and the control of emotional responses by way of outputs of the amygdala. The simplicity of this scheme suggests a clear mapping of certain psychological processes (stimulus evaluation and response control) onto brain circuits, and leads to hypotheses about how other aspects of emotion (feeling or experience) come about. However, it is important to point out that the ideas in the discussion that follows mainly pertain to the fear system of the brain, inasmuch as other emotions have not been studied in sufficient detail to allow these kinds of relations to be discussed.

Stimulus evaluation or appraisal is a key concept in the psychology of emotion (Lazarus 1991; Scherer 1988; Frijda 1986). Although most psychological work treats appraisal as a high-level cognitive process, often involving conscious access to underlying evaluations, it is clear from studies of animals and people that stimuli are first evaluated at a lower (unconscious) level prior to, and perhaps independent of, higher-level appraisal processes (see LeDoux 1996). In particular, the amygdala, which sits between sensory processes (including low-level sensory processes originating precortically and higher-level cortical processes) and motor control systems, is likely to be the neural substrate of early (unconscious) appraisal in the fear system. Not only do cells in the amygdala respond to conditioned fear stimuli, but they also learn the predictive value of new stimuli associated with danger (Quirk, Repa, and LeDoux 1995; Rogan, Staubli, and LeDoux 1997).

The amygdala receives inputs from a variety of cortical areas involved in higher cognitive functions. These areas project to the basal and accessory basal nuclei of the amygdala (Pitkanen et al. 1997). Thus, the emotional responses controlled by the amygdala can be triggered by low-level physical features of stimuli (intensity, color, form), higher-level semantic properties (objects), situations involving configurations of stimuli, thoughts or memories about stimuli, and imaginary stimuli or situations. In this way higher-level appraisal processes can be critically involved in the functioning of this system. It is important to note that these hypotheses about the neural substrate of higher-level processes have emerged from a detailed elucidation of the physiology of lower-level processes. A bottom-up approach can be very useful when it comes to figuring out how psychological processes are represented in the brain.

Involuntary emotional responses are EVOLUTION’s immediate solution to the presence of danger. Once these responses occur, however, higher-level appraisal mechanisms are often activated. We begin planning what to do, given the circumstances. We then have two kinds of response possibilities. Habits are well-practiced responses that we have learned to use in routine situations. Emotional habits can enable us to avoid danger and escape from it once we are in it. These kinds of responses may involve the amygdala, cortex, and BASAL GANGLIA (see LeDoux 1996; Everitt and Robbins 1992; McDonald and White 1993). Finally, there are emotional actions, such as choosing to run away rather than to stay put in the presence of danger, given our assessment of the possible outcomes of each course of action. These voluntary actions are controlled by cortical decision processes, most likely in the frontal lobe (Damasio 1994; Goldman-Rakic 1992; Georgopoulos et al. 1989). Voluntary processes allow us to override the amygdala and become emotional actors rather than simply reactors (LeDoux 1996). The ability to shift from emotional reaction to action is an important feature of primate and especially human evolution.

The problem of feelings is really the problem of CONSCIOUSNESS (LeDoux 1996). Emotion researchers have been particularly plagued by this problem. Although we are nowhere near solving the problem of consciousness (feelings), there have been some interesting ideas in the area of consciousness that may be useful in understanding feelings. In particular, it seems that consciousness is closely tied up with the process we call WORKING MEMORY (Baddeley 1992), a mental workspace where we think, reason, solve problems, and integrate disparate pieces of information from immediate situations and long-term memory (Kosslyn and Koenig 1992; Johnson-Laird 1988; Kihlstrom 1987). In light of this, we might postulate that feelings result when working memory is occupied with the fact that one’s brain and body are in a state of emotional arousal. By integrating immediate stimuli with long-term memories about the occurrence of such stimuli in the past, together with the arousal state of the brain and feedback from the bodily expression of emotion, working memory might just be the stuff that feelings are made of.

Ever since William JAMES raised the question of whether we run from the bear because we are afraid or whether we are afraid because we run, the psychology of emotion has been preoccupied with questions about where fear and other conscious feelings come from. Studies of fear conditioning have gone a long way by addressing James’s other question: what causes bodily emotional responses (as opposed to feelings)? Although James was correct in concluding that rapid-fire emotional responses are not caused by feelings of fear, he did not say much about how these come about. However, as we now see, by focusing on the responses we have been able to get a handle on how the system works, and even have gotten some ideas about where the feelings come from.

See also CEREBRAL CORTEX; CONDITIONING AND THE BRAIN; CONSCIOUSNESS, NEUROBIOLOGY OF; EMOTIONS; MEMORY, ANIMAL STUDIES; SENSATIONS

Joseph LeDoux and Michael Rogan

References

Armony, J. L., D. Servan-Schreiber, L. M. Romanski, J. D. Cohen, and J. E. LeDoux. (1997). Stimulus generalization of fear responses: Effects of auditory cortex lesions in a computational model and in rats. Cerebral Cortex 7: 157–165.
Baddeley, A. (1992). Working memory. Science 255: 556–559.
Damasio, A. (1994). Descartes’ Error: Emotion, Reason, and the Human Brain. New York: Grosset/Putnam.
Davis, M. (1992). The role of the amygdala in conditioned fear. In J. P. Aggleton, Ed., The Amygdala: Neurobiological Aspects of Emotion, Memory, and Mental Dysfunction. New York: Wiley-Liss, pp. 255–306.
Everitt, B. J., and T. W. Robbins. (1992). Amygdala-ventral striatal interactions and reward-related processes. In J. P. Aggleton, Ed., The Amygdala: Neurobiological Aspects of Emotion, Memory, and Mental Dysfunction. New York: Wiley-Liss, pp. 401–429.
Fanselow, M. S. (1994). Neural organization of the defensive behavior system responsible for fear. Psychonomic Bulletin and Review 1: 429–438.
Frijda, N. (1986). The Emotions. Cambridge: Cambridge University Press.
Gardner, H. (1987). The Mind’s New Science: A History of the Cognitive Revolution. New York: Basic Books.
Georgopoulos, A., J. T. Lurito, M. Petrides, A. B. Schwartz, and J. T. Massey. (1989). Mental rotation of the neuronal population vector. Science 243: 234–236.
Goldman-Rakic, P. S. (1992). Circuitry of primate prefrontal cortex and regulation of behavior by representational memory. In J. M. Brookhart and V. B. Mountcastle, Eds., Handbook of Physiology: The Nervous System V. Baltimore, MD: American Physiological Society, pp. 373–417.
Jarrell, T. W., C. G. Gentile, L. M. Romanski, P. M. McCabe, and N. Schneiderman. (1987). Involvement of cortical and thalamic auditory regions in retention of differential bradycardia conditioning to acoustic conditioned stimuli in rabbits. Brain Research 412: 285–294.
Johnson-Laird, P. N. (1988). The Computer and the Mind: An Introduction to Cognitive Science. Cambridge, MA: Harvard University Press.
Kapp, B. S., P. J. Whalen, W. F. Supple, and J. P. Pascoe. (1992). Amygdaloid contributions to conditioned arousal and sensory information processing. In J. P. Aggleton, Ed., The Amygdala: Neurobiological Aspects of Emotion, Memory, and Mental Dysfunction. New York: Wiley-Liss.
Kihlstrom, J. F. (1987). The cognitive unconscious. Science 237: 1445–1452.
Kim, J. J., and M. S. Fanselow. (1992). Modality-specific retrograde amnesia of fear. Science 256: 675–677.
Kosslyn, S. M., and O. Koenig. (1992). Wet Mind: The New Cognitive Neuroscience. New York: Macmillan.
Lazarus, R. S. (1991). Cognition and motivation in emotion. American Psychologist 46: 352–367.
LeDoux, J. E. (1996). The Emotional Brain. New York: Simon and Schuster.
MacLean, P. D. (1949). Psychosomatic disease and the “visceral brain”: Recent developments bearing on the Papez theory of emotion. Psychosomatic Medicine 11: 338–353.
McCabe, P. M., N. Schneiderman, T. W. Jarrell, C. G. Gentile, A. H. Teich, R. W. Winters, and D. R. Liskowsky. (1992). Central pathways involved in differential classical conditioning of heart rate responses. In I. Gormezano, Ed., Learning and Memory: The Behavioral and Biological Substrates. Hillsdale, NJ: Erlbaum, pp. 321–346.
McDonald, R. J., and N. M. White. (1993). A triple dissociation of memory systems: Hippocampus, amygdala, and dorsal striatum. Behavioral Neuroscience 107(1): 3–22.
Morgan, M., and J. E. LeDoux. (1995). Differential contribution of dorsal and ventral medial prefrontal cortex to the acquisition and extinction of conditioned fear. Behavioral Neuroscience 109: 681–688.
Phillips, R. G., and J. E. LeDoux. (1992). Differential contribution of amygdala and hippocampus to cued and contextual fear conditioning. Behavioral Neuroscience 106: 274–285.
Pitkanen, A., V. Savander, and J. E. LeDoux. (1997). Organization of intra-amygdaloid circuitries: An emerging framework for understanding functions of the amygdala. Trends in Neurosciences 20: 517–523.
Quirk, G. J., J. C. Repa, and J. E. LeDoux. (1995). Fear conditioning enhances short-latency auditory responses of lateral amygdala neurons: Parallel recordings in the freely behaving rat. Neuron 15: 1029–1039.
Rogan, M. T., U. V. Staubli, and J. E. LeDoux. (1997). Fear conditioning induces associative long-term potentiation in the amygdala. Nature 390: 604–607.
Scherer, K. R. (1988). Criteria for emotion-antecedent appraisal: A review. In V. Hamilton, G. H. Bower, and N. H. Frijda, Eds., Cognitive Perspectives on Emotion and Motivation. Norwell, MA: Kluwer, pp. 89–126.

Further Readings

Davis, M., W. A. Falls, S. Campeau, and M. Kim. (1994). Fear-potentiated startle: A neural and pharmacological analysis. Behavioral Brain Research 53: 175–198.
Gray, J. A. (1987). The Psychology of Fear and Stress, vol. 2. New York: Cambridge University Press.
LeDoux, J. E. (1994). Emotion, memory and the brain. Scientific American 270: 32–39.
Maren, S., and M. S. Fanselow. (1996). The amygdala and fear conditioning: Has the nut been cracked? Neuron 16: 237–240.
Ono, T., and H. Nishijo. (1992). Neurophysiological basis of the Klüver-Bucy Syndrome: Responses of monkey amygdaloid neurons to biologically significant objects. In J. P. Aggleton, Ed., The Amygdala: Neurobiological Aspects of Emotion, Memory, and Mental Dysfunction. New York: Wiley-Liss, pp. 167–191.
Rolls, E. T. (1992). Neurophysiology and functions of primate amygdala. In J. P. Aggleton, Ed., The Amygdala: Neurobiological Aspects of Emotion, Memory, and Mental Dysfunction. New York: Wiley-Liss, pp. 143–166.

Emotion and the Human Brain

Popular ideas about the mind evolve over time: emotion came to have its contemporary meaning only in the late nineteenth century (Candland 1977). In current usage, the concept of emotion has two aspects. One pertains to a certain kind of subjective experience, “feeling.” The other relates to expression, the public manifestation of feeling. These dual aspects of emotion, the subjective and the expressive, were represented a century ago in the writings of William JAMES (1884), who speculated on the neural and somatic basis of feeling, and Charles DARWIN (1872), who examined the evolution of emotional expression in various species. Most workers in this area have also pointed out that feelings and the actions that go with them are an essential part of an organism’s relation to its environment. Thus, together with more elaborated cognition, emotions can be said to be the means by which an animal or person appraises the significance of stimuli so as to prepare the body for an appropriate response.

Emotion is traditionally distinguished from cognition, and for most of the twentieth century received little research attention in its own right, excepting possibly studies of the brain mechanisms of aggression. Emotion per se has come to be embraced as a legitimate topic only in the last several decades. Its acceptance was probably due in part to Ekman’s influential cross-cultural studies of human facial expression (Ekman, Sorenson, and Friesen 1969), which implied an innate, biological basis for emotional experience. Social factors have undoubtedly also facilitated the entry of emotion into the arena of neuroscience research, for current popular culture upholds emotion as a significant feature of human life (McCarthy 1989).

An additional factor in the acceptance of emotion as a neurobiological entity was MacLean’s (1952) persuasive account of a brain system specialized for emotion. Building on earlier anatomical theories, MacLean grouped together certain evolutionarily ancient brain structures, primarily regions of medial cortex and interconnected subcortical regions such as the hypothalamus, and called them the “visceral brain.” He suggested that activity in this region was responsible for the subjective aspect of emotional experience. Later, following terminology introduced by the anatomist BROCA, he called these structures the LIMBIC SYSTEM.

In the years following MacLean’s account, researchers have debated exactly which structures can be said to be “limbic.” Most often included are the AMYGDALA, septum, hippocampal formation, orbitofrontal cortex, and cingulate gyrus. However, it is now appreciated that no criteria (be they anatomic, association with visceral function, or association with the behavioral manifestations of emotional experience) bind the regions traditionally called “limbic” unequivocally and uniquely together, leaving the status of this proposal in doubt (LeDoux 1991). Indeed, James had asserted a century ago that there is no special brain system mediating emotional experience. Instead, he held, the bodily changes brought about by a stimulus are themselves experienced in turn through interoceptive pathways that project to sensory cortex; the latter somatic sensations “are” emotional experience. The role of afferent activity from the body in producing states of feeling continues to be emphasized: indeed, the idea that somatic sensations form the critical core of ongoing subjective experience has been repeatedly proposed by philosophers and psychologists. Most neuroscientists accept the idea that the body plays a role, but they also believe that there are particular structures in the human brain that are specialized for emotional experience and behavior.

There are several distinct themes in studies of the neural basis of human emotion. One pertains to the role of neural structures in producing states of feeling. In the 1950s, neurosurgeons demonstrated that subjective emotional experiences, especially fear, could be produced by electrical stimulation in the temporal lobes, particularly in the amygdala and hippocampal formation. The amygdala has come to the fore again in modern imaging studies that suggest that individuals with familial depression have increased metabolic activity in the left amygdala. Depression has been associated with both decreased and increased activity in orbitofrontal cortex. Several decades ago, before the rise of activity-dependent imaging techniques, there was an interest in the relation between mood and hemispheric side of brain lesions, with several researchers concluding that strokes involving the left hemisphere, particularly the frontal regions, produce depression, whereas strokes in the right produce euphoria. Although this interpretation of lesion data has been debated subsequently, stable differences in individual temperament have been attributed to differing patterns of activation of anterior frontal and temporal regions in the two hemispheres.

Links between emotion, memory, and learning have also attracted interest. Normal subjects seem to show a right hemisphere superiority for recall of affective material, and subjects with greater activation of the right amygdala appear to have a greater ability to recall emotional movies. Damasio (1994) has emphasized the role of central representations of relevant somatic states for acquiring appropriate responses to positive and negative situations. In support of his thesis, he has demonstrated that certain patients with orbitofrontal lesions, who seem unable to make appropriate decisions in real-life situations, are also deficient in their autonomic responses to arousing stimuli.

A second major theme in emotion research relates to the production and understanding of expressive behavior. The right hemisphere appears to predominate for the production and the perception of expressions, both facial and vocal. Indeed, the temporal cortex of the right hemisphere may have a region specialized for decoding facial expression. Furthermore, some patients with bilateral damage to the amygdala are deficient in understanding facial expressions, especially expressions of fear. One such patient was also found to have difficulty interpreting emotional and nonemotional intonations of voice. These findings are consistent with a number of lesion studies carried out from the 1930s to the 1960s in nonhuman primates, involving structures such as the amygdala, orbitofrontal cortex, and cortex of the temporal pole. Researchers had concluded, based on the animals’ impaired ability to interpret social signals, that these structures are part of a brain system specialized for social responsiveness in primates (Kling and Steklis 1976). Indeed, case reports have repeatedly shown that humans with lesions in structures such as the hypothalamus, amygdala, cingulate gyrus, and orbitofrontal cortex exhibit altered social behavior and expressiveness. It is at present uncertain whether one should conceptualize the defective performance of patients with amygdala lesions in terms of a primary deficiency of emotional state (e.g., fear) or a primary deficiency of social communication (e.g., ability to interpret expression).

A third theme in emotion research is the neurochemistry of mood. The discovery that the antihypertensive drug reserpine induced depression gave rise to models of depression that invoked catecholamine transmission. Subsequently, the discovery of abnormally low levels of serotonin in the cerebrospinal fluid of suicide victims gave rise to hypotheses invoking serotonin. Both theories are supported by the efficacy of medications that enhance catecholaminergic and serotonergic transmission for the treatment of depression, but empirical confirmation of hypotheses regarding the specific sites and mechanisms of action remains lacking. Other workers have proposed a role for dopamine in disorders of mood. At present, the clear efficacy of antidepressant medications is not matched by an equally clear understanding of their mechanisms. Likewise, roles for GABAergic and serotonergic systems in anxiety have been postulated, based on the clinical effects of agents that interact with these neurotransmitters. Imaging studies show some promise of illuminating the relation between neurotransmitters and mood in the future.

There are some persisting uncertainties in emotion research. For one, workers have long debated the relative contributions of somatic states and cognition to emotional experience. A principled distinction between somatic states that are emotional and those that are not is impossible: as a result, emotion cannot be defined in terms of somatic states alone. Furthermore, there is general agreement that somatic changes cannot be specific enough by themselves to yield the various discriminable emotional experiences. But because somatic elements seem indispensable to emotion, researchers such as Schachter and Singer (1962) have argued that cognitive appraisal of the stimulus must be combined with physiological arousal in order for an emotion to be produced. However, the notion of appraisal itself is complex. Another area of uncertainty concerns which emotions deserve to be called “basic” (Ortony and Turner 1990). Finally, one of the pillars of the emotion concept is the idea of subjective experience: feeling. This raises the thorny problem of QUALIA, a philosophical term for the felt nature of experience (cf. MIND-BODY PROBLEM). Nevertheless, despite (or even because of) these uncertainties, emotion will continue to attract interest as a topic in cognitive science.

See also EMOTIONS; EMOTION AND THE ANIMAL BRAIN; FREUD; INTERSUBJECTIVITY

Leslie Brothers

References

Candland, D. (1977). The persistent problems of emotion. In D. Candland, J. Fell, E. Keen, A. Leshner, R. Tarpy, and R. Plutchik, Eds., Emotion. Monterey, CA: Brooks-Cole, pp. 2–84.
Damasio, A. (1994). Descartes’ Error: Emotion, Reason, and the Human Brain. New York: G. P. Putnam’s Sons.
Darwin, C. (1872). The Expression of the Emotions in Man and Animals.
Ekman, P., E. Sorenson, and W. Friesen. (1969). Pan-cultural elements in facial displays of emotion. Science 186: 86–88.
James, W. (1884). What is an emotion? Mind 9: 188–205.
Kling, A., and H. D. Steklis. (1976). A neural substrate for affiliative behavior in nonhuman primates. Brain, Behavior and Evolution 13: 216–238.
LeDoux, J. (1991). Emotion and the limbic system concept. Concepts Neurosci. 2: 169–199.
MacLean, P. (1952). Some psychiatric implications of physiological studies on frontotemporal portion of limbic system (visceral brain). Electroencephalog. Clin. Neurophysiol. 4: 407–418.
McCarthy, E. D. (1989). Emotions are social things: An essay in the sociology of emotions. In D. Franks and E. D. McCarthy, Eds., The Sociology of Emotions: Original Essays and Research Papers. Greenwich, CT: JAI Press, pp. 51–72.
Ortony, A., and T. Turner. (1990). What’s basic about basic emotions? Psychological Review 97: 315–331.
Schachter, S., and J. Singer. (1962). Cognitive, social, and physiological determinants of emotional state. Psychological Review 69: 379–399.

Further Readings

Adolphs, R., H. Damasio, D. Tranel, and A. Damasio. (1996). Cortical systems for the recognition of emotion in facial expressions. J. Neurosci. 16: 7678–7687.
Asberg, M., P. Thoren, L. Traskman, L. Bertilsson, and V. Ringberger. (1976). “Serotonin Depression”: A biochemical subgroup within the affective disorders? Science 191: 478–480.
Cahill, L., R. Haier, J. Fallon, M. Alkire, C. Tang, D. Keator, J. Wu, and J. McGaugh. (1996). Amygdala activity at encoding correlated with long-term, free recall of emotional information. Proc. Natl. Acad. Sci. USA 93: 8016–8021.
Calder, A., A. Young, D. Rowland, D. Perrett, J. Hodges, and N. Etcoff. (1996). Facial emotion recognition after bilateral amygdala damage: Differentially severe impairment of fear. Cognitive Neuropsychology 13: 699–745.
Davidson, R. (1992). Anterior cerebral asymmetry and the nature of emotion. Brain and Cognition 20: 125–151.
Drevets, W., and M. Raichle. (1995). Positron emission tomographic imaging studies of human emotional disorders. In M. Gazzaniga, Ed., The Cognitive Neurosciences. Cambridge, MA: MIT Press, pp. 1153–1164.
Drevets, W., J. Price, J. Simpson, R. Todd, T. Reich, M. Vannier, and M. Raichle. (1997). Subgenual prefrontal cortex abnormalities in mood disorders. Nature 386: 824–827.
Duffy, E. (1941). An explanation of “emotional” phenomena without the use of the concept “emotion.” J. Gen. Psychology 25: 283–293.
Fried, I., C. Mateer, G. Ojemann, R. Wohns, and P. Fedio. (1982). Organization of visuospatial functions in human cortex: Evidence from electrical stimulation. Brain 105: 349–371.
Gainotti, G. (1972). Emotional behavior and hemispheric side of the lesion. Cortex 8: 41–55.
Gold, P., F. Goodwin, and G. Chrousos. (1988). Clinical and biochemical manifestations of depression. NEJM 319: 348–420.
House, A., M. Dennis, C. Warlow, K. Hawton, and A. Molyneux. (1990). Mood disorders after stroke and their relation to lesion location: A CT scan study. Brain 113: 1113–1128.
Mayberg, H., S. Starkstein, C. Peyser, J. Brandt, R. Dannals, and S. Folstein. (1992). Paralimbic frontal lobe hypometabolism in depression associated with Huntington’s disease. Neurology 42: 1791–1797.
Morris, J., D. Frith, D. Perrett, D. Rowland, A. Young, A. Calder, and R. Dolan. (1996). A differential neural response in the human amygdala to fearful and happy facial expressions. Nature 383: 812–815.
Papez, J. (1937). A proposed mechanism of emotion. Arch. Neurol. Psychiat. 38: 725–743.
Scott, S., A. Young, A. Calder, D. Hellawell, J. Aggleton, and M. Johnson. (1997). Impaired auditory recognition of fear and anger following bilateral amygdala lesions. Nature 385: 254–257.
Suberi, M., and W. McKeever. (1977). Differential right hemispheric memory storage of emotional and non-emotional faces. Neuropsychologia 15: 757–768.
Swerdlow, N., and G. Koob. (1987). Dopamine, schizophrenia, mania, and depression: Toward a unified hypothesis of cortico-striato-pallido-thalamic function. Behav. Brain Sciences 10: 197–245.
Weintraub, S., and M-M. Mesulam. (1983). Developmental learning disabilities of the right hemisphere. Arch. Neurol. 40: 463–468.
Weintraub, S., M-M. Mesulam, and L. Kramer. (1981). Disturbances in prosody: A right-hemisphere contribution to language. Arch. Neurol. 38: 742–744.

Emotions

An emotion is a psychological state or process that functions in the management of goals. It is typically elicited by evaluating an event as relevant to a goal; it is positive when the goal is advanced, negative when the goal is impeded. The core of an emotion is readiness to act in a certain way (Frijda 1986); it is an urgency, or prioritization, of some goals and plans rather than others. Emotions can interrupt ongoing action; they also prioritize certain kinds of social interaction, prompting, for instance, COOPERATION or conflict.

The term emotional is often used synonymously with the term affective. Emotions proper usually have a clear relation to whatever elicited them. They are often associated with brief (lasting a few seconds) expressions of face and voice, and with perturbation of the autonomic nervous system. Such manifestations often go unnoticed by the person who has the emotion. A consciously recognized emotion lasts minutes or hours. A mood has similar bases to an emotion but lasts longer; whereas an emotion tends to change the course of action, a mood tends to resist disruption. At the longer end of the time spectrum, an emotional disorder, usually defined as a protracted mood plus specific symptoms, lasts from weeks to years. Personality traits, most with an emotional basis, last for years or a lifetime. (Definitions, distinctions, and the philosophical and psychological
background of emotions discussed in the next paragraphs, are described in more detail by Oatley and Jenkins 1996.)

Emotions have been analyzed by some of the world’s leading philosophers, including Aristotle, DESCARTES, and Spinoza. Following Aristotle, in whose functionalist account emotions were species of cognitive evaluations of events, most philosophical work on emotions has been cognitive. The Stoics developed subtle analyses of emotions, arguing that most were deleterious, because people had wrong beliefs and inappropriate goals. Stoic influence has continued. Its modern descendant is cognitive therapy for emotional disorders.

Charles DARWIN (1872) argued that emotional expressions are behavioral equivalents of vestigial anatomical organs like the appendix; they derive from earlier phases of EVOLUTION or of individual development, and in adulthood they occur whether or not they are of any use. According to William JAMES (1884), FOLK PSYCHOLOGY wrongly assumes that an event causes an emotion, which in turn causes a reaction. Instead, he argued that an emotion is a perception of the physiological reactions by the body to the event; emotions give color to experience but, as perceptions of physiological changes, they occur after the real business of producing behavior is over. Following James, there has been a long tradition of regarding emotions as bodily states, and although cognitive approaches now dominate the field, body-based research on emotions continues to be influential (see the third and fourth approaches following). FREUD developed theories of emotional disorder, proposing that severe emotional experiences, whether of trauma or conflict, undermine RATIONAL AGENCY subsequently, and interfere with life.

Cultural distrust of emotions was exacerbated by the work of Darwin, James, and Freud. There seemed to be something wrong with emotions; they were either without useful function in adult life or actively dysfunctional. Starting in the 1950s, however, several influential movements began with cognitive emphases, all stressing function, and all making it clear that emotions typically contribute to rationality instead of being primarily irrational. One result of these movements has been to expand concepts of cognition to include emotion. Among the first cognitive approaches to emotions, in the 1950s, was Bowlby’s (see, e.g., 1971). Bowlby proposed the idea of emotional attachment of infant to a mother or other caregiver. He was influenced by theories of evolution and of PSYCHOANALYSIS. His compelling analogy was with the ethological idea of imprinting. With attachment—love—in infancy, a child’s emotional development is based on the child’s building a MENTAL MODEL of its relationship with the caregiver (Bowlby called it a “working model”) to organize the child’s relational goals and plans.

Mental models are also known as schemas. Developmentalists have done much to demonstrate the importance of emotional schemas for structuring close relationships (see SOCIAL COGNITION). Such demonstrations include those of children’s models of interaction with violent parents, probably functional in the family where they first occur, but often maladaptive in the outside world where they play a large role in later aggressive delinquency (Dodge, Bates, and Pettit 1990). A parallel, second approach was that of Arnold (e.g., Arnold and Gasson 1954). She proposed that emotions are relational: they relate selves, including physiological substrates, to events in the world. Events are appraised, consciously or unconsciously, for their suitability to the subject’s goals, for whether desired objects are available or not, and according to several other features of the event and its context. Appraisal researchers have shown that which emotion is produced by any event depends on which appraisals are made (e.g., Frijda 1986). Work on appraisal was extended by Lazarus (1991) to research on coping and its effects on health. A third approach was also begun in the 1950s, by Tomkins (see, e.g., 1995). He proposed that, based on feedback from bodily processes, particularly from expressions of the face, emotions act as amplifiers to specific motivational systems. Personality is structured by schemas, each with a theme of some emotional issue. Tomkins inspired a surge of research (e.g., Scherer and Ekman 1984) that did much to place the study of emotions on an accepted empirical base. Notable has been the study of facial expressions and their relation to emotions, both developmentally and cross-culturally. Some aspects of such expressions are agreed to be HUMAN UNIVERSALS, although how they are best analyzed remains controversial. A fourth approach occurred with attempts to reconcile the work of James with cognitive ideas: notable were Schachter and Singer (1962), who proposed that emotion was a physiological perturbation, as had also been proposed by James, although not with the distinctive patterning that James had suggested; instead an undifferentiated arousal was made recognizable by cognitive labeling (a kind of appraisal). This work has been extended by Mandler (1984) who, like Simon (see next paragraph), stressed that emotions occur when an ongoing activity is interrupted, and an expectancy is violated.

Prompted by difficulties of COGNITIVE MODELING in capturing what is essential about the organization of human action, Simon (1967) argued that because resources are always finite, any computational system operating in any complex environment needs some system to manage PLANNING, capable of interrupting ongoing processes. The system for handling interruptions can be identified with the emotional system of human beings. An extended idea of Simon’s proposal can be put like this: In the ordinary world there are three large problems for orchestrating cognitively based action.

1. Mental models are always incomplete and sometimes incorrect; resources of time and power are always limited.
2. Human beings typically have multiple goals, not all of which can be reconciled.
3. Human beings are those agents who accomplish together what they cannot do alone; hence individual goals and plans are typically parts of distributed cognitive systems.

Although cooperation helps overcome limitations of resources, it exacerbates problems of multiple goals and requires coordination of mental models among distributed agents. These three problems ensure that fully rational solutions to most problems in life are rare. Humans’ biologically based solution is the system of emotions. These provide genetically based heuristics for situations that affect ongoing action and that have recurred during evolution (e.g., threats, losses, frustrations); they outline scripts for coordination with others during cooperation, social threat, interpersonal conflict, etc.; and they serve as bases for constructing new parts of the cognitive system when older parts are found wrong or inadequate.

Much recent research is concerned with effects of emotions and moods. Emotions bias cognitive processing during judgment and inference, giving preferential availability to some heuristics rather than others. For instance, happiness allows unusual associations and improves creative PROBLEM SOLVING (Isen, Daubman, and Nowicki 1987); anxiety constrains ATTENTION to features of the environment concerned with safety or danger; sadness prompts recall from MEMORY of incidents from the past that elicited comparable sadness. Such biases provide bases both for normal functions and for disordered emotional processing (Mathews and MacLeod 1994).

As compared with research on learning or perception, research on emotions has been delayed. With newer cognitive emphases, however, emotions are seen to serve important intracognitive and interpersonal functions. A remarkable convergence is occurring: as well as support from evidence of social and developmental psychology, the largely functionalist account given here is supported by evidence from animal neuroscience (EMOTION AND THE ANIMAL BRAIN) and human neuropsychology (EMOTION AND THE HUMAN BRAIN). There is growing consensus: emotions are managers of mental life, prompting heuristics that relate the flow of daily events to goals and social concerns.

—Keith Oatley

References

Arnold, M. B., and J. Gasson. (1954). Feelings and emotions as dynamic factors in personality integration. In M. B. Arnold and J. Gasson, Eds., The Human Person. New York: Ronald, pp. 294–313.
Bowlby, J. (1971). Attachment and Loss, vol. 1: Attachment. London: Hogarth.
Darwin, C. (1872). The Expression of the Emotions in Man and Animals. London: Murray.
Dodge, K. A., J. Bates, and G. Pettit. (1990). Mechanisms in the cycle of violence. Science 250: 1678–1683.
Frijda, N. H. (1986). The Emotions. Cambridge: Cambridge University Press.
Isen, A. M., K. Daubman, and G. Nowicki. (1987). Positive affect facilitates creative problem solving. Journal of Personality and Social Psychology 52: 1122–1131.
James, W. (1884). What is an emotion? Mind 9: 188–205.
Lazarus, R. S. (1991). Emotion and Adaptation. New York: Oxford University Press.
Mandler, G. (1984). Mind and Body: Psychology of Emotions and Stress. New York: Norton.
Mathews, A., and C. MacLeod. (1994). Cognitive approaches to emotion and emotional disorders. Annual Review of Psychology 45: 25–50.
Oatley, K., and J. M. Jenkins. (1996). Understanding Emotions. Cambridge, MA: Blackwell.
Schachter, S., and J. Singer. (1962). Cognitive, social and physiological determinants of emotional state. Psychological Review 69: 379–399.
Scherer, K., and P. Ekman. (1984). Approaches to Emotion. Hillsdale, NJ: Erlbaum.
Simon, H. A. (1967). Motivational and emotional controls of cognition. Psychological Review 74: 29–39.
Tomkins, S. S. (1995). Exploring Affect: Selected Writings of Sylvan S. Tomkins. E. V. Demos, Ed. New York: Cambridge University Press.

Empiricism

See INTRODUCTION: PHILOSOPHY; BEHAVIORISM; RATIONALISM VS. EMPIRICISM

Epiphenomenalism

The traditional doctrine of epiphenomenalism is that mental phenomena are caused by physical phenomena but do not themselves cause anything. Thus, according to this doctrine, mental states and events are causally inert, causally impotent; they figure in the web of causal relations only as effects, never as causes. James Ward (1903) coined the term epiphenomenalism for this doctrine. However, William JAMES (1890) was the first to use the term epiphenomena to mean phenomena that lack causal efficacy. (It is possible that his use of the term was inspired by the medical use of epiphenomena to mean symptoms of an underlying condition.) Huxley (1874) and Hodgson (1870) earlier discussed the doctrine of epiphenomenalism under the heading of “Conscious Automatism.” They both held that conscious states are caused by physiological states but have no causal effect on physiological states (see Caston 1997).

According to proponents of epiphenomenalism, mental phenomena seem to be causes only because they figure in regularities. For example, instances of a certain type of mental occurrence M (e.g., trying to raise one’s arm) might tend to be followed by instances of a type of physical occurrence P (e.g., one’s arm’s rising). But it would be fallacious to infer from that regularity that instances of M tend to cause instances of P: it would be to commit the fallacy of post hoc, ergo propter hoc. According to the epiphenomenalist, when an M-type occurrence is followed by a P-type occurrence, the occurrences are dual effects of some common physical cause.

Epiphenomenalism is a shocking doctrine. If it is true, then a PAIN could never cause us to wince or flinch, something’s looking red to us could never cause us to think it is red, and a nagging headache could never cause us to be in a bad mood. Indeed, if epiphenomenalism is true, then although one thought may follow another, one thought never results in another. If thinking is a causal process, it follows that we never engage in the activity of thinking.

A central premise in the argument for epiphenomenalism is that for every (caused) event, e, there is a causal chain of physical events leading to e such that each link in the chain determines (or, if strict determinism is false, determines the objective probability of) its successor. Such physical causal chains are said to leave “no gap” to be filled by mental occurrences, and it is thus claimed that mental occurrences are epiphenomena (McLaughlin 1994).

One critical response to this no-gap line of argument for epiphenomenalism is that physical events underlie mental events in such a way that mental events are causally efficacious by means of the causal efficacy of their underlying physical events. The task for proponents of this response is to say what it is, exactly, for a physical event to underlie a mental event, and to explain how mental events can count as causes in virtue of the causal efficacy of their underlying physical events. The underlying relationship is typically spelled out in terms of some relationship between mental and physical event types. An explication of the relationship should yield an account of how causal efficacy can transmit from underlying physical events to mental events. (See the discussion of realization that follows.)

Perhaps the leading response to the no-gap line of argument, however, is that every token mental event is identical with some token physical event or other. According to this token physicalism, an event can be both an instance of a mental type (e.g., belief) and an instance of a distinct physical type (e.g., a neurophysiological type). CAUSATION is an extensional relation between states and events: if two states or events are causally related, they are so related however we may type or describe them. Given that the causal relation is extensional, because particular mental states and events are physical states and events with causal effects, mental states and events are causes, and thus epiphenomenalism is false (Davidson 1970, 1993).

The token-identity response to epiphenomenalism does not, however, escape the issue of how mental and physical types (or properties) are related (McLaughlin 1989, 1993). For it prompts a concern about the relevance of mental properties or types to causal relations. C. D. Broad (1925) characterized the view that mental events are epiphenomena as the view “that mental events either (a) do not function at all as causal-factors; or that (b) if they do, they do so in virtue of their physiological characteristics and not in virtue of their mental characteristics” (1925: 473).

Following Broad, we can distinguish two kinds of epiphenomenalism (McLaughlin 1989):

Token Epiphenomenalism: Physical events cause mental events, but mental events have no causal effects.

Type Epiphenomenalism: Events are causes in virtue of falling under physical types, but no event causes anything in virtue of falling under a mental type.

(Property epiphenomenalism is the thesis that no event can cause anything in virtue of having or being an exemplification of a mental property.) Token epiphenomenalism implies type epiphenomenalism; for if an event can cause something in virtue of falling under a mental type, then an event could be both a mental event and a cause, and thus token epiphenomenalism would be false. However, type epiphenomenalism is compatible with the denial of token epiphenomenalism: mental events may be causes, but only in virtue of falling under physical types, and not in virtue of falling under mental types. Whether type epiphenomenalism is true, and whether certain doctrines about the mind (such as Davidson’s (1970) doctrine of ANOMALOUS MONISM, which implies token physicalism) imply type epiphenomenalism, have been subjects of intense debate in recent years (see the essays in Heil and Mele 1993). But one can find the concern about type or property epiphenomenalism even in ancient philosophical texts. Aristotle appears to have criticized the harmonia theory of the soul—the theory according to which the soul is like the harmonia of a musical instrument, its tuning or mode—on the grounds that it implies property epiphenomenalism (see Caston 1997).

Type epiphenomenalism is itself a stunning doctrine. If type epiphenomenalism is true, then nothing has any causal powers in virtue of (because of) being an instance of a mental type (or having a mental property). Thus, it could never be the case that it is in virtue of being an urge to scratch that a state results in scratching behavior; and it could never be the case that it is in virtue of a state’s being a belief that danger is near that it results in fleeing behavior. If type epiphenomenalism is true, the mental qua mental, so to speak, is causally inert: mental types and properties make no difference to causal transactions between states and events (Sosa 1984; Horgan 1989).

How can mental types be related to physical types so that type epiphenomenalism fails? How can mental types be related to physical types so that an event can be a cause in virtue of falling under a mental type? How must mental types relate to physical types so as not to compete for causal relevance?

The notion of multiple realization is often invoked in response to such questions. It is claimed that mental types are multiply realized by physical types and that sometimes a mental type is causally relevant to a certain effect type, whereas the relevant underlying, realizing physical type is merely a matter of implementational detail (Putnam 1975a; Yablo 1992a, 1992b). This happens whenever instances of the mental type would produce an effect of the sort in question however the mental type is in fact physically realized.

This, of course, raises the issue of what realization is. On one notion of realization, the realization relation is that of determinable to determinate (Yablo 1992a, 1992b). But it is highly controversial whether mental types are determinables of physical types. On a related notion of realization, realization is spelled out in terms of the notion of a causal role and the notion of a role-player. Mental state types are types of functional states: they are second-order states, states of being in a state that plays a certain causal role (Loar 1981). The first-order states that realize them are physical states that play the causal roles in question. The second-order state may be multiply realizable in that there are many different first-order states that play the causal role. It is controversial whether appeal to this notion of realization can warrant the rejection of type epiphenomenalism. For it is arguable that second-order state types are themselves epiphenomena. It is arguably not in virtue of falling under such a second-order state type that a state has causal effects, but rather in virtue of falling under some (relevant) first-order state type (Block 1990).

Another notion of realization treats mental concepts (i.e., concepts of mental states) as equivalent to functional concepts, but treats mental states themselves as first-order states that play the relevant causal role (Armstrong 1968; Lewis 1980). On this view, the concept of mental state M of an organism (or system) is equivalent to the concept of being a state of the organism (or system) that plays a certain causal role R. It is claimed that the states that answer to the functional concepts in question are invariably physical states, but which physical states they are may vary from species to species, or even within a species, perhaps even within a given individual at different times.

Concerns about type epiphenomenalism remain, however. We type intentional mental states not only by their intentional mode (e.g., belief, desire, intention), but also by their content (by what is believed, what is desired, or what is intended). According to externalist theories of content, the content of a mental state can fail to supervene on intrinsic physical states of its occupant (Putnam 1975b; Burge 1979). Two intrinsic physical duplicates could have mental states (e.g., beliefs) with different contents. Thus intentional state types seem to involve contextual, environmental factors. The concern is that the contextual, environmental component of content is causally irrelevant to behavior. This is a problem in that the contents of beliefs and desires figure essentially in belief-desire explanations of behavior. The problem is exacerbated by the fact that on some externalist theories, content depends on historical context (Dretske 1988), and that on some it can depend on social context (Burge 1979).

A concern also remains about whether qualitative mental states (states that have a subjective experiential aspect) are epiphenomena. Our concepts of sensory states—e.g., aches, pains, itches, and the like—are arguably not functional concepts in either sense of functional concepts (Hill 1991). This has led some philosophers to embrace token dualism for such states and to maintain both type and token epiphenomenalism for them (Jackson 1982; Chalmers 1996). In rejection of epiphenomenalism for qualitative states, some philosophers argue that sensory concepts are equivalent to functional concepts (White 1991). And some argue that although sensory concepts are not equivalent to functional concepts or physical concepts, nonetheless, sensory properties are identical with neural properties (Hill 1991).

Whether PHYSICALISM is true for sensory states raises the mind-body problem in perhaps its toughest form. That a nagging headache can cause one to be in a bad mood and that an itch can cause one to scratch seem to be as intuitive cases of mental causation as one can find. But how, and indeed even whether, a qualitative aspect of a mental state (e.g., the achiness of the headache) can be causally relevant remains an issue of intense debate.

See also INDIVIDUALISM; MENTAL CAUSATION; MENTAL REPRESENTATION; NARROW CONTENT

—Brian P. McLaughlin

References

Armstrong, D. M. (1968). A Materialist Theory of Mind. London: Routledge and Kegan Paul.
Block, N. (1990). Can the mind change the world? In G. Boolos, Ed., Meaning and Method: Essays in Honor of Hilary Putnam. Cambridge: Cambridge University Press.
Broad, C. D. (1925). The Mind and Its Place in Nature. London: Routledge and Kegan Paul.
Burge, T. (1979). Individualism and the mental. Midwest Studies in Philosophy 4: 73–121.
Caston, V. (1997). Epiphenomenalisms, ancient and modern. Philosophical Review 106.
Chalmers, D. (1996). The Conscious Mind: In Search of a Theory of Conscious Experience. New York: Oxford University Press.
Davidson, D. (1970). Mental events. In L. Foster and J. W. Swanson, Eds., Experience and Theory. Amherst: University of Massachusetts Press.
Davidson, D. (1993). Thinking causes. In J. Heil and A. Mele, Eds., Mental Causation. Oxford: Oxford University Press.
Dretske, F. (1988). Explaining Behavior: Reasons in a World of Causes. Cambridge, MA: MIT Press/Bradford Books.
Heil, J., and A. Mele, Eds. (1993). Mental Causation. Oxford: Oxford University Press.
Hill, C. (1991). Sensations: A Defense of Type Materialism. Cambridge: Cambridge University Press.
Hodgson, S. H. (1870). The Theory of Practice: An Ethical Enquiry. 2 vols. London: Longmans, Green, Reader, and Dyer.
Horgan, T. (1989). Mental quausation. Philosophical Perspectives 3: 47–76.
Huxley, T. H. (1874/1901). On the hypothesis that animals are automata, and its history. Reprinted in T. H. Huxley, Method and Results. Collected Essays, vol. 1. New York: D. Appleton and Company.
Jackson, F. (1982). Epiphenomenal qualia. Philosophical Quarterly 32: 127–136.
Lepore, E., and B. Loewer. (1987). Mind matters. Journal of Philosophy 84: 630–642.
Lewis, D. (1980). Mad pain and martian pain. In N. Block, Ed., Readings in the Philosophy of Psychology, vol. 1. Cambridge, MA: Harvard University Press.
Loar, B. (1981). Mind and Meaning. Cambridge: Cambridge University Press.
McLaughlin, B. P. (1989). Type dualism, type epiphenomenalism, and the causal priority of the physical. Philosophical Perspectives 3: 109–135.
McLaughlin, B. P. (1993). On Davidson’s response to the charge of epiphenomenalism. In J. Heil and A. Mele, Eds., Mental Causation. Oxford: Oxford University Press, pp. 27–40.
McLaughlin, B. P. (1994). Epiphenomenalism. In S. Guttenplan, Ed., A Companion to the Philosophy of Mind. Oxford: Blackwell, pp. 277–288.
McLaughlin, B. P. (1995). Mental causation. In Encyclopedia of Philosophy, Supplementary Volume. London: Routledge and Kegan Paul.
Putnam, H. (1975a). Philosophy and our mental life. In H. Putnam, Ed., Philosophical Papers, vol. 2. Cambridge: Cambridge University Press.
Putnam, H. (1975b). The meaning of “meaning.” In H. Putnam, Ed., Philosophical Papers, vol. 2. Cambridge: Cambridge University Press.
Sosa, E. (1984). Mind-body interaction and supervenient causation. Midwest Studies in Philosophy 9: 271–281.
Ward, J. (1896–98/1903). The Conscious Automaton Theory. Lecture XII of Naturalism or Agnosticism, vol. 2. London: Adam and Charles Black, pp. 34–64.
White, S. L. (1991). The Unity of the Self. Cambridge, MA: MIT/Bradford Books.
Yablo, S. (1992a). Mental causation. Philosophical Review 101: 245–280.
Yablo, S. (1992b). Cause and essence. Synthese 93: 403–499.

Further Readings

Fodor, J. A. (1987). Psychosemantics. Cambridge, MA: MIT Press.
Jackson, F., and P. Pettit. (1988). Broad contents and functionalism. Mind 47: 381–400.
Kim, J. (1984). Epiphenomenal and supervenient causation. Midwest Studies in Philosophy 9: 257–270.
Kim, J. (1993). Supervenience and Mind. Cambridge: Cambridge University Press.

Episodic vs. Semantic Memory

Episodic memory is a recently evolved, late-developing, past-oriented memory system, probably unique to humans, that allows remembering of previous experiences as experienced. William JAMES (1890) discussed it as simply “memory.” The advent of many different forms of memory since James’s time has made adjectival modifications of the term necessary. Semantic memory is the closest relative of episodic memory in the family of memory systems. It allows humans and nonhuman animals to acquire and use knowledge about their world. Although humans habitually express and exchange their knowledge through language, language is not necessary for either remembering past experiences or knowing facts about the world.

Episodic and semantic memory are alike in many ways, and for a long time were thought of and classified together as an undifferentiated “declarative” memory that was distinguished from “procedural” memory. Nevertheless, rapidly accumulating evidence suggests that episodic and semantic memory are fundamentally different in a number of ways, and therefore need to be treated separately. In what follows, the similarities and differences are briefly summarized.

Episodic and semantic systems share a number of features that collectively define “declarative” (or “cognitive”) memory in humans.

1. Both are large and complex, and have unmeasurable capacity to hold information, unlike WORKING MEMORY, which has limited capacity.
2. Cognitive operations involved in encoding of information are similar for both episodic and semantic memory. Frequently a single short-lived event is sufficient for a permanent “addition” to the memory store, unlike in many other forms of learning that require repeated experiences of a given kind.
3. Both are open to multimodal influences and can receive information for storage through different sensory modalities, as well as from internally generated sources.
4. The operations of both systems are governed by principles such as encoding specificity and transfer-appropriate processing.
5. Stored information in both systems represents aspects of the world, and it has truth value, unlike many other forms of learned behavior that do not.
6. Both are “cognitive” systems: their informational “contents” can be thought about independently of any overt action, although such action can be and frequently is taken. As cognitive systems, episodic and semantic memory differ from all forms of procedural memory in which overt behavior at input and output is obligatory.
7. Information in both systems is flexibly accessible through a variety of retrieval queries and routes.
8. Information retrieved from either system can be expressed and communicated to others symbolically.
9. Information in both systems is accessible to INTROSPECTION: we can consciously “think” about things and events in the world, as we can “think” about what we did yesterday afternoon, or in the summer camp at age ten.
10. The processes of both forms of memory depend critically on the integrity of the medial temporal lobe and diencephalic structures of the brain.

Consider now the differences.

1. The simplest way of contrasting episodic and semantic memory is in terms of their functions: episodic memory is concerned with remembering, whereas semantic memory is concerned with knowing. Episodic remembering takes the form of “mental travel through subjective time,” accompanied by a special kind of awareness (“autonoetic,” or self-knowing, awareness). Semantic knowing takes the form of thinking about what there is, or was, or could be in the world; it is accompanied by another kind of awareness (“noetic,” or knowing awareness). Language is frequently involved in both episodic and semantic memory, but it need not be.
2. The relation between remembering and knowing is one of embeddedness: remembering always implies knowing, whereas knowing does not imply remembering.
3. Episodic memory is arguably a more recent arrival on the evolutionary scene than semantic memory. Many animals other than humans, especially mammals and birds, possess well-developed knowledge-of-the-world (semantic memory) systems. But there is no evidence that they have the ability to autonoetically remember past events in the way that humans do.
4. Episodic memory lags behind semantic memory in human development. Young children acquire a great deal of knowledge about their world before they become capable of adult-like episodic remembering.
5. Episodic memory is the only form of memory that is oriented toward the past: retrieval in episodic memory necessarily involves thinking “back” to an earlier time. All other forms of memory, including semantic memory, are present-oriented: utilization (retrieval) of information usually occurs for the purpose of whatever one is doing now without any thinking “back” to the experiences of the past.
6. Episodic remembering is characterized by a state of awareness (autonoetic) that is different from that in semantic memory (noetic). When one recollects an event autonoetically, one reexperiences aspects of a past experience; when one recalls a fact learned in the past, reexperiencing of the learning episode is not necessary.
7. Episodic remembering has an affectively laden “tone” that is absent in semantic knowing. William James (1890) referred to it as a “feeling of warmth and intimacy.”

Given the many similarities and some fundamental differences between episodic and semantic memory, it is difficult to provide a simple description of the relation between the two. According to one proposal, however, the relation is process-specific: The two systems operate serially at the time of encoding: information “enters” episodic memory “through” semantic memory. They operate in parallel in holding the stored information: a given datum may be stored in one or both systems. And the two systems can act independently at the time of retrieval: recovery of episodic information can occur separately from retrieval of semantic information (Tulving 1995).

The term “episodic memory” is sometimes used in senses that differ from the memory-systems orientation presented here. Some writers use “episodic memory” in its original sense of task orientation: Episodic memory refers

See also AGING, MEMORY, AND THE BRAIN; IMPLICIT VS. EXPLICIT MEMORY; MEMORY, HUMAN NEUROPSYCHOLOGY; WORKING MEMORY, NEURAL BASIS OF

—Endel Tulving

References

Buckner, R. (1996).
Beyond HERA: Contributions of specific pre- to tasks in which information is encoded for storage on a frontal brain areas to long-term memory. Psychonomic Bulletin particular occasion. This kind of usage is popular in work and Review 3: 149–158. Buckner, R., and E. Tulving. (1995). Neuroimaging studies of with animals. Other writers use “episodic memory” as a par- memory: Theory and recent PET results. In F. Boller and J. ticular kind of memory information or “material,” namely Grafman, Eds., Handbook of Neuropsychology, vol. 10. past “events,” in contrast with the “facts” of semantic mem- Amsterdam: Elsevier. pp. 439–466. ory. The systems-based definition of episodic memory as Cabeza, R., and L. Nyberg. (1997). Imaging cognition: An empiri- described here is more comprehensive than either the task- cal review of PET studies with normal subjects. Journal of specific and material-specific definitions. Finally, some Cognitive Neuroscience. writers still prefer the traditional view that there is only one Curran, H. V., J. M. Gardiner, R. Java, and D. Allen. (1993). kind of declarative memory, and they use the terms “epi- Effects of lorazepam upon recollective experience in recogni- sodic” and “semantic” for descriptive purposes only. tion memory. Psychopharmacology 110: 374–378. The evidential basis for the distinction between episodic Düzel, E., A. P. Yonelinas, H-J. Heinze, G. R. Mangun, and E. Tulving. (1997). Event-related brain potential correlates of two and semantic memory has been growing steadily over the states of conscious awareness in memory. Proceedings of past ten or fifteen years. General reviews have been provided National Academy of Sciences USA 94: 5973–5978. by Nyberg and Tulving (1996), Nyberg, McIntosh, and Tulv- Fletcher, P. C., C. D. Frith, and M. D. Rugg. (1997). The functional ing (1997), and Wheeler, Stuss, and Tulving (1997). Func- neuroanatomy of episodic memory. 
Trends in Neurosciences tional dissociations between autonoetic and noetic awareness 20: 213–218. in memory retrieval have been reviewed by Gardiner and Java Gardiner, J. M., and R. Java. (1993). Recognizing and remember- (1993). Developmental evidence for the distinction has been ing. In A. F. Collins, S. E. Gathercole, M. A. Conway, and P. presented by Mitchell (1989), Nelson (1993), Nilsson et al. E. Morris, Eds., Theories of Memory. Hove, England: (1997), and Perner and Ruffman (1995). Pertinent psychop- Erlbaum. harmacological data have been reported by Curran et al. Haxby, J. V., L. G. Ungerleider, B. Horwitz, J. M. Maisog, S. L. Rapoport, and C. L. Grady. (1996). Face encoding and recogni- (1993). Dissociations between episodic and semantic mem- tion in the human brain. Proceedings of the National Academy ory produced by known or suspected brain damage have been of Sciences USA 93: 922–927. reported, among others, by Hayman, Macdonald, and Tulv- Hayman, C. A. G., C. A. Macdonald, and E. Tulving. (1993). The ing (1993); Markowitsch (1995); Shimamura and Squire role of repetition and associative interference in new semantic (1987); and Vargha-Khadem et al. (1997). Electrophysiologi- learning in amnesia. Journal of Cognitive Neuroscience 5: 375– cal correlates of “remembering” versus “knowing” have been 389. described by Düzel et al. (1997), and differences in EEG James, W. (1890). Principles of Psychology. New York: Dover. power spectra in episodic versus semantic retrieval by Klime- Klimesch, W., H. Schimke, and J. Schwaiger. (1994). Episodic and sch, Schimke, and Schwaiger (1994). Finally, evidence in semantic memory: An analysis in the EEG theta and alpha support of the distinction between episodic and semantic band. Electroencephalography and Clinical Neurophysiology 91: 428–441. memory has been provided by a number of recent studies of Markowitsch, H. J. (1995). 
Which brain regions are critically functional neuroimaging, especially POSITRON EMISSION involved in the retrieval of old episodic memory. Brain TOMOGRAPHY (Buckner and Tulving 1995). One of the most Research Reviews 21: 117–127. persistent findings is that episodic retrieval is accompanied Mitchell, D. B. (1989). How many memory systems? Evidence by changes in neuronal activity in brain regions such as the from aging. Journal of Experimental Psychology: Learning, right prefrontal cortex, medial parietal cortical regions, and Memory and Cognition 15: 31–49. the left CEREBELLUM, whereas comparable semantic retrieval Nelson, K. (1993). The psychological and social origins of auto- processes are accompanied by changes in the left frontal and biographical memory. Psychological Science 4: 7–14. temporal regions (Buckner 1996; Cabeza and Nyberg 1997; Nilsson, L. G., L. Bäckman, K. Erngrund, L. Nyberg, R. Adolfs- Fletcher, Frith, and Rugg 1997; Haxby et al. 1996; Nyberg, son, G. Bucht, S. Karlsson, M. Widing, and B. Winblad. (1997). The Betula prospective cohort study: Memory, health, Cabeza, and Tulving 1996, Nyberg, McIntosh, and Tulving and aging. Aging and Cognition 1: 1–36. 1997; Shallice et al. 1994; Tulving et al. 1994). Future studies Nyberg, L., and E. Tulving. (1996). Classifying human long-term will undoubtedly further clarify the emerging picture of the memory: Evidence from converging dissociations. European functional neuroanatomy of episodic and semantic memory. Journal of Cognitive Psychology 8: 163–183. It is a moot question whether episodic and semantic Nyberg, L., R. Cabeza, and E. Tulving. (1996). PET studies of memory are basically similar or basically different. The encoding and retrieval: The HERA model. Psychonomic Bulle- question is not unlike one about basic similarities and differ- tin and Review 3: 135–148. ences between, say, vertebrates and invertebrates. As fre- Nyberg, L., A. R. McIntosh, and E. Tulving. (1997). 
Functional quently happens in science, it all depends on one’s interest brain imaging of episodic and semantic memory. Journal of and purpose. Molecular Medicine 76: 48–53. 280 Epistemology and Cognition Perner, J., and T. Ruffman. (1995). Episodic memory and autono- National Institute of Mental Health et al: The development and etic consciousness: Developmental evidence and a theory of neural bases of higher cognitive functions. (1989, Philadelphia, childhood amnesia. Journal of Experimental Child Psychology Pennsylvania). Annals of the New York Academy of Sciences 59: 516–548. 608: 572–595. Shallice, T., P. Fletcher, C. D. Frith, P. Grasby, R. S. J. Fracowiak, and Schacter, D. L., and E. Tulving. (1994). What are the memory sys- R. J. Dolan. (1994). Brain regions associated with acquisition and tems of 1994? In D. L. Schacter and E. Tulving, Eds., Memory retrieval of verbal episodic memory. Nature 368: 633–635. Systems 1994. Cambridge, MA: MIT Press, pp. 1–38. Shimamura, A. P., and L. R. Squire. (1987). A neuropsychological Squire, L. R. (1993). Memory and the hippocampus: A synthesis study of fact memory and source amnesia. Journal of Experi- from findings with rats, monkeys, and humans. Psychological mental Psychology: Learning, Memory, and Cognition 13: Review 99: 195–231. 464–473. Tulving, E. (1983). Elements of Episodic Memory. Oxford: Claren- Tulving, E., S. Kapur, F. I. M. Craik, M. Moscovitch, and S. don Press. Houle. (1994). Hemispheric encoding/retrieval asymmetry in Tulving, E. (1985). How many memory systems are there? Ameri- episodic memory: Positron emission tomography findings. can Psychologist 40: 385–398. Proceedings of the National Academy of Sciences USA 91: Tulving, E. (1985). Memory and consciousness. Canadian Psy- 2016–2020. chology 26: 1–12. Tulving, E. (1995). Organization of memory: Quo vadis? In M. S. Tulving, E. (1991). Concepts of human memory. In L. Squire, G. Gazzaniga, Ed., The Cognitive Neurosciences. Cambridge, Lynch, N. M. 
Weinberger, and J. L. McGaugh, Eds., Memory: MA: MIT Press, pp. 839–847. Organization and Locus of Change. New York: Oxford Univer- Vargha-Khadem, F., D. G. Gadian, K. E. Watkins, A. Connelly, W. sity Press, pp. 3–32. Van Paesschen, and M. Mishkin. (1997). Differential effects of Tulving, E. (1993). What is episodic memory? Current Perspec- early hippocampal pathology on episodic and semantic mem- tives in Psychological Science 2: 67–70. ory. Science 277: 376–380. Tulving, E. (1998). Brain/mind correlates of memory. In M. Wheeler, M. A., D. T. Stuss, and E. Tulving. (1997). Toward a the- Sabourin, M. Robert, and F. I. M. Craik, Eds., Advances in Psy- ory of episodic memory: The frontal lobes and autonoetic con- chological Science, vol. 2: Biological and cognitive aspects. sciousness. Psychological Bulletin 121: 331–354. Hove, England: Psychology Press. Further Readings Epistemology and Cognition Dalla Barba, G., M. C. Mantovan, E. Ferruzza, and G. Denes. (1997). Remembering and knowing the past: A case study of Epistemology and cognition is the confluence of the philos- isolated retrograde amnesia. Cortex 33: 143–154. ophy of knowledge and the science of cognition. Epistemol- Horner, M. D. (1990). Psychobiological evidence for the distinc- tion between episodic and semantic memory. Neuropsychology ogy is concerned with the prospects for human knowledge, Review 1: 281–321. and because these prospects depend on the powers and frail- Humphreys, M. S., J. D. Bain, and R. Pike. (1989). Different ways ties of our cognitive equipment, epistemology must work to cue a coherent memory system: A theory for episodic, hand in hand with cognitive science. Epistemology centers semantic, and procedural tasks. Psychological Review 96: 208– on normative or evaluative questions about cognition: what 233. are the good or right ways to think, reason, and form Humphreys, M. S., J. Wiles, and S. Dennis. (1994). Toward a the- beliefs? 
But normative assessments of cognitive systems ory of human memory: Data structures and access processes. and activities must rest on their descriptive properties, and Behavioral and Brain Sciences 17: 655–692. characterizing those descriptive properties is a task for cog- Huron, C., J. M. Danion, F. Giacomoni, D. Grange, P. Robert, and nitive science. Historically rationalism and empiricism L. Rizzo. (1995). Impairment of recognition memory with, but not without, conscious recollection in schizophrenia. American exemplified this approach by assigning different values to Journal of Psychiatry 152: 1737–1742. reason and the senses based on different descriptions of Kihlstrom, J. F. (1984). A fact is a fact is a fact. Behavioral and their capacities. Brain Sciences 7: 243–244. Epistemic evaluation can take many forms. First, it can Kitchner, E.G., J. R. Hodges, and R. McCarthy. (1998). Acquisi- evaluate entire cognitive systems, selected cognitive sub- tion of post-morbid vocabulary and semantic facts in the systems, or particular cognitive performances. Second, there absence of episodic memory. Brain 121: 1313–1327. are several possible criteria of epistemic assessment. (1) A Mandler, G. (1987). Memory: Conscious and unconscious. In P. R. system or process might be judged by its accuracy or verid- Solomon, G. R. Goethals, C. M. Kelley, and B. R. Stephens, icality, including its reliability—the proportion of true judg- Eds., Memory—An Interdisciplinary Approach. New York: ments it generates—and its power—the breadth of tasks or Springer, p. 42. McKoon, G., R. Ratcliff, and G. S. Dell. (1985). A critical evalua- situations in which it issues accurate judgments (Goldman tion of the semantic-episodic distinction. Journal of Experi- 1986). (2) It can be judged by its conformity or nonconfor- mental Psychology: Learning, Memory and Cognition 12: mity with normatively approved formal standards, such as 295–306. deductive validity or probabilistic coherence. 
(3) It might be Roediger, H. L., M. S. Weldon, and B. H. Challis. (1989). Explain- evaluated by its adaptiveness, or conduciveness for achiev- ing dissociations between implicit and explicit measures of ing desire-satisfaction or goal-attainment (Stich 1990). Nor- retention: A processing account. In H. L. Roediger and F. I. M. mative assessments of these kinds are not only of theoretical Craik, Eds., Varieties of Memory and Consciousness: Essays in interest, but also admit of several types of application. You Honour of Endel Tulving. Hillsdale, N.J: Erlbaum, pp. 3–41. might judge another person’s belief to be untrustworthy if Roediger, H. L., S. Rajaram, and K. Srinivas. (1990). Specifying the process productive of that belief is unreliable, or you criteria for postulating memory systems. Conference of the Epistemology and Cognition 281 might deploy a powerful intellectual strategy to improve rational under the social exchange approach if the criterion your own cognitive attainments. of adaptiveness is applied. To illustrate the reliability and power criteria, as well as The dominant approach to probabilistic reasoning is the the two types of application just mentioned, consider MEM- judgments and heuristics (or “heuristics and biases”) ORY and its associated processes (Schacter 1996). Studies approach (Tversky and Kahneman 1974; see JUDGMENT partly prompted by “recovered” memory claims show that HEURISTICS). Its proponents deny that people reason by memory is strongly susceptible to postevent distortions, means of normatively appropriate rules. Instead, people both in adults and especially in children. Suggestive ques- allegedly use shortcuts such as the “representativeness heu- tioning of preschool children can have devastating effects ristic,” which can yield violations of normative rules such as on the reliability of their memories (Ceci 1995). 
Knowing nonutilization of base rates and commission of the conjunc- that someone underwent suggestive questioning should tion fallacy (Tversky and Kahneman 1983). If someone give third parties grounds for distrusting related memories. resembles the prototype of a feminist bank teller more than Another application is the use of encoding strategies to the prototype of a bank teller, subjects tend to rate the prob- boost memory power. A runner hit on the technique of cod- ability of her being both a feminist and a bank teller higher ing a series of digits in terms of running times. After than the probability of her being a bank teller. According to months of practice with this coding strategy, he could recall the probability calculus, however, a conjunction cannot have over eighty digits in correct order after being exposed to a higher probability than one of its conjuncts. them only once (Chase and Ericsson 1981). Analogous Recent literature challenges the descriptive claims of the encoding strategies can help anyone increase his memory heuristics approach as well as its normative conclusions. power. A third memory example illustrates tradeoffs Gigerenzer (1991) finds that many so-called cognitive between different epistemic standards. Reliable recollec- biases or illusions “disappear” when tasks are presented in tion often depends on “source memory,” the recall of how frequentist terms. People do understand probabilities, but you encountered an object or event. Witnesses have errone- only in connection with relative frequencies, not single ously identified alleged criminals because they had seen cases. Koehler (1996) surveys the literature on the base-rate them outside the context of the crime, for example, on tele- fallacy and disputes on empirical grounds the conventional vision. A sense of familiarity was retained, but the source wisdom that base rates are routinely ignored. He adds that of this familiarity was forgotten (Thomson 1988). 
People because base rates are not generally equivalent to prior often fail to keep track of the origins of their experience or probabilities, a Bayesian normative standard does not man- beliefs. Is this epistemically culpable? The adaptiveness date such heavy emphasis on base rates. It is not easy to criterion suggests otherwise: forgetting sources is an eco- decide, then, whether people have a general competence at nomical response to the enormous demands on memory probabilistic reasoning, or the circumstances in which such (Harman 1986). a competence will be manifested. A related subject is the The formal-standards criterion of epistemic normativity forms of teaching that can successfully train people in nor- is applied to both DEDUCTIVE REASONING and PROBABILIS- matively proper reasoning (Nisbett 1993). TIC REASONING. It is unclear exactly which formal standards Epistemologists are traditionally interested in deciding are suitable criteria for epistemic rationality. Must a cogni- which beliefs or classes of belief meet the standard for tive system possess all sound rules of a natural deduction knowledge, where knowledge includes at least true justified system to qualify as deductively rational? Or would many belief. The crucial normative notion here is JUSTIFICATION such rules suffice? Must these rules be natively endowed, or (rather than rationality). Can cognitive science help address would it suffice that the system acquires them under appro- this question? One affirmative answer is supported by a reli- priate experience? Whether human deductive capacities able process theory of justification (Goldman 1986). If a qualify as rational depends on which normative criterion is justified belief is (roughly) a belief produced by reliable chosen, as well as on the descriptive facts concerning these cognitive processes, that is, processes that usually output capacities, about which there is ongoing controversy. 
truths, then cognitive science can assist epistemology by One psychological approach says that people’s deductive determining which mental processes are reliable. Reliabi- competence does come in the form of abstract rules akin to lism is not the only theory of justification, however, that natural deduction rules (Rips 1994). On this view, people’s promotes a tight link between epistemology and cognitive native endowments might well qualify as rational, at least science. Any theory can do so that emphasizes the cognitive under the weaker criterion (“many rules”) mentioned above. sources or context of belief. Other approaches deny that people begin with purely Assuming that humans do have extensive knowledge, abstract deductive principles, a conclusion supported by both epistemologists and cognitive scientists ask how such content effects discovered in connection with Wason’s knowledge is possible. During much of the twentieth cen- selection task. Cheng and Holyoak (1985) suggest that gen- tury, philosophers and psychologists assumed that general- eralized rules are induced from experience because of their purpose, domain-neutral learning mechanisms, such as usefulness. They explain (modest) deductive competence by deductive and inductive reasoning, were responsible. The reference to inductive capacities for acquiring rules, thereby most influential approach in current cognitive science, how- allowing for rationality under one of the foregoing propos- ever, is DOMAIN SPECIFICITY (Hirschfeld and Gelman 1994). als. Cosmides (1989) contends that evolution provided us On this view, the mind is less an all-purpose problem solver with specific contentful rules, ones useful in the context of than a collection of independent subsystems designed to social exchange. Reasoning capacities might qualify as perform circumscribed tasks. Whether in language, vision, 282 Essentialism or other domains, special-purpose mod- Cherniak, C. (1986). Minimal Rationality. 
Cambridge, MA: MIT FOLK PSYCHOLOGY, Press. ules have been postulated. As the philosopher Goodman Churchland, P. (1989). A Neurocomputational Perspective. Cam- (1955) emphasized, wholly unconstrained INDUCTION leads bridge, MA: MIT Press. to indeterminacy or antinomy. To resolve this indetermi- Dretske, F. (1981). Knowledge and the Flow of Information. Cam- nacy, cognitive scientists have sought to identify constraints bridge, MA: MIT Press. on learning or representation in each of multiple domains Fodor, J. (1983). Modularity of Mind. Cambridge, MA: MIT Press. (Keil 1981; Gelman 1990). Gilovich, T. (1991). How We Know What Isn’t So. New York: Free See also NATIVISM; NATIVISM, HISTORY OF; PROPOSI- Press. TIONAL ATTITUDES; RATIONALISM VS. EMPIRICISM; TVER- Goldman, A. (1993). Philosophical Applications of Cognitive Sci- ence. Boulder, CO: Westview Press. SKY, AMOS Gopnik, A., and A. Meltzoff. (1997). Words, Thoughts, and Theo- —Alvin Goldman ries. Cambridge, MA: MIT Press. Johnson-Laird, P., and R. Byrne. (1991). Deduction. Hillsdale, NJ: References Erlbaum. Karmiloff-Smith, A. (1992). Beyond Modularity. Cambridge, MA: Ceci, S. (1995). False beliefs: Some developmental and clinical MIT Press. considerations. In D. Schacter, J. Coyle, G. Fischbach, M. Kornblith, H., Ed. (1993). Naturalizing Epistemology. 2nd ed. Mesulam, and L. Sullivan, Eds., Memory Distortion. Cam- Cambridge, MA: MIT Press. bridge, MA: Harvard University Press, pp. 91–128 Nisbett, R., and L. Ross. (1980). Human Inference. Englewood Chase, W., and K. Ericsson. (1981). Skilled memory. In J. Ander- Cliffs, NJ: Prentice-Hall. son, Ed., Cognitive Skills and Their Acquisition. Hillsdale, NJ: Pollock, J. (1995). Cognitive Carpentry. Cambridge, MA: MIT Erlbaum. Press. Cheng, P., and K. Holyoak. (1985). Pragmatic reasoning schemas. Spelke, E., K. Breinlinger, J. Macomber, and K. Jacobson. (1992). Cognitive Psychology 17: 391–416. Origins of knowledge. Psychological Review 99: 605–632. Cosmides, L. (1989). 
The logic of social exchange: Has natural Sperber, D., D. Premack, and A. Premack, Eds. (1995). Causal selection shaped how humans reason? Cognition 31: 187–276. Cognition. New York: Oxford University Press. Gelman, R. (1990). First principles organize attention to and learn- Stein, E. (1996). Without Good Reason. Oxford: Oxford University ing about relevant data: Number and the animate-inanimate dis- Press. tinction as examples. Cognitive Science 14: 79–106. Thagard, P. (1992). Conceptual Revolutions. Princeton, NJ: Prince- Gigerenzer, G. (1991). How to make cognitive illusions disappear: ton University Press. Beyond “heuristics and biases.” European Review of Social Psychology 2: 83–115. Essentialism Goldman, A. (1986). Epistemology and Cognition. Cambridge, MA: Harvard University Press. Goodman, N. (1955). Fact, Fiction, and Forecast. Cambridge, MA: Psychological essentialism is any folk theory of concepts Harvard University Press. positing that members of a category have a property or Harman, G. (1986). Change in View. Cambridge, MA: MIT Press. attribute (essence) that determines their identity. Psycholog- Hirschfeld, L., and S. Gelman, Eds. (1994). Mapping the Mind: ical essentialism is similar to varieties of philosophical Domain Specificity in Cognition and Culture. Cambridge: essentialism, with roots extending back to ancient Greek Cambridge University Press. philosophers such as Plato and Aristotle. One important dif- Keil, F. (1981). Constraints on knowledge and cognitive develop- ference, however, is that psychological essentialism is a ment. Psychological Review 88: 197–227. Koehler, J. (1996). The base rate fallacy reconsidered: Descriptive, claim about human reasoning, and not a metaphysical claim normative, and methodological challenges. Behavioral and about the structure of the real world. Psychological essen- Brain Sciences 19: 1–17. tialism may be divided into three types: sortal, causal, and Nisbett, R., Ed. (1993). Rules for Reasoning. 
Hillsdale, NJ: ideal (see Gelman and Hirschfeld in press). Erlbaum. The sortal essence is the set of defining characteristics Rips, L. (1994). The Psychology of Proof. Cambridge, MA: MIT that all and only members of a category share. This notion Press. of essence is captured in Aristotle’s (1924) distinction Schacter, D. (1996). Searching for Memory. New York: Basic Books. between essential and accidental properties (see also Keil’s Stich, S. (1990). The Fragmentation of Reason. Cambridge, MA: 1989 defining versus characteristic properties): the essential MIT Press. properties constitute the essence. For example, on this view Thomson, D. (1988). Context and false recognition. In G. Davies and D. Thomson, Eds., Memory in Context: Context in Mem- the essence of a grandmother would be the property of being ory, Chichester, England: Wiley, pp. 285–304. the mother of a person’s parent (rather than the accidental or Tversky, A., and D. Kahneman. (1974). Judgment under uncer- characteristic properties of wearing glasses and having gray tainty: Heuristics and biases. Science 185: 1124–1131. hair). In effect, this characterization is a restatement of the Tversky, A., and D. Kahneman. (1983). Extensional vs. intuitive classical view of concepts: meaning (or identity) is supplied reasoning: The conjunction fallacy in probability judgment. by a set of necessary and sufficient features that determine Psychological Review 91: 293–315. whether an entity does or does not belong in a category (Smith and Medin 1981). Specific essentialist accounts then Further Readings provide arguments concerning which sorts of features are essential. The viability of this account has been called into Carey, S. (1985). Conceptual Change in Childhood. Cambridge, question by more recent models of concepts that stress the MA: MIT Press. Essentialism 283 suggests that human concepts are not constructed atomisti- importance of probabilistic features, exemplars, and theo- cally from perceptual features. 
ries in concepts.

In contrast, the causal essence is the substance, power, quality, process, relationship, or entity that causes other category-typical properties to emerge and be sustained, and that confers identity. Locke (1894/1959: Book III, p. 26) describes it as “the very being of anything, whereby it is what it is. And thus the real internal, but generally . . . unknown constitution of things, whereon their discoverable qualities depend, may be called their essence.” The causal essence is used to explain the observable properties of category members. Whereas the sortal essence could apply to any entity, the causal essence applies only to entities for which inherent, hidden properties determine observable qualities. For example, the causal essence of water may be something like H2O, which is responsible for various observable properties that water has. Thus, the cluster of properties “odorless, tasteless, and colorless” is not a causal essence of water, despite being true of all members of the category Water, because the properties have no direct causal force on other phenomenal properties of that kind.

Causal essentialism requires no specialized knowledge, and, in contrast to sortal essentialism, people may possess an “essence placeholder” without knowing what the essence is (Medin 1989; Medin and Ortony 1989). For example, a child might believe that girls have some inner, nonobvious quality that distinguishes them from boys and that is responsible for the many observable differences in appearance and behavior between boys and girls, before ever learning about chromosomes or human physiology.

The ideal essence is assumed to have no actual instantiation in the world. For example, on this view the essence of “goodness” is some pure, abstract quality that is imperfectly realized in real-world instances of people performing good deeds. None of these good deeds perfectly embodies “the good,” but each reflects some aspect of it. Plato’s cave allegory (The Republic), in which what we see of the world are mere shadows of what is real and true, exemplifies this view. The ideal essence thus contrasts with both the sortal and the causal essences. There are relatively little empirical data available on ideal essences in human reasoning (but see Barsalou 1985).

Most accounts of psychological essentialism focus on causal essences. Causal essentialism has important implications for category-based inductive inferences, judgments of constancy over time, and stereotyping. By two to three years of age, children expect category members to share nonobvious similarities, even in the face of salient perceptual dissimilarities. For example, on learning that an atypical exemplar is a member of a category (e.g., that a penguin is a bird), children and adults draw novel inferences from typical instances to the atypical member (Gelman and Markman 1986). By four years of age children judge nonvisible internal parts to be especially crucial to the identity and functioning of an item. Children also treat category membership as stable and unchanging over transformations such as costumes, growth, metamorphosis, or changing environmental conditions (Keil 1989; Gelman, Coley, and Gottfried 1994). The finding that young children hold essentialist beliefs thus

Causal essentialism is closely related to the notion of “kinds” or NATURAL KINDS (Schwartz 1977). Whereas a category is any grouping together of two or more discriminably different things, a kind is a category that is believed to be based in nature, discovered rather than invented, and capturing indefinitely many similarities. “Tigers” is a kind; the set of “striped things” (including tigers, striped shirts, and barbershop poles) is not, because it captures only a single, superficial property (stripedness); it does not capture nonobvious similarities, nor does it serve as a basis of induction (e.g., Mill 1843; Markman 1989). Similarly, the ad hoc category of “things to take on a camping trip” does not form a kind (Barsalou 1991). Whereas kinds are treated as having essences, other categories are not. It is not yet known which categories are construed as “kinds” over development. The majority of evidence for causal essentialism obtains from animal categories. However, similar beliefs seem to characterize how people construe social categories such as race, gender, and personality. These racial, gender, or personality “essences” may be analogical extensions from a folk biological notion, or an outgrowth of a more general “essentialist construal” (see Atran 1990; Carey 1995; Keil 1994; Pinker 1994 for discussion).

Essentialism is pervasive across history, and initial evidence suggests that it may be pervasive across cultures (Atran 1990). Whether biological taxa truly possess essences is a matter of much debate (Sober 1994; Mayr 1982; Kornblith 1993; Dupré 1993), although essentialism is largely believed to be incompatible with current biological knowledge. Some scholars have proposed that essentialist views of the species as fixed and unchanging may present obstacles for accurately learning scientific theories of EVOLUTION (Mayr 1982).

See also COGNITIVE DEVELOPMENT; CONCEPTS; CONCEPTUAL CHANGE; FOLK BIOLOGY; NAIVE SOCIOLOGY; STEREOTYPING

—Susan A. Gelman

References

Aristotle. (1924). Metaphysics. Oxford: Clarendon Press.
Atran, S. (1990). Cognitive Foundations of Natural History. New York: Cambridge University Press.
Barsalou, L. W. (1985). Ideals, central tendency, and frequency of instantiation as determinants of graded structure in categories. Journal of Experimental Psychology: Learning, Memory, and Cognition 11: 629–654.
Barsalou, L. W. (1991). Deriving categories to achieve goals. In G. H. Bower, Ed., The Psychology of Learning and Motivation. New York: Academic Press, pp. 1–64.
Carey, S. (1995). On the origins of causal understanding. In D. Sperber, D. Premack, and A. J. Premack, Eds., Causal Cognition: A Multi-Disciplinary Approach. Oxford: Clarendon Press, pp. 268–308.
Dupré, J. (1993). The Disorder of Things: Metaphysical Foundations of the Disunity of Science. Cambridge, MA: Harvard University Press.
Gelman, S. A., J. D. Coley, and G. M. Gottfried. (1994). Essentialist beliefs in children: The acquisition of concepts and theories. In L. A. Hirschfeld and S. A. Gelman, Eds., Mapping the Mind: Domain Specificity in Cognition and Culture. New York: Cambridge University Press, pp. 341–366.
Gelman, S. A., and L. A. Hirschfeld. How biological is essentialism? Forthcoming in D. Medin and S. Atran, Eds., Folkbiology. Cambridge, MA: MIT Press.
Gelman, S. A., and E. M. Markman. (1986). Categories and induction in young children. Cognition 23: 183–209.
Keil, F. (1989). Concepts, Kinds, and Cognitive Development. Cambridge, MA: Bradford Books/MIT Press.
Keil, F. (1994). The birth and nurturance of concepts by domains: The origins of concepts of living things. In L. A. Hirschfeld and S. A. Gelman, Eds., Mapping the Mind: Domain Specificity in Cognition and Culture. New York: Cambridge University Press.
Kornblith, H. (1993). Inductive Inference and its Natural Ground. Cambridge, MA: MIT Press.
Locke, J. (1894/1959). An Essay Concerning Human Understanding, vol. 2. New York: Dover.
Markman, E. M. (1989). Categorization and Naming in Children: Problems in Induction. Cambridge, MA: Bradford Books/MIT Press.
Mayr, E. (1982). The Growth of Biological Thought. Cambridge, MA: Harvard University Press.
Medin, D. (1989). Concepts and conceptual structure. American Psychologist 44: 1469–1481.
Medin, D. L., and A. Ortony. (1989). Psychological essentialism. In S. Vosniadou and A. Ortony, Eds., Similarity and Analogical Reasoning. Cambridge: Cambridge University Press, pp. 179–195.
Mill, J. S. (1843). A System of Logic, Ratiocinative and Inductive. London: Longmans.
Pinker, S. (1994). The Language Instinct. New York: W. Morrow.
Schwartz, S. P., Ed. (1977). Naming, Necessity, and Natural Kinds. Ithaca, NY: Cornell University Press.
Smith, E. E., and D. L. Medin. (1981). Categories and Concepts. Cambridge, MA: Harvard University Press.
Sober, E. (1994). From a Biological Point of View. New York: Cambridge University Press.

Further Readings

Atran, S. (1996). Modes of thinking about living kinds: science, symbolism, and common sense. In D. Olson and N. Torrance, Eds., Modes of Thought: Explorations in Culture and Cognition. New York: Cambridge University Press.
Braisby, N., B. Franks, and J. Hampton. (1996). Essentialism, word use, and concepts. Cognition 59: 247–274.
Fuss, D. (1989). Essentially Speaking: Feminism, Nature, and Difference. New York: Routledge.
Gelman, S. A., and J. D. Coley. (1991). Language and categorization: The acquisition of natural kind terms. In S. A. Gelman and J. P. Byrnes, Eds., Perspectives on Language and Thought: Interrelations in Development. Cambridge: Cambridge University Press, pp. 146–196.
Gelman, S. A., and D. L. Medin. (1993). What’s so essential about essentialism? A different perspective on the interaction of perception, language, and conceptual knowledge. Cognitive Development 8: 157–167.
Gelman, S. A., and H. M. Wellman. (1991). Insides and essences: Early understandings of the nonobvious. Cognition 38: 213–244.
Gopnik, A., and H. M. Wellman. (1994). The theory theory. In L. A. Hirschfeld and S. A. Gelman, Eds., Mapping the Mind: Domain Specificity in Cognition and Culture. New York: Cambridge University Press.
Hirschfeld, L. (1995). Do children have a theory of race? Cognition 54: 209–252.
Hirschfeld, L. (1996). Race in the Making: Cognition, Culture, and the Child’s Construction of Human Kinds. Cambridge, MA: MIT Press.
Jones, S., and L. B. Smith. (1993). The place of perception in children’s concepts. Cognitive Development 8: 113–139.
Kalish, C. (1995). Essentialism and graded membership in animal and artifact categories. Memory and Cognition 23: 335–353.
Keil, F. (1995). The growth of causal understandings of natural kinds. In D. Sperber, D. Premack, and A. Premack, Eds., Causal Cognition: A Multidisciplinary Debate. Oxford: Oxford University Press.
Kripke, S. (1972). Naming and necessity. In D. Davidson and G. Harman, Eds., Semantics of Natural Language. Dordrecht: D. Reidel.
McNamara, T. P., and R. J. Sternberg. (1983). Mental models of word meaning. Journal of Verbal Learning and Verbal Behavior 22: 449–474.
Malt, B. (1994). Water is not H2O. Cognitive Psychology 27: 41–70.
Putnam, H. (1975). The meaning of “meaning.” In H. Putnam, Ed., Mind, Language and Reality: Philosophical Papers, vol. 2. New York: Cambridge University Press.
Rips, L. J., and A. Collins. (1993). Categories and resemblance. Journal of Experimental Psychology: General 122: 468–486.
Rorty, R. (1979). Philosophy and the Mirror of Nature. Princeton: Princeton University Press.
Rothbart, M., and M. Taylor. (1990). Category labels and social reality: Do we view social categories as natural kinds? In G. Semin and K. Fiedler, Eds., Language and Social Cognition. London: Sage.
Solomon, G. E. A., S. C. Johnson, D. Zaitchik, and S. Carey. (1996). Like father, like son: Young children’s understanding of how and why offspring resemble their parents. Child Development 67: 151–171.
Springer, K. (1996). Young children’s understanding of a biological basis for parent-offspring relations. Child Development 67: 2841–2856.
Taylor, M. (1996). The development of children’s beliefs about social and biological aspects of gender differences. Child Development 67: 1555–1571.
Wierzbicka, A. (1994). The universality of taxonomic categorization and the indispensability of the concept “kind.” Rivista di Linguistica 6: 347–364.

Ethics

See CULTURAL RELATIVISM; ETHICS AND EVOLUTION; MORAL PSYCHOLOGY

Ethics and Evolution

When Charles DARWIN wrote The Origin of Species, he withheld discussion of the origins of human morality and cognition. Despite Darwin’s restraint, some of the strongest reactions to the theory of natural selection had to do with its connection to ethical matters. The intersection between evolution and ethics continues to be a site of controversy. Some claim that human ethical judgments are to be explained by their adaptive value. Others claim that human ethical systems are the result of cultural evolution, not biological evolution. In the context of cognitive science, the central issue is whether humans have ethics-specific beliefs or cognitive mechanisms that are the result of biological evolution.

There is increasing evidence that the human brain comes prewired for a wide range of specialized capacities (see NATIVISM and DOMAIN SPECIFICITY). With regard to ethics, the central questions are to what extent the human brain is prewired for ethical thinking and, insofar as it is, what the implications of this are.

There is one sense in which humans are prewired for ethics: humans have the capacity for ethical reasoning and reflection while amoebas do not. This human capacity is biologically based and results from EVOLUTION. Ethical nativism is the view that there are specific, prewired mechanisms for ethical thought. Adherents of SOCIOBIOLOGY, the view that evolutionary theory can explain all human social behavior, are among those who embrace ethical nativism. E. O. Wilson, in Sociobiology: The New Synthesis (1975), goes so far as to say that ethics can be “biologized.” Sociobiologists claim that humans have specific ethical beliefs and an associated ethical framework that are innate and are the result of natural selection. They support this view with evidence that humans in all cultures share certain ethical beliefs and certain underlying ethical principles (see HUMAN UNIVERSALS), evidence of ethical or “pre-ethical” behavior among other mammals, especially primates (see de Waal 1996), and with evolutionary accounts of the selective advantage of having innate ethical mental mechanisms. Most notably, they talk about the selective advantage (to the individual or to the species) of ALTRUISM.

Consider a particular moral belief or feeling for which an evolutionary explanation has been offered, namely the belief that it is wrong to have (consensual) sex with one’s sibling. Some sociobiologists have argued that this belief (more precisely, the feeling that there is something wrong about having sex with a person one was raised with) is innate and that we have this belief because of its selective advantage. When close blood relatives reproduce, there is a relatively high chance that traits carried on recessive genes (most notably, serious diseases like sickle-cell anemia and hemophilia) will be exhibited in the resulting offspring. Such offspring are thus more likely to fail to reproduce. Engaging in incest is thus an evolutionarily nonadaptive strategy. If a mutation occurred that caused an organism to feel or believe that it is wrong to engage in incest, then, all else being equal, this gene would spread through the population over subsequent generations. Sociobiologists think they can give similar accounts of our other ethical beliefs and the mechanisms that underlie them.

What are the implications for ethics if ethical nativism and some version of the sociobiological story behind it are true? Some philosophers have denied there are any interesting implications. Ethics, they note, is normative (it says what we ought to do), whereas biology—in particular, the details of the evolutionary origins of humans and our various capacities—is descriptive. One cannot derive normative conclusions from empirical premises. To do so is to commit the naturalistic fallacy. It would be a mistake, for example, to infer from the empirical premise that our teeth evolved for tearing flesh to the normative conclusions that we ought to eat meat. This empirical premise is compatible with ethical arguments that it is morally wrong to eat meat. By the same reasoning, the fact that evolution produced in us the tendency to have some moral feeling or belief does not necessarily entail that we ought to act on that feeling or accept that belief on reflection. In fact, some commentators have suggested, following Thomas Huxley (1894), that “the ethical progress of society depends, not on imitating [biological evolution], . . . but in combating it.”

Ethical nativists have various responses to the charge that they commit the naturalistic fallacy. Some allow that the fact that humans have some innate moral belief does not entail that we ought to act on it, while insisting that nativism has something to tell us about ethics. Perhaps biology can tell us that we are not able to do certain things and thus that it cannot be the case that we ought to do this. For example, concerning feminism, some sociobiologists have claimed that many of the differences between men and women are biologically based and unchangeable; a feminist political agenda that strives for equality is therefore destined to failure. This argument has been criticized on both empirical and normative grounds (see Kitcher 1985 and Fausto-Sterling 1992).

Some sociobiologists (Wilson 1975 and Ruse 1986) have argued that the facts of human evolution have implications for moral realism, the metaethical position that there are moral facts like, for example, the moral fact that it is wrong to torture babies for fun. A standard argument for moral realism says that the existence of moral facts explains the fact that we have moral beliefs (on moral realism, see Harman 1977; Mackie 1977; Brink 1989). If, however, ethical nativism is true and an evolutionary account can be given for why people have the moral beliefs they do, then an empirical explanation can be given for why we have the ethical capacities that we do. The standard argument for moral realism is thus undercut.

One promising reply to this line of thought is to note that moral facts might be involved in giving a biological account of why we humans have the moral beliefs that we do. In the case of incest, the moral status of incest might be related to the selective advantageousness of incest. Consider an analogy to mathematics. Although we might give an evolutionary explanation of the spread of mathematical abilities in humans (say, because the ability to perform addition was useful for hunting), mathematical facts, like 2 + 2 = 4, would still be required to explain why mathematical ability is selectively advantageous. Many of our mathematical beliefs are adaptive because they are true. The idea is to give the same sort of account for moral beliefs: they are selectively advantageous because they are true. Selective advantage and moral status can, however, come apart in some instances. One can imagine a context in which it would be selectively advantageous for men to rape women. In such a context, it might be selectively advantageous to have the belief that rape is morally permissible. Rape would, however, remain morally reprehensible and repugnant even if it were selectively advantageous to believe otherwise.

Even if there is a tension between ethical nativism and moral realism, the tension might not be so serious if only a few of our ethical beliefs are in fact innate. Many of our ethical beliefs come from and are justified by a reflective process that involves feedback among our various ethical beliefs; this suggests that many of them are not innate. The
The domain is made difficult by problems of translation, inter- nativist argument against moral realism depends on the pretation and representation. So, for example, studies of strength of its empirical premises. complex emotion concepts such as Ilongot liget (roughly, “anger”) or Japanese amae (“dependent love”) have pro- See also ADAPTATION AND ADAPTATIONISM; CULTURAL duced extended debates about issues of meaning and repre- EVOLUTION; CULTURAL VARIATION; EVOLUTIONARY PSY- sentation (see Rosaldo 1980 and Doi 1973, respectively, for CHOLOGY; MORAL PSYCHOLOGY extended analyses of these terms). —Edward Stein Beginning in the early 1950s, at about the same time that social psychologists were examining English-language folk References psychology (Heider 1958), A. I. Hallowell called for the comparative study of “ethnopsychology,” by which he meant Brink, D. O. (1989). Moral Realism and the Foundations of Ethics. “concepts of self, of human nature, of motivation, of person- Cambridge: Cambridge University Press. ality” (1967: 79). With the advent of COGNITIVE ANTHRO- de Waal, F. (1996). Good Natured: The Origins of Right and Wrong in Humans and Other Animals. Cambridge: Harvard POLOGY and the development of lexical techniques for the University Press. semantic analysis of terminological domains such as color, Fausto-Sterling, A. (1992). Myths of Gender: Biological Theories kinship, or botany, anthropologists initially approached eth- about Women and Men. Rev. ed. New York: Basic Books. nopsychology in much the same way as other areas of ethno- Harman, G. (1977). The Nature of Morality. Oxford: Oxford Uni- or “folk” knowledge, as an essentially cognitive system that versity Press. could be studied as a set of interrelated categories and propo- Huxley, T. H. (1894). Evolution and Ethics and Other Essays. New sitions. The semantic theories that informed this work York: D. Appleton. derived largely from the study of referential meaning, ana- Kitcher, P. (1985). 
Vaulting Ambition. Cambridge, MA: MIT Press. lyzed in terms of category structures and distinctive features Mackie, J. L. (1977). Ethics: Inventing Right and Wrong. New or dimensions (see D’Andrade 1995; Quinn and Holland York: Penguin. Ruse, M. (1986). Taking Darwin Seriously. Oxford: Blackwell. 1987 for historical overviews). Wilson, E. O. (1975). Sociobiology: The New Synthesis. Cam- The two types of psychological vocabulary most fre- bridge, MA: Harvard University Press. quently studied with lexical methods are personality and emotion. In both cases, comparative research has sought lin- Further Readings guistic evidence for cognitive and psychological universals. Studies of personality terms in both English (Schneider Bradie, M. (1994). The Secret Chain: Evolution and Ethics. 1973) and non-Western languages (e.g., Shweder and Albany: SUNY Press. Bourne 1982) indicate that two or three dimensions of inter- Goldman, A. (1993). Ethics and cognitive science. Ethics 103: personal meaning, particularly “solidarity” and “power,” 337–360. Lumsden, C., and E. O. Wilson. (1981). Genes, Minds and Culture. structure person concepts across cultures (White 1980). Cambridge, MA: Harvard University Press. Similarly, studies of emotion vocabularies have found com- Nitecki, M., and D. Nitecki, Eds. (1993). Evolutionary Ethics. plex patterns of convergence interpreted as evidence for a Albany: SUNY Press. small number of basic or universal affects (Gerber 1985; Nozick, R. (1981). Philosophical Explanations. Cambridge, MA: Romney, Moore, and Rusch 1997). Claims for universal Harvard University Press. emotion categories, however, are often complicated by Thompson, P., Ed. (1995). Issues in Evolutionary Ethics. Albany: detailed accounts of the relevance of culture-specific models SUNY Press. for emotional understanding (Lutz 1988; Heider 1991). 
The search for linguistic correlates of basic emotions is Ethnopsychology motivated by robust findings of biological invariance in facial expressions associated with five or six discrete emo- Ethnopsychology refers to cultural or “folk” models of sub- tions, often labeled with the English language terms jectivity, particularly as applied to the interpretation of “anger,” “disgust,” “fear,” “happiness,” “surprise,” and social action. It also refers to the comparative, anthropologi- “shame” (Ekman 1992). Inspired by research on COLOR cal study of such models as used in particular languages and CATEGORIZATION that shows color lexicons everywhere to cultures. Whereas the fields of psychology and philosophy be structured according to a small set of prototypic catego- have both given concerted attention to folk theories of the ries, numerous authors have speculated that prototype mod- mind (see FOLK PSYCHOLOGY and THEORY OF MIND), the els may be an effective means of representing emotion hallmark of anthropological studies has been the empirical concepts as internally structured categories (Gerber 1985; study of commonsense psychologies in comparative per- Russell 1991) or scenarios (Lakoff and Kövecses 1987). spective (Heelas and Lock 1981; White and Kirkpatrick Lexical studies of an entire corpus of terms extracted 1985; see also CULTURAL PSYCHOLOGY). from linguistic and social context generally produce A growing body of ethnographic work has established highly abstract results. Analyses focusing on the conceptu- that (1) people everywhere think and talk in ordinary lan- alization of emotion in ordinary language have identified guage about subjective states and personal qualities more complex cultural or propositional networks of mean- (D’Andrade 1987), that (2) cultures vary in the conceptual ing associated with key emotion terms (e.g., Rosaldo elaboration and sociocultural importance of such concepts, 1980). 
In particular, analyses of METAPHOR show that met- and that (3) determining conceptual universals in this aphorical associations play a central role in the elaboration Ethnopsychology 287 As is often the case for comparative research, ethnopsy- of cultural models of emotion. Emotion metaphors often chological studies raise questions about the validity of the acquire their significance by linking together other meta- domain under study, of “psychology” as a basis for cross- phors pertaining to the conceptualization of bodies, per- cultural interpretation. As working concepts of psychology sons, and minds. So, for example, the English language are adjusted for purposes of comparison, the boundaries expression “He is about to explode” obtains its meaning in between ethnopsychology, NAIVE SOCIOLOGY, and FOLK relation to such metaphorical propositions as anger is heat and the body is a container for emotions (Lakoff and BIOLOGY, among other areas, are likely to be remapped as Kövecses 1987). psychology’s metalanguage comes to terms with the diver- Both comparative and developmental studies show that sity of psychological forms and practices worldwide. implicit models of emotion frequently take the form of See also CULTURAL EVOLUTION; CULTURAL VARIATION; event schemas in which feelings and other psychological HUMAN UNIVERSALS; MOTIVATION AND CULTURE states mediate antecedent events and behavioral responses —Geoffrey White (Harris 1989; Lutz 1988). The most systematic framework developed for the analysis of emotion language is that of References linguist Anna Wierzbicka (1992), who has proposed a meta- language capable of representing the meanings of emotion Bruner, J. (1990). Acts of Meaning. Cambridge, MA: Harvard Uni- words in terms of a limited number of semantic primitives. versity Press. In this approach, scriptlike understandings of emotion are D’Andrade, R. G. (1987). A folk model of the mind. In D. 
Holland represented as a string of propositions forming prototypic and N. Quinn, Eds., Cultural Models in Language and Thought. event schemas. Application of this framework has clarified a Cambridge: Cambridge University Press. D’Andrade, R. G. (1995). The Development of Cognitive Anthro- number of debates about the nature of emotional meaning in pology. Cambridge: Cambridge University Press. specific languages and cultures (White 1992). Doi, T. (1973). The Anatomy of Dependence. Tokyo: Kodansha The relevance of prototype schemas for emotional International Ltd. understanding follows from the wider salience of narrative Ekman, P. (1992). An argument for basic emotions. Cognition and as an organizing principle in ethnopsychological thought Emotion 6: 169–200. generally (Bruner 1990; Johnson 1993). Among the many Gerber, E. (1985). Rage and obligation: Samoan emotions in con- types of narrative used to represent and communicate social flict. In G. White and J. Kirkpatrick, Eds., Person, Self and experience, “life stories” appear to be an especially salient Experience: Exploring Pacific Ethnopsychologies. Berkeley: genre across cultures (Linde 1993; Ochs and Capps 1996; University of California Press. Peacock and Holland 1993). There is, however, some evi- Hallowell, A. I. (1967). The self and its behavioral environment. In Culture and Experience. New York: Schocken Books (origi- dence that Euro-American cultures tend to “package” expe- nally published in 1954). rience in the form of individualized life stories more than Harris, P. (1989). Children and Emotion: The Development of Psy- many non-Western cultures that do not value or elaborate chological Understanding. Oxford: Blackwell. individual self-narrative. Heelas, P., and A. Lock, Eds. (1981). Indigenous Psychologies: Research on talk about personal experience in ordinary The Anthropology of the Self. London and New York: Aca- social contexts indicates that self-reports are often interac- demic Press. 
tively produced by narrators and audiences intent on render- Heider, F. (1958). The Psychology of Interpersonal Relations. New ing experience in moral terms and on actively directing the York: Wiley. course of social interaction (Miller et al. 1990). This line of Heider, K. (1991). Landscapes of Emotion: Mapping Three Cul- sociolinguistic research has identified the importance of tures in Indonesia. Cambridge: Cambridge University Press. Johnson, M. (1993). The narrative context of self and action. In sociocultural institutions for analyzing the pragmatic force Moral Imagination: Implications of Cognitive Science for Eth- of psychological talk in context. Ethnographic research in ics. Chicago: University of Chicago Press. small-scale societies finds that verbal representations of the Lakoff, G., and Z. Kövecses. (1987). The cognitive model of anger thoughts and feelings of others are likely to carry consider- inherent in American English. In D. Holland and N. Quinn, able moral weight and may be limited to specific, culturally Eds., Cultural Models in Language and Thought. Cambridge: defined occasions. Cambridge University Press. By raising interpretive questions, the comparative per- Linde, C. (1993). Life Stories: The Creation of Coherence. Oxford: spective has drawn attention to the constructed nature of Oxford University Press. commonsense psychologies, noting that concepts of emo- Lutz, C. A. (1988). Unnatural Emotions: Everyday Sentiments on tion, person, and so forth generally derive their signifi- a Micronesian Atoll and Their Challenge to Western Theory. Chicago: University of Chicago Press. cance from wider systems of cultural meaning and value. Markus, H., and S. Kitayama. (1991). Culture and the Self: Impli- The comparative approach of ethnopsychological research cations for Cognition, Emotion and Motivation. Psychological focuses attention back on psychological theory itself, not- Review 98: 224–253. 
ing ways in which English language constructs and para- Miller, P. J., R. Potts, H. Fung, L. Hoogstra, and J. Mintz. (1990). digms are constrained by implicit cultural concepts of Narrative practices and the social construction of self in child- person. The major theme in this line of criticism is that the hood. American Ethnologist 17: 292–311. values and ideology of Euro-American individualism sys- Ochs, E., and L. Capps. (1996). Narrating the self. Annual Review tematically influence a range of psychological concepts, of Anthropology 25:19–43. including personality (White 1992), emotion (Markus and Peacock, J., and D. Holland. (1993). The narrated self: life stories Kitayama 1991), and moral reasoning (Shweder 1990). in process. Ethos 21(4): 367–383. 288 Ethology tor Oscar Heinroth, who was intimate with the behavior of Quinn, N., and D. Holland. (1987). Culture and cognition. In D. Holland and N. Quinn, Eds., Cultural Models in Language and scores of animals, Lorenz became convinced that compara- Thought. Cambridge: Cambridge University Press. tive ethological studies could be as objective as anatomical Romney, A. K., C. Moore, and C. Rusch (1997). Cultural univer- investigations. Young animals of diverse species, raised sals: measuring the semantic structure of emotion terms in under similar conditions, consistently develop distinctive English and Japanese. Proceedings of the National Academy of behaviors, stable enough to yield insights into taxonomy Science 94: 5489–5494. and phylogeny. Many of the species-specific displays of Rosaldo, M. Z. (1980). Knowledge and Passion: Ilongot Notions of captive ducks match those in the wild. The emphasis on Self and Social Life. Cambridge: Cambridge University Press. comparative study, rendered more quantitative by Tinbergen Russell, J. (1991). Culture and the categorization of emotion. Psy- and his students, was an important innovation, embodied in chological Bulletin 110: 426–450. 
Lorenz’s term “fixed action patterns.” Although action pat- Schneider, D. J. (1973). Implicit personality theory: a review. Psy- chological Bulletin 79: 294–309. terns are not completely “fixed,” any more than is morphol- Shweder, R. (1990). Cultural psychology: what is it? In J. Stigler, ogy, careful scrutiny reveals consistent species and R. Shweder, and G. Herdt, Eds., Cultural Psychology: Essays individual differences in modal action patterns. Descriptive on Comparative Human Development. Cambridge: Cambridge studies of behavior took on new momentum, shifting focus University Press. somewhat from species comparisons to intraspecific varia- Shweder, R., and E. Bourne. (1982). Does the concept of the per- tion, culminating more than a generation later in the quanti- son vary cross-culturally? In A. J. Marsella and G. M. White, tative sophistication of mating choice theory, and cladistic Eds., Cultural Conceptions of Mental Health and Therapy. and other approaches to problems in phylogeny (Harvey and Boston: D. Reidel. Pagel 1991). White, G. M. (1980). Conceptual universals in interpersonal lan- A second innovation emphasized endogenous sources of guage. American Anthropologist 82: 759–781. White, G. M. (1992). Ethnopsychology. In T. Schwartz, G. White, motivation. The BEHAVIORISM of Watson (1924), given a and C. Lutz, Eds., New Directions in Psychological Anthropol- more biological flavor by Schneirla and his colleagues, ogy. Cambridge: Cambridge University Press. stressed the role of external forces in the control of behav- White, G. M., and J. Kirkpatrick, Eds. (1985). Person, Self and ior, perhaps initially as a healthy antidote to the indulgences Experience: Exploring Pacific Ethnopsychologies. Berkeley: of introspectionist psychology. Ethologists provided a University of California Press. refreshing reminder that without a genome you have no Wierzbicka, A. (1992). 
Semantics, Culture, and Cognition: Uni- organism, and no instructions on how to respond to external versal Human Concepts in Culture-Specific Combinations. Chi- stimuli or how to learn as a consequence of experience, cago: University of Chicago Press. countering such excesses as “psychology without heredity” (Kuo 1924). The notion of “instinct,” eloquently champi- Ethology oned by Darwin, had fallen into disrepute. Lorenz redressed this balance by stressing the importance of endogeneity, Ethology had the most impact from about 1940 to 1970, both in the motivational sense, manifest in the inherent when it took the discipline of animal behavior by storm, rhythmicity of many behaviors, and in the ontogenetic earning Konrad Lorenz and Niko Tinbergen, with Karl von sense, with certain patterns of behavior developing endoge- Frisch, a Nobel Prize in 1973. The underlying concepts nously, only minimally perturbed or adjusted according to were biological rather than psychological, derived from a the vagaries of individual experience. The notion of strong Darwinian approach to naturally occurring animal behavior. internal forces driving behavior was then highly controver- Historically, the naturalistic aspect was crucial, with an sial. We now accept that the underlying neural circuitry of emphasis lacking in psychology at the time. Although antic- many behaviors includes neuronal pacemakers as key com- ipated by American behaviorists, ethology came of age as a ponents. The theme of endogenous forces came to pervade mature discipline in Europe. 
The proceedings of a 1949 research at the Institute for Behavioral Physiology estab- conference in Cambridge, England, on physiological mech- lished for Lorenz and codirector Eric von Holst by the Max anisms in animal behavior presented a full exposition of the Planck Gesellschaft in Bavaria in 1956, including pioneer- ideas of Lorenz, Tinbergen, and other participants in the ing studies of swimming rhythms, and later circadian and emerging discipline (Lorenz 1950; see 1970, 1971 for col- circannual behavioral cycles. Across the Atlantic, ethologi- lected works). Tinbergen’s classical treatise on “The Study cally inspired insect physiologists Kenneth Roeder and of Instinct” followed a year later (Tinbergen 1951). Ethol- Donald Wilson convincingly demonstrated the delicate ogy provided a comprehensive framework for studying the interplay of exogenous and endogenous forces underlying functions, evolution, and development of behavior and its locomotion and other behaviors (Roeder 1963; reviewed in physiological basis. Some insights were conceptual and oth- detail in Gallistel 1980). ers methodological. Four historically important aspects The interpretation of responses to external situations not were the basic endogeneity of behavior, the concept of sign as driven from without but as interactions between changing stimuli, the reinstatement of instincts, and the importance of environments and purposively changing organismal states cross-species comparisons. was not unique to ethology. A generation previously, in an Lorenz’s medical training in Vienna exposed him to con- essay on “appetites and aversions as constituents of cepts of phylogeny emerging from comparative anatomy, instincts,” the American ethologist Wallace Craig (1918) but behavior was viewed as too amorphous to be amenable clarified the issues of uniformity and plasticity with a dis- to the same kind of study. 
Encouraged by Berlin zoo direc- tinction made previously by Sherrington, and later by Ethology 289 Lorenz, between appetitive (i.e., proceptive) behavior, mechanisms” with properties varying according to geneti- endogenously motivated and variable, and consummatory cally encoded instructions. The determination of ethologists behavior, externally triggered and more stereotyped. Craig to reinstate concepts of innateness led to the sometimes was a student of Charles Otis Whitman, whose 1898 Woods legitimate criticism that they caricatured young animals as Hole lectures on animal behavior anticipated other aspects completely preprogrammed automata (Lehrman 1953). It of ethological thinking. But Craig’s message fell on deaf became clear later that even young organisms with well- ears, appreciated by few psychologists, and apparently defined innate responsiveness, such as herring gulls, display known to Lorenz only after his career was well launched. great developmental plasticity, quickly acquiring new infor- He may also have been unaware of Craig’s assertion that mation about their parents and other aspects of life around aggression is less endogenously driven than most behaviors them. But they do so by learning processes that are cana- (Craig 1928). Lorenz’s controversial 1966 book “On lized by innate predispositions, insuring a certain range of Aggression,” presented the contrary case, that there are trajectories for development, whether behavioral or neuro- strong endogenous wellsprings for agonistic behavior. Few logical (Hailman 1967; Waddington 1966; Rauschecker and of the many critics of Lorenz’s position acknowledged Marler 1987). Craig’s thoughtful counterargument (see also Heiligenberg The importance of innately guided learning, well illus- and Kramer 1972). trated by song learning in birds, was the special province of Until the postwar years, American biologists and psy- British ethologist William Homan Thorpe (1956). 
More than chologists alike were curiously unresponsive to efforts Lorenz or Tinbergen, Thorpe prepared the way for the emer- within their ranks to instate ethologically styled concepts. A gence of COGNITIVE ETHOLOGY (Griffin 1976). He was the prophetic paper by one of its most respected pioneers, titled first to formalize criteria for different types of learning, “Experimental analysis of instinctive behavior” (LASHLEY some very basic, others with clear cognitive implications. 1938; see Beach et al. 1960), anticipated some develop- His thoughtful scholarship emphasized the importance of ments in ethology but had much less impact than his work internalized processing in perception and the purposiveness on cortical memory mechanisms. Lashley’s preoccupation of behavior. The interplay of nature and nurture is most evi- with endogenous motivational forces was evident in his dent in “imprinting,” the developmental process for which arguments with Pavlovian theorists who reduced all behav- Lorenz is best known. During imprinting, young of some ioral and psychological activity to chains of conditioned organisms learn to recognize and bond with their parents, or reflexes. There had also been limited appreciation of the parent surrogates, and others like them, by processes des- case made a generation previously by Lashley’s teacher, tined to become favored paradigms for investigating the neu- Jennings (1906), for the existence, even in single-celled ral basis of memory formation (Horn 1985). Imprinting organisms, of complex, sometimes purposive, endogenously occurs most rapidly during sensitive periods, as experience driven behaviors. His suggestion that equivalent observa- interacts with innate preferences for visual and auditory tions in higher organisms would suffice to encourage cogni- stimuli that capture attention and initiate the imprinting pro- tive theorizing may have helped to spur his later colleague cess. 
That these early experiences sometimes influence later Watson (1924) to put an end to subjective, introspective psy- social and sexual preferences, with varying degrees of chologizing. Instead, Watson shifted the emphasis to reversibility, attracted special attention in psychiatry and the observable behaviors and Pavlovian reflexes, thus launching social sciences (Bowlby 1969, 1973; Hinde 1982). There are behaviorism. But what began as a worthy effort to reintro- parallels with song learning in birds and with human speech duce objectivity into comparative psychology hardened into acquisition, where the choice of what to learn is guided by dogma, and the importance of endogenous factors was again generalized innate preferences, resulting ultimately in spe- forgotten until reinstated by Beach and other students of cific learned vocal traditions (Marler 1991; Pinker 1994). Lashley, increasingly preoccupied with physiological psy- The interplay of inheritance and experience that canalizes chology as an emerging discipline (Beach et al. 1960). The the development of many behaviors is epitomized by the notion of instinct met a similar fate, swept aside by the apparently paradoxical term “instincts to learn” (Gould and appeals of pragmatism. In developmental studies, American Marler 1987). More than any other, the concept of instinc- comparative psychologists grappling with the nature/nurture tively guided learning captures the essence of what was problem lost sight of the need to balance environmental uniquely distinctive about classical ethology, still providing influences with contributions of the genome, a primary a valued heuristic framework for contemporary research on emphasis in ethology. behavioral ontogeny (see EVOLUTIONARY PSYCHOLOGY). Environmental factors were not neglected by ethologists. 
See also SOCIAL COGNITION IN ANIMALS; SOCIAL PLAY Tinbergen and his students, more experimentally oriented BEHAVIOR than Lorenz, focused on key components of complex situa- —Peter Marler tions to which animals actually respond (collected in Tin- bergen 1971, 1973). Many of these “sign stimuli,” or “social References releasers,” often a small fraction of the total that the animal can perceive, were communicative signals (see ANIMAL Beach, F. A., D. O. Hebb, C. T. Morgan, and H. W. Nissen. (1960). COMMUNICATION). Physiological mechanisms were inferred The Neuropsychology of Lashley: Selected Papers of K. S. for filtering incoming stimuli, apparently operating at birth Lashley. New York: McGraw-Hill. in many young organisms, such as the nestling birds that Bowlby, J. (1969, 1973). Attachment and Loss. Vols. 1 and 2. Lon- Tinbergen studied. Lorenz posited central “innate release don: Hogarth Press. 290 Evoked Fields Craig, W. (1918). Appetites and aversions as constituents of Experimental Psychology, vol. 1: Perception and Motivation. instincts. Biological Bulletin 34: 91–107. New York: Wiley, pp. 765–830. Craig, W. (1928). Why do animals fight? International Journal of Barlow, G. W. (1977). Modal action patterns. In T. A. Sebeok, Ed., Ethics 31: 264–278. How Animals Communicate. Bloomington: University of Indi- Gallistel, C. R. (1980). The Organization of Action: A New Synthe- ana Press. sis. Hillsdale, NJ: Erlbaum. Bateson, P. P. G. (1966). The characteristics and context of Gould, J. L., and P. Marler. (1987). Learning by instinct. Scientific imprinting. Biological Reviews 41: 177–220. American 256(1):62–73. Bateson, P. P. G. (1978). Early experience and sexual preferences. Griffin, D. R. (1976). The Question of Animal Awareness: Evolu- In J. B. Hutchison, Ed., Biological Determinants of Sexual tionary Continuity of Mental Experience. New York: Rocke- Behavior. New York: Wiley, pp. 29–53. feller University Press. von Cranach, M., K. Foppa, W. Lepenies, and D. 
Ploog, Eds. Hailman, J. P. (1967). The ontogeny of an instinct. Behavior 15: 1– (1979). Human Ethology. Cambridge: Cambridge University 142. Press. Harvey, P. H., and M. D. Pagel. (1991). The Comparative Method Eibl-Eibesfeldt, I. (1975). Ethology: The Biology of Behavior. New in Evolutionary Biology. Oxford: Oxford University Press. York: Holt, Rinehart and Winston. Heiligenberg, W., and U. Kramer. (1972). Aggressiveness as a Gottlieb, G. (1979). Comparative psychology and ethology. In E. function of external stimulation. Journal of Comparative Physi- Hearst, Ed., The First Century of Experimental Psychology. ology 77: 332–340. Hillsdale, NJ: Erlbaum. Hinde, R. A. (1982). Ethology: Its Nature and Relations with Gould, J. L., and P. Marler. (1987). Learning by instinct. Scientific Other Sciences. New York: Oxford University Press. American 256(1): 62–73. Horn, B. (1985). Memory, Imprinting and the Brain. Oxford: Clar- Grillner, S., and P. Wallen. (1985). Central pattern generators for endon Press. locomotion with special reference to vertebrates. Annual Jennings, H. S. (1906). Behavior of the Lower Organisms. Bloom- Review of Neuroscience 8: 233–261. ington: Indiana University Press. (reprint, 1962.) Hess, E. H. (1973). Imprinting: Early Experience and the Develop- Kuo, Z. Y. (1924). A psychology without heredity. Psychology mental Psychobiology of Attachment. New York: Van Nostrand. Review 31:427–448. Hinde, R. A. (1960). Energy models of motivation. Sym. Soc. Exp. Lashley, K. S. (1938). Experimental analysis of instinctive behav- Biol. 14: 199–213. ior. Psychological Review 45: 445–471. Hinde, R. A. (1970). Animal Behavior: A Synthesis of Ethology and Lehrman, D. S. (1953). A critique of Konrad Lorenz’s theory of Comparative Psychology. 2nd ed. New York: McGraw-Hill. instinctive behavior. Quarterly Review of Biology 28: 337– von Holst, E. (1973). The Behavioral Physiology of Animals and 363. Man. Selected Papers of Eric von Holst. 
Coral Gables, FL: Uni- Lorenz, K. Z. (1950). The comparative method in studying innate versity of Miami Press. behaviour patterns. Symposia of the Society for Experimental Maier, N. R. F., and T. C. Schneirla. (1935). Principles of Animal Biology No. IV. Cambridge: Cambridge University Press, 221– Psychology. New York: McGraw-Hill. 268 Marler, P. (1985). Ethology of communicative behavior. In H. I. Lorenz, K. Z. (1966). On Aggression. New York: Harcourt, Brace Kaplan and B. J. Sadock, Eds., Comprehensive Textbook of and World. Psychiatry, vol. 1. Baltimore/London: Williams and Wilkins, Lorenz, K. Z. (1970, 1971). Studies in Animal and Human Behav- pp. 237–246. ior, vols. 1 and 2. Cambridge, MA: Harvard University Press. Marler, P. R., and W. J. Hamilton III. (1966). Mechanisms of Ani- Marler, P. (1991). The instinct to learn. In S. Carey and R. Gelman, mal Behavior. New York: Wiley. Eds., The Epigenesis of Mind: Essays on Biology and Cogni- Schleidt, W. M. (1962). Die historische Entwicklung der Begriffe tion. Hillsdale, NJ: Erlbaum, pp. 37–66. “Angeborenes auslösendes Schema”: und “Angeborener Aus- Pinker S. (1994). The Language Instinct. New York: William Mor- lösemechanismus”. Z. Tierpsychol. 19: 697–722. row. Seligman, M. E. P., and J. L. Hager, Eds. (1972). Biological Rauschecker, J, and P. Marler, Eds. (1987). Imprinting and Corti- Boundaries of Learning. New York: Appleton-Century Crofts. cal Plasticity. New York: Wiley. Thorpe, W. H. (1961). Bird Song. Cambridge: Cambridge Univer- Roeder, K. D. (1963). Nerve Cells and Insect Behavior. Cam- sity Press. bridge, MA: Harvard University Press. Thorpe, W. H. (1956). Learning and Instinct in Animals. Cam- Evoked Fields bridge, MA: Harvard University Press. Tinbergen, N. (1951). The Study of Instinct. Oxford: Clarendon Press. See ELECTROPHYSIOLOGY, ELECTRIC AND MAGNETIC Tinbergen, N. (1971, 1973). The Animal in its World: Explorations EVOKED FIELDS of an Ethologist, vols. 1 and 2. 
Cambridge, MA: Harvard Uni- versity Press. Waddington, C. H. (1966). Principles of Development and Differ- Evolution entiation. New York: Macmillan. Watson, J. B. (1924). Behaviorism. Chicago: University of Chi- cago Press. In its simplest form, the theory of evolution is just the idea that life has changed over time, with younger forms Further Readings descending from older ones. This idea existed well before the age of Charles DARWIN, but he and his successors devel- Alcock, J. (1997). Animal Behavior: An Evolutionary Approach. oped it to explain both the diversity of life, and the adapta- Sunderland, MA: Sinauer Associates Inc. tion of living things to their circumstances. Ernest Mayr Baerends, G. P. (1988). Ethology. In R. C. Atkinson, R. J. Herrn- (1991) argues that this developed conception of evolution stein, G. Lindzey, and R. D. Luce, Eds., Stevens’ Handbook of Evolution 291 combines five main ideas: are classic examples of complex, fine-tuned adaptation. Bat ECHOLOCATION requires mechanisms that enable bats to 1. The living world is not constant; evolutionary change produce highly energetic sound waves. So they also have occurs. mechanisms that protect their ears while they are making 2. Evolutionary change has a branching pattern. The spe- such loud sounds. They have elaborately structured facial cies that we see are descended from one (or a few) architectures to maximize their chances of detecting return remote ancestors. echoes, together with specialized neural machinery to use 3. New species form when a population splits into isolated fragments which then diverge. the information in those echoes to guide their flight to their 4. Evolutionary change is gradual. Very few organisms that target. But there are many other examples of complex adap- differ dramatically from their parents are able to survive. tation. 
Many parasites, for example, manufacture chemicals Of those few that survive, only a small proportion found that they use to manipulate the morphology and behavior of populations that preserve these differences. their host. 5. The mechanism of adaptive change is natural selection. Darwin’s greatest achievement was to give a naturalistic Darwin, Wallace, and others rapidly convinced their sci- explanation of adaptation. His key idea, natural selection, entific contemporaries of the fact of evolution (Darwin can explain both adaptation and diversity. Imagine the popu- 1859/1964; Wallace 1870). They persuaded that community lation ancestral to the Australasian bittern. Let us suppose of the existence of the tree of life. Darwin himself was a that this population, like current bitterns, lived in reeds adja- gradualist, thinking that tiny increments across great periods cent to wetlands and sought to escape predation by crouch- of time accumulate as evolutionary change, and he thought ing still when a threatening creature was near. It is quite that the main agent of that change was natural selection. But likely that the color and pattern of the plumage of this his views on gradual change and on the importance of selec- ancestral population varied. If so, some birds were favored. tion were not part of the biological consensus until the syn- Their plumage made them somewhat harder to see when thesis of population genetics with evolutionary theory by they froze among the reeds. They were more likely to sur- Fisher, Wright, Haldane, and others in the 1930s (Depew vive to breed. If the plumage patterns of their offspring were and Weber 1995). The importance of isolation in the genera- like those of their parents, the plumage patterns of the tion of new species remained controversial even longer. It descendant generation would be somewhat different from became part of the consensus view on evolution only after that of the ancestral generation. 
Over time, the colors and Mayr’s postwar work on speciation and evolution (Mayr patterns characteristic of the population would change. Thus 1942, 1976, 1988). we could reach today’s superbly well-concealed bitterns. The biological world confronted Darwin, Wallace, and Natural selection selects fitter organisms, and the heritabil- their successors with two central problems. The world of life ity of their traits ensures a changed descendant population. as we know it is fabulously diverse, even though today’s life Evolutionary change depends on variation in a population, is only a tiny fraction of its total historical diversity. We tend fitness differences in the population consequent on that vari- to underestimate that diversity, because most large ani- ation and heritability. Adaptive change takes place despite mals—animals that we notice—are vertebrates like us. But the fact that the mechanisms that generate variation in the many organisms are weirdly different from us and from one population are decoupled from the adaptive needs of the another, and weird not just in finished adult form, but also in population. But it depends on more than those principles. their developmental history. Humans do not undergo major The adaptive shift to good camouflage took place gradually, physical reorganizations during their growth from children, over many generations. It depended on cumulative selec- whereas (for example) many parasites’ life cycles take them tion. If selection is to explain major adaptation it must be through a number of hosts, and in their travels they experi- cumulative. Innovation is the result of a long sequence of ence complete physical transformation. Yet though life is selective episodes rather than one, for the chances of a sin- diverse, that diversity is clumped in important ways. Arthro- gle mutation producing a new adaptation are very low. 
pods have jointed, segmented bodies with various limbs and Thus evolution under natural selection can produce adap- feelers attached, the whole covered with an exoskeleton. tation. At the same time, it can produce diversity, as popula- They are very different from anything else, from vertebrates, tions become adapted to different local environments, and worms, and other invertebrates. Before Darwin, the differ- thus diverge from one another. ences between arthropods, vertebrates, worms, and other Evolutionary biology has developed a consensus on the great branches of life seemed so vast as to rule out evolu- broad outline of life’s history. There is agreement on important tionary transitions between them. They are so distinctive that aspects of the mechanism of evolution. Everyone agrees that even after the universal acceptance of the evolutionary selection is important, but that chance and other factors play an descent of life, arthropod affinities remain controversial. important role too. No one doubts the importance of isolation Thus one task of evolutionary biology is in the explanation in generating diversity. But important disagreements remain. of diversity and its clumping, both on the large scale of dif- The nature of species and speciation remains problematic. ferent kinds of organism, and on the smaller scale of the dif- Although everyone agrees that selection, chance, history, and ference between species of the same general kind. development combine to generate life’s history, the nature of If diversity is important, so too is ADAPTATION. The that combination remains controversial. Though all agree that selection matters, the mode of its action remains contested. structured complexity of organisms, and their adaptation to Dawkins and others think of selection as primarily selecting their environment, is every bit as striking as the diversity of lineages of genes in virtue of their differing capacities to get organisms through their environment. 
Perceptual systems 292 Evolution and Ethics themselves replicated (Dawkins 1982). Others—for example, thirty or so years, however, findings from psychology, evo- Gould (1989) and Sober (1984)—conceive of selection as act- lutionary biology, and linguistics have radically changed the ing on many different kinds of entities: genes, organisms, colo- way that scholars approach this issue, leading to some sur- nies and groups, and even species. Finally, some—David Hull prising insights and opening up areas of fruitful empirical (1988) being one—think of evolution in biology as just a spe- investigation. cial case of a general mechanism of change involving undi- For one thing, proposals that language is entirely a cul- rected variation and selective retention. These controversial tural innovation, akin to agriculture or bowling, can be ideas lead to attempts to give evolutionary accounts of scien- safely dismissed. Historical linguistics gives no support to tific and cultural change. the speculation that language was invented once and then spread throughout the world; instead, the capacity to create See also COGNITIVE ETHOLOGY; CULTURAL EVOLUTION; language is to some extent within every human, and it is ETHOLOGY; EVOLUTION OF LANGUAGE; EVOLUTIONARY invented anew each generation (Pinker 1994). This is most COMPUTATION; EVOLUTIONARY PSYCHOLOGY apparent from studies of creolization; children who are —Kim Sterelny exposed to a rudimentary communication system will embellish and expand it, transforming it into a full-fledged References language within a single generation—a CREOLE (Bickerton 1981). A similar process might occur in all normal instances Darwin, C. (1859/1964). On the Origin of Species: A Facsimile of of LANGUAGE ACQUISITION: Children are remarkably profi- the First Edition. Cambridge, MA: Harvard University Press. Dawkins, R. (1982). The Extended Phenotype. Oxford: Oxford cient at obeying subtle syntactic and morphological con- University Press. 
straints for which there is little evidence in the sentences Depew, D., and B. H. Weber. (1995). Darwinism Evolving: Sys- they hear (e.g., Crain 1991), suggesting that some capacity tems Dynamics and the Genealogy of Natural Selection. Cam- for language has emerged through biological evolution. bridge, MA: MIT Press. Could this capacity have emerged as an accidental result Gould, S. J. (1989). Wonderful Life: The Burgess Shale and the of the large brains that humans have evolved, or as a by- Nature of History. New York: W. W. Norton. product of some enhanced general intelligence? Probably Hull, D. (1988). Science as a Process. Chicago: University of Chi- not; there are people of otherwise normal intelligence and cago Press. brain size who have severe problems learning language Mayr, E. (1942). Systematics and the Origin of Species. New York: (e.g., Gopnik 1990), as well as people with reduced intelli- Columbia University Press. Mayr, E. (1976). Evolution and the Diversity of Life. Cambridge, gence or small brains who have no problems with language MA: Harvard University Press. (e.g., Lenneberg 1967). Furthermore, the human language Mayr, E. (1988). Towards a New Philosophy of Biology. Cam- capacity cannot be entirely explained in terms of the evolu- bridge, MA: Harvard University Press. tion of mechanisms for the production and comprehension Mayr, E. (1991). One Long Argument: Charles Darwin and the of speech. Although the human vocal tract shows substantial Genesis of Modern Evolutionary Thought. London: Penguin. signs of design for the purpose of articulation—something Sober, E. (1984). The Nature of Selection: Evolutionary Theory in observed by both DARWIN and the theologian William Paley, Philosophical Focus. Cambridge, MA: MIT Press. though they drew quite different morals from it—humans Wallace, A. R. (1870). Contributions to the Theory of Natural are equally proficient at learning and using SIGN LAN- Selection. London: Macmillan. 
GUAGES (Newport and Meier 1985). Further Readings How else could language have evolved? Modern biolo- gists have elaborated Darwin’s insight that although natural Bowler, P. (1989). Evolution: The History of an Idea. Berkeley: selection is the most important of all evolutionary mecha- University of California Press. nisms, it is not the only one. Many traits that animals pos- Williams, G. C. (1966). Adaptation and Natural Selection: A Cri- sess are not adaptations, but emerge either as by-products of tique of Some Current Evolutionary Thought. Princeton, NJ: adaptations (“spandrels”) or through entirely nonselectionist Princeton University Press. processes, such as random genetic drift (Gould and Williams, G. C. (1992). Natural Selection: Domains, Levels and Lewontin 1979). Natural selection is necessary only in order Challenges. Oxford: Oxford University Press. to explain the evolution of what Darwin (1859) called “organs of extreme perfection and complexity,” such as the Evolution and Ethics heart, the hand, and the eye. This is because only a selec- tionist process can evolve biological traits capable of accomplishing impressive engineering tasks of adaptive See ETHICS AND EVOLUTION benefit to organisms (Dawkins 1986; Williams 1966). Although there is controversy about the proper scope of Evolution of Language selectionist theories, this much at least is agreed on, even by those who are most cautious about applying adaptive expla- The question of how language evolved has never been a nations (e.g., Gould 1977). respectable one. In the nineteenth century it motivated so Does language show signs of complex adaptive design to much wild speculation that the Société de Linguistique de the same extent as organs such as the hand and the eye? Paris banned all discussion on the topic—and many aca- Many linguists would claim that it does, arguing that lan- demics today wish this ban were still in place. 
Over the last guage is composed of different parts, including PHONOLOGY, Evolutionary Computation 293 and SYNTAX, that interact with one another, Cheney, D. L., and R. M. Seyfarth. (1990). How Monkeys See the MORPHOLOGY, World. Chicago: University of Chicago Press. as well as with perceptual, motoric, and conceptual systems, Chomsky, N. (1980). Rules and Representations. New York: so as to make possible an extraordinarily complicated engi- Columbia University Press. neering task—the transduction of thoughts into speech or Crain, S. (1991). Language acquisition in the absence of experi- sign. The conclusion that language decomposes into distinct ence. Behavioral and Brain Sciences 14: 597–650. neural and computational components is supported by inde- Darwin, C. (1859). On the Origin of Species. London: John Mur- pendent data from studies of acquisition, processing, and ray. pathology (Pinker 1994). Dawkins, R. (1986). The Blind Watchmaker. New York: W. W. Based on these conclusions, some scholars have argued Norton. that language has evolved as a biological adaptation for the Dawkins, R., and Krebs, J. R. (1978). Animal signals: information function of communication (Newmeyer 1991; Pinker and or manipulation. In J. R. Krebs and N. B. Davies, Eds., Behav- ioral Ecology. Oxford: Blackwell. Bloom 1990). Others have proposed instead that the ability Gopnik, M. (1990). Feature blindness: a case study. Language to learn and use language is a by-product of brain mecha- Acquisition 1: 139–164. nisms evolved for other purposes, such as motor control Gould, S. J. (1977). Darwin’s untimely burial. In Ever Since Dar- (Lieberman 1984), social cognition (Tomasello 1995) and win: Reflections on Natural History. New York: W. W. Norton. internal computation and representation (Bickerton, Gould, S. J., and R. Lewontin. (1979). 
1995)—and that such mechanisms have been exploited, with limited subsequent modification, for speech and sign. To put the issue in a different context, nobody doubts that the acquisition and use of human language involves capacities we share with other animals; the interesting debate is over whether the uniquely human ability to learn language can be explained entirely in terms of enhancements of such capacities, or whether much of language has been specifically evolved in the millions of years that separate us from other primates. The study of the communication systems of nonhuman primates is plainly relevant here (e.g., Cheney and Seyfarth 1990) as is the study of their conceptual and social capacities (e.g., Povinelli and Eddy 1996).

The more we learn about the cognitive and neural mechanisms underlying language, the more we will know about how it evolved—which aspects are adaptations for different purposes, which are by-products of adaptations, and which are accidents (Bloom 1998). Perhaps more importantly, we can gain insights in the opposite direction. As Mayr (1983) points out, asking about the function of a given structure or organ “has been the basis for every advance in physiology.” And although one can ask about function without considering evolutionary biology, an appreciation of how natural selection works is necessary in order to discipline and guide functional inquiry; this is especially so for the quite nonintuitive functional considerations that arise in the evolution of communication systems (Dawkins and Krebs 1978; Hauser 1996). To the extent that language is part of human physiology, exploring how it evolved will inevitably lead to insights about its current nature.

See also ADAPTATION AND ADAPTATIONISM; CULTURAL EVOLUTION; EVOLUTION; INNATENESS OF LANGUAGE; LANGUAGE AND CULTURE; PRIMATE COGNITION; PRIMATE LANGUAGE

—Paul Bloom

References

Bickerton, D. (1981). Roots of Language. Ann Arbor, MI: Karoma.
Bickerton, D. (1995). Language and Human Behavior. Seattle: University of Washington Press.
Bloom, P. (1998). Some issues in the evolution of language and thought. In D. Cummins and C. Allen, Eds., Evolution of the Mind. Oxford: Oxford University Press.
The spandrels of San Marco and the Panglossian Paradigm: a critique of the Adaptationist Programme. Proceedings of the Royal Society 205: 581–598.
Hauser, M. D. (1996). The Evolution of Communication. Cambridge, MA: MIT Press.
Lenneberg, E. H. (1967). Biological Foundations of Language. New York: Wiley.
Lieberman, P. (1984). The Biology and Evolution of Language. Cambridge, MA: Harvard University Press.
Mayr, E. (1983). How to carry out the adaptationist program? The American Naturalist 121: 324–334.
Newmeyer, F. J. (1991). Functional explanations in linguistics and the origin of language. Language and Communication 11: 1–28.
Newport, E. L., and R. P. Meier. (1985). The acquisition of American Sign Language. In D. I. Slobin, Ed., The Crosslinguistic Study of Language Acquisition. Vol. 1, The Data. Hillsdale, NJ: Erlbaum.
Pinker, S. (1994). The Language Instinct. New York: Morrow.
Pinker, S., and P. Bloom. (1990). Natural language and natural selection. Behavioral and Brain Sciences 13: 585–642.
Povinelli, D. J., and T. J. Eddy. (1996). What young chimpanzees know about seeing. Monographs of the Society for Research in Child Development 61(3): 1–152.
Tomasello, M. (1995). Language is not an instinct. Cognitive Development 10: 131–156.
Williams, G. C. (1966). Adaptation and Natural Selection: A Critique of Some Current Evolutionary Thought. Princeton: Princeton University Press.

Evolutionary Computation

Evolutionary computation is a collection of computational search, learning, optimization, and modeling methods loosely inspired by biological EVOLUTION. The methods most often used are called genetic algorithms (GAs), evolution strategies (ESs), and evolutionary programming (EP). These three methods were developed independently in the 1960s: GAs by Holland (1975), ESs by Rechenberg (1973) and Schwefel (1977), and EP by Fogel, Owens, and Walsh (1966). (Genetic programming, a variant of genetic algorithms, was developed in the 1980s by Koza 1992, 1994.) Such methods are part of a general movement for using biological ideas in computer science that started with pioneers such as VON NEUMANN, TURING, and WIENER, and continues today with evolutionary computation, NEURAL NETWORKS, and methods inspired by the immune system, insect colonies, and other biological systems.

Imitating the mechanisms of evolution has appealed to computer scientists from nearly the beginning of the computer age. Very roughly speaking, evolution can be viewed as searching in parallel among an enormous number of possibilities for “solutions” to the problem of survival in an environment, where the solutions are particular designs for organisms. Viewed from a high level, the “rules” of evolution are remarkably simple: species evolve by means of heritable variation (via mutation, recombination, and other operators), followed by natural selection in which the fittest tend to survive and reproduce, thus propagating their genetic material to future generations. Yet these simple rules are thought to be responsible, in large part, for the extraordinary variety and complexity we see in the biosphere. Seen in this light, the mechanisms of evolution can inspire computational search methods for finding solutions to hard problems in large search spaces or for automatically designing complex systems.

In most evolutionary computation applications, the user has a particular problem to be solved and a way to encode candidate solutions so that the solution space can be searched. For example, in the field of computational protein design, the problem is to design a one-dimensional sequence of amino acids that will fold up into a three-dimensional protein with desired characteristics. Assuming that the sequence is of length l, candidate solutions can be expressed as strings of l amino-acid codes. There are twenty different amino acids, so the number of possible strings is 20^l. The user also provides a “fitness function” or “objective function” that assigns a value to each candidate solution measuring its quality.

Evolutionary computation (EC) methods all begin with a population of randomly generated candidate solutions (“individuals”), and perform fitness-based selection and random variation to create a new population. Typically, some number of the highest-fitness individuals are chosen under selection to create offspring for the next generation. Often, an offspring will be produced via a crossover between two or more parents, in which the offspring receives “genetic material”—different parts of candidate solutions—from different parents. Typically the offspring is also mutated randomly—parts of the candidate solution are changed at random. (Mutation and crossover in evolutionary computation are meant to mimic roughly biological mutation and sexual recombination, two main sources of genetic variation.) Offspring are created in this way until a new generation is complete. This process typically iterates for many generations, often ending up with one or more optimal or high-quality individuals in the population.

GA, EP, and ES methods differ in the details of this process. In general, ESs and EP each define fairly specific versions of this process, whereas the term “genetic algorithm,” originally referring to a specific algorithm, has come to refer to many considerably different variations of the basic scheme.

ESs were originally formulated to work on real-valued parameter optimization problems, such as airplane wing-shape optimization. They are still most commonly applied to numerical optimization problems. In the original formulation of EP, candidate solutions to given tasks were represented as finite-state machines, which were evolved by randomly mutating their state-transition diagrams and selecting the fittest. Since that time a somewhat broader formulation has emerged. In contrast with ESs and EP, GAs were originally formulated not to solve specific problems, but rather as a means to study formally the phenomenon of adaptation as it occurs in nature and to develop ways in which the mechanisms of natural adaptation might be imported into computer systems. Only after Holland’s original theoretical work were GAs adapted to solving optimization problems. Since the early 1990s there has been much cross-fertilization among the three areas, and the original distinctions among GAs, ESs, and EP have blurred considerably in the current use of these labels.

Setting the parameters for the evolutionary process (population size, selection strength, mutation rate, crossover rate, and so on) is often a matter of guesswork and trial and error, though some theoretical and heuristic guidelines have been discovered. An alternative is to have the parameters “self-adapt”—to change their values automatically over the course of evolution in response to selective pressures. Self-adapting parameters are an intrinsic part of ESs and EP, and are the subject of much research in GAs.

EC methods have been applied widely. Examples of applications include numerical parameter optimization and combinatorial optimization, the automatic design of computer programs, bioengineering, financial prediction, robot learning, evolving production systems for artificial intelligence applications, and designing and training neural networks. In addition to these “problem-solving” applications, EC methods have been used in models of natural systems in which evolutionary processes take place, including economic systems, immune systems, ecologies, biological evolution, evolving systems with adaptive individuals, insect societies, and more complex social systems. (See Mitchell 1996 for an overview of applications in some of these areas.)

Much current research in the EC field is on making the basic EC framework more biologically realistic, both for modeling purposes and in the hope that more realism will improve the search performance of these methods. One approach is incorporating more complex genetic information in individuals in the population, such as sexual differentiation, diploidy, and introns. Another is incorporating additional genetic operators, such as inversion, translocation, and gene doubling and deletion. A third is embedding more complex ecological interactions into the population, such as host-parasite coevolution, symbiosis, sexual selection, and spatial migration. Finally, there has been considerable success in combining EC methods with other types of search methods, such as simple gradient ascent and simulated annealing. Such hybrid algorithms are thought by many to be the best approach to optimization in complex and ill-understood problem spaces (Davis 1991).

EC is relevant for the cognitive sciences both because of its applications in the fields of artificial intelligence and MACHINE LEARNING and because of its use in models of the interaction of evolution and cognitive processes. For example, researchers in EVOLUTIONARY PSYCHOLOGY and other areas have used EC methods in their models of interactions between evolution and LEARNING (e.g., Ackley and Littman 1992; Belew and Mitchell 1996; Miller and Todd 1990; Parisi, Nolfi and Elman 1994; Todd 1996). Likewise, EC methods have been used in models of the relationship between evolution and development (e.g., Belew 1993; Dellaert and Beer 1994). Social scientists and linguists have used EC methods to study, for example, the evolution of cooperation and communication in multiagent systems (e.g., Ackley and Littman 1994; Axelrod 1987; Batali 1994; Batali and Kitcher 1995; Stanley, Ashlock, and Tesfatsion 1994). Many of these models are considered to fall in the purview of the field of ARTIFICIAL LIFE. These examples by no means exhaust the uses of EC in cognitive science, and the literature is growing as interest increases in the role of evolution in shaping cognition and behavior.

—Melanie Mitchell

References

Ackley, D., and M. Littman. (1992). Interactions between learning and evolution. In C. G. Langton, C. Taylor, J. D. Farmer, and S. Rasmussen, Eds., Artificial Life II. Reading, MA: Addison-Wesley, pp. 487–509.
Ackley, D., and M. Littman. (1994). Altruism in the evolution of communication. In R. A. Brooks and P. Maes, Eds., Artificial Life IV. Cambridge, MA: MIT Press, pp. 40–48.
Axelrod, R. (1987). The evolution of strategies in the iterated Prisoner’s Dilemma. In L. D. Davis, Ed., Genetic Algorithms and Simulated Annealing. New York: Morgan Kaufmann, pp. 32–41.
Back, T. (1996). Evolutionary Algorithms in Theory and Practice: Evolution Strategies, Evolutionary Programming, Genetic Algorithms. New York: Oxford University Press.
Batali, J. (1994). Innate biases and critical periods: combining evolution and learning in the acquisition of syntax. In R. A. Brooks and P. Maes, Eds., Artificial Life IV. Cambridge, MA: MIT Press, pp. 160–177.
Batali, J., and P. Kitcher. (1995). Evolution of altruism in optional and compulsory games. Journal of Theoretical Biology 175(2): 161.
Belew, R. K. (1993). Interposing an ontogenic model between genetic algorithms and neural networks. In S. J. Hanson, J. D. Cowan, and C. L. Giles, Eds., Advances in Neural Information Processing (NIPS 5). New York: Morgan Kaufmann.
Belew, R. K., and M. Mitchell, Eds. (1996). Adaptive Individuals in Evolving Populations: Models and Algorithms. Reading, MA: Addison-Wesley.
Davis, L. D., Ed. (1991). Handbook of Genetic Algorithms. New York: Van Nostrand Reinhold.
Dellaert, F., and R. D. Beer. (1994). Toward an evolvable model of development for autonomous agent synthesis. In R. A. Brooks and P. Maes, Eds., Artificial Life IV. Cambridge, MA: MIT Press, pp. 246–257.
Fogel, D. B. (1995). Evolutionary Computation: Toward a New Philosophy of Machine Intelligence. Los Angeles: IEEE Press.
Fogel, L. J., A. J. Owens, and M. J. Walsh. (1966). Artificial Intelligence through Simulated Evolution. New York: Wiley.
Goldberg, D. E. (1989). Genetic Algorithms in Search, Optimization, and Machine Learning. Reading, MA: Addison-Wesley.
Holland, J. H. (1975). Adaptation in Natural and Artificial Systems. Ann Arbor, MI: University of Michigan Press. 2nd ed: MIT Press, 1992.
Koza, J. R. (1992). Genetic Programming: On the Programming of Computers by Means of Natural Selection. Cambridge, MA: MIT Press.
Koza, J. R. (1994). Genetic Programming II: Automatic Discovery of Reusable Programs. Cambridge, MA: MIT Press.
Michalewicz, Z. (1992). Genetic Algorithms + Data Structures = Evolution Programs. New York: Springer.
Miller, G. F., and P. M. Todd. (1990). Exploring adaptive agency I: Theory and methods for simulating the evolution of learning. In D. S. Touretzky, J. L. Elman, T. J. Sejnowski, and G. E. Hinton, Eds., Proceedings of the 1990 Connectionist Models Summer School. New York: Morgan Kaufmann, pp. 65–80.
Mitchell, M. (1996). An Introduction to Genetic Algorithms. Cambridge, MA: MIT Press.
Mitchell, M., and S. Forrest. (1994). Genetic algorithms and artificial life. Artificial Life 1(3): 267–289.
Parisi, D., S. Nolfi, and J. L. Elman. (1994). Learning and evolution in neural networks. Adaptive Behavior 3(1): 5–28.
Rechenberg, I. (1973). Evolutionsstrategie: Optimierung Technischer Systeme nach Prinzipien der Biologischen Evolution. Stuttgart: Frommann-Holzboog.
Schwefel, H.-P. (1977). Numerische Optimierung von Computer-Modellen mittels der Evolutionsstrategie. Basel: Birkhauser.
Stanley, E. A., D. Ashlock, and L. Tesfatsion. (1994). Iterated Prisoner’s Dilemma with choice and refusal of partners. In C. G. Langton, Ed., Artificial Life III. Reading, MA: Addison-Wesley, pp. 131–175.
Todd, P. M. (1996). Sexual selection and the evolution of learning. In R. K. Belew and M. Mitchell, Eds., Adaptive Individuals in Evolving Populations: Models and Algorithms. Reading, MA: Addison-Wesley, pp. 365–393.
von Neumann, J. (1966). Theory of Self-Reproducing Automata. Ed. A. W. Burks. Urbana: University of Illinois Press.

Further Readings

Holland, J. H., K. J. Holyoak, R. Nisbett, and P. R. Thagard. (1986). Induction: Processes of Inference, Learning and Discovery. Cambridge, MA: MIT Press.
Langton, C. G., Ed. (1995). Artificial Life: An Overview. Cambridge, MA: MIT Press.
Schwefel, H.-P. (1995). Evolution and Optimum Seeking. New York: Wiley.

Evolutionary Psychology

Evolutionary psychology is an approach to the cognitive sciences in which evolutionary biology is integrated with the cognitive, neural, and behavioral sciences to guide the systematic mapping of the species-typical computational and neural architectures of animal species, including humans.

Although the field draws on many disciplines, of particular importance was the integration of (1) the cognitive study of functional specializations pioneered in perception and Chomskyan psycholinguistics (MARR 1982); (2) hunter-gatherer and primate studies (Lee and DeVore 1968); and (3) the revolution that placed evolutionary biology on a more rigorous, formal foundation of replicator dynamics (Williams 1966; Dawkins 1982). Beginning in the 1960s, this revolution catalyzed the derivation of a set of theories
about how evolution shapes organic design with respect to kinship, foraging, parental care, mate selection, COOPERATION AND COMPETITION, aggression, communication, life history, and so forth—theories that were refined and tested on an empirical base that now includes thousands of species. This body of theory has allowed evolutionary psychologists to apply the concepts and methods of the cognitive sciences to nontraditional topics, such as reciprocation, foraging memory, parental motivation, coalitional dynamics, incest avoidance, sexual jealousy, and so on. Evolutionary psychology is unusual in that a primary goal is the construction of a comprehensive map of the entire species-typical computational architecture of humans, including motivational and emotional mechanisms, and that its scope includes all human behavior, rather than simply “cold cognition.”

George Williams’s (1966) volume Adaptation and Natural Selection was of particular formative significance to evolutionary psychology. Williams identified the defects in the imprecise, panglossian functionalist thinking that had pervaded evolutionary biology and that continues, implicitly, to permeate other fields. The book outlined the principles of modern adaptationism (see ADAPTATION AND ADAPTATIONISM), showed how tightly constrained any adaptationist (i.e., functionalist) or by-product claim had to be to be consistent with neo-Darwinism, and identified the empirical tests such claims had to pass. Until Williams, many biologists explained the existence of a trait (or attributed functionality to traits) by identifying some beneficial consequence (to the individual, the social group, the ecosystem, the species, etc.). They did so without regard to whether the functionality or benefit was narrowly coupled, as neo-Darwinism requires, to a design that led to systematic genic propagation of replicas of itself within the context of the species’ ancestral environment. Evolutionary psychologists apply these precise adaptationist constraints on functionalism to the cognitive, neural, and social sciences, and maintain that cognitive scientists should at least be aware that many cognitive theories routinely posit complex functional organization of kinds that evolutionary processes are unlikely to produce.

Evolutionary psychologists consider their field methodologically analogous to reverse engineering in computer science. In such an enterprise, evolutionary psychologists argue, knowledge of the evolutionary dynamics and ancestral task environments responsible for the construction of each species’ architecture can provide valuable, although incomplete, models of the computational problems (sensu Marr 1982) that each species regularly encountered. These, in turn, can be used to pinpoint many candidate design features of the computational devices that could have evolved to solve these problems, which can then be used to guide empirical investigations. For example, if eye direction reliably provided useful information ancestrally about the intentions of conspecifics or predators, then specialized eye direction detectors may have evolved as a component of SOCIAL COGNITION, and it may prove worthwhile testing for their existence and design (Baron-Cohen 1995).

Evolutionary psychologists consider it likely that cognitive architectures contain a large number of evolved computational devices that are specialized in function (Gallistel 1995), such as FACE RECOGNITION systems, a language acquisition device, navigation specializations, and animate motion recognition. They are skeptical that an architecture consisting predominantly of content-independent cognitive processes, such as general-purpose pattern associators, could solve the diverse array of adaptive problems efficiently enough to reproduce themselves reliably in complex, unforgiving natural environments that include, for example, antagonistically coevolving biotic adversaries, such as parasites, prey, predators, competitors, and incompletely harmonious social partners.

Selection drives design features to become incorporated into architectures in proportion to the actual distribution of adaptive problems encountered by a species over evolutionary time. There is no selection to generalize the scope of problem solving to include never or rarely encountered problems at the cost of efficiency in solving frequently encountered problems. To the extent that problems cluster into types (domains) with statistically recurrent properties and structures (e.g., facial expression statistically cues emotional state), it will often be more efficient to include computational specializations tailored to inferentially exploit the recurrent features of the domain (objects always have locations, are bounded by surfaces, cannot pass through each other without deformation, can be used to move each other, etc.). Because the effects of selection depend on iteration over evolutionary time, evolutionary psychologists expect the detailed design features of domain-specific inference engines to intricately reflect the enduring features of domains. Consequently, evolutionary psychologists are very interested in careful studies of enduring environmental and task regularities, because these predict details of functional design (Shepard 1987). Adaptationist predictions of DOMAIN SPECIFICITY have gained support from many sources, for example, from cognitive neuroscience, demonstrating that many dissociable cognitive deficits show surprising content-specificity, and from developmental research indicating that infants come equipped with evolved domain-specific inference engines (e.g., a NAIVE PHYSICS, a THEORY OF MIND module; Hirschfeld and Gelman 1994).

A distinguishing feature of evolutionary psychology is that evolutionary psychologists have principled theoretical reasons for their hypotheses derived from biology, paleoanthropology, GAME THEORY, and hunter-gatherer studies. Such theoretically derived prior hypotheses allow researchers to devise experiments that make possible the detection and mapping of computational devices that no one would otherwise have thought to test for in the absence of such theories. To the extent that the evolutionary theory used is accurate, evolutionary psychologists argue that this practice allows a far more efficient research strategy than experiments designed and conducted in ignorance of the principles of evolved design or the likely functions of the brain. Using this new research program, many theoretically motivated discoveries have been made about, for instance, internal representations of trajectories; computational specializations for reasoning about danger, social exchanges, and threats; female advantage in the incidental learning of the spatial locations of objects; the frequency format of probabilistic reasoning representations; the decision rules governing risk aversion and its absence; universal mate selection criteria and standards of beauty; eye direction detection and its relationship to theory of mind; principles of generalization; life history shifts in aggression and parenting decisions; social memory; reasoning about groups and coalitions; the organization of jealousy; and scores of other topics (see Barkow, Cosmides, and Tooby 1992 for review).

Although some critics (Gould 1997) have argued that the field consists of post hoc storytelling, it is difficult to reconcile such claims with the actual practice of evolutionary psychologists, inasmuch as in evolutionary psychology the evolutionary model or “explanation” precedes the empirical discovery and guides researchers to it, rather than being constructed post hoc to explain some known fact. Although critics have also plausibly maintained that reconstructions of the past are inherently speculative, evolutionary psychologists have responded that researchers know with certainty or high confidence thousands of important things about our ancestors, many of which can be deployed in designing cognitive experiments: our ancestors had two sexes; lived in an environment where self-propelled motion reliably predicted that the entity was an animal; inhabited a world where the motions of objects conformed to the principles of kinematic geometry; chose mates; had color vision; were predated upon; had faces; lived in a biotic environment with a hierarchical taxonomic structure; and so on. Moreover, evolutionary psychologists point out that, to the extent that reconstructions are uncertain, they will simply lead to experiments that are no more or less likely to be productive than evolutionarily agnostic empiricism, the alternative research strategy.

Similarly, critics have argued that adaptationist analysis is misconceived, because adaptations are of poor quality, rendering functional predictions irrelevant (Gould 1997). Evolutionary psychologists respond that although selection does not optimize, it demonstrably produces well-engineered adaptations to long-enduring adaptive problems. Indeed, whenever engineers have attempted to duplicate any natural competence (color vision, object recognition, grammar acquisition, texture perception, object manipulation, locomotion over natural terrains, language comprehension, etc.), even when using huge budgets, large research teams, and decades of effort, they are unable to engineer artificial systems that can come close to competing with naturally engineered systems.

The processes of evolutionary change divide into two families: chance and selection. Chance processes (drift, mutation pressure, environmental change, etc.) produce random evolutionary change, and so cannot build organic structure more functionally organized than chance could account for. Natural selection, in contrast, is the only component of the evolutionary process that sorts features into or out of the architecture on the basis of how well they function. Consequently, all cognitive organization that is too improbably well-ordered with respect to function to have arisen by chance must be attributed to the operation of selection, a constrained set of processes that restrict the kinds of functional organization that can appear in organisms. As a result, features of a species’ cognitive or neural architecture can be partitioned into adaptations, which are present because they were selected for (e.g., the enhanced recognition system for snakes coupled with a decision-rule to acquire a motivation to avoid them); by-products, which are present because they are causally coupled to traits that were selected for (e.g., the avoidance of harmless snakes); and noise, which was injected by the stochastic components of evolution (e.g., the fact that a small percentage of humans sneeze when exposed to sunlight). One payoff of integrating adaptationist analysis with cognitive science was the realization that complex functional structures (computational or anatomical), in species with life histories like humans, will be overwhelmingly species-typical (Tooby and Cosmides 1990a). That is, the complex adaptations that compose the human COGNITIVE ARCHITECTURE must be human universals, while variation caused by genetic differences is predominantly noise: minor random perturbations around the species-typical design. This principle allows cross-cultural triangulation of the species-typical design, which is why many evolutionary psychologists include cross-cultural components in their research.

Evolutionary psychologists emphasize the study of adaptations and their by-products not because they think all or most traits are adaptations (or their side effects), but because (1) at present, adaptationist theories of function provide clear and useful prior predictions about cognitive organization; (2) the functional elements are far more likely to be species-typical and hence experimentally extractable; (3) analysis of the random or contingent components of evolution provides very few constrained or falsifiable predictions about cognitive architecture; and (4) theories of phylogenetic constraint are not yet very useful or well developed, although that may change. Evolutionary psychologists do not maintain that all traits are adaptive, that the realized architecture of the human mind is immune to modification, that genes or biology are deterministic, that culture is unimportant, or that existing human social arrangements are fair or inevitable. Indeed, they provide testable theories about the developmental processes that build (and can change) the mechanisms that generate human behavior.

See also ALTRUISM; EVOLUTION; MODULARITY OF MIND; SEXUAL ATTRACTION, EVOLUTIONARY PSYCHOLOGY OF; SOCIAL COGNITION IN ANIMALS; SOCIOBIOLOGY

—Leda Cosmides and John Tooby

References

Barkow, J., L. Cosmides, and J. Tooby, Eds. (1992). The Adapted Mind: Evolutionary Psychology and the Generation of Culture. New York: Oxford University Press.
Baron-Cohen, S. (1995). Mindblindness: An Essay on Autism and Theory of Mind. Cambridge, MA: MIT Press.
Dawkins, R. (1982). The Extended Phenotype. San Francisco: W. H. Freeman.
Gallistel, C. R. (1995). The replacement of general-purpose theories with adaptive specializations. In M. S. Gazzaniga, Ed., The Cognitive Neurosciences. Cambridge, MA: MIT Press.
Gould, S. J. (1997). Evolution: the pleasures of pluralism. New York Review of Books 44(11): 47–52.
Hirschfeld, L., and S. Gelman, Eds. (1994). Mapping the Mind: Domain Specificity in Cognition and Culture. New York: Cambridge University Press.
Lee, R. B., and I. DeVore, Eds. (1968). Man the Hunter. Chicago: Aldine.
Marr, D. (1982). Vision. Cambridge, MA: MIT Press.
Shepard, R. N. (1987). Evolution of a mesh between principles of the mind and regularities of the world. In J. Dupre, Ed., The Latest on the Best: Essays on Evolution and Optimality. Cambridge, MA: The MIT Press.
Tooby, J., and L. Cosmides. (1990a). On the universality of human nature and the uniqueness of the individual: the role of genetics and adaptation. Journal of Personality 58: 17–67.
Tooby, J., and L. Cosmides. (1992). The psychological foundations of culture. In J. Barkow, L. Cosmides, and J. Tooby, Eds., The Adapted Mind: Evolutionary Psychology and the Generation of Culture. New York: Oxford University Press.
Williams, G. C. (1966). Adaptation and Natural Selection. Princeton: Princeton University Press.

Further Readings

Atran, S. (1990). The Cognitive Foundations of Natural History. New York: Cambridge University Press.
Brown, D. E. (1991). Human Universals. New York: McGraw-Hill.
Buss, D. M. (1994). The Evolution of Desire. New York: Basic Books.
Carey, S., and R. Gelman, Eds. (1991). Epigenesis of the Mind: Essays in Biology and Knowledge. Hillsdale, NJ: Erlbaum.
Cosmides, L., and J. Tooby. (1992). Cognitive adaptations for social exchange. In J. Barkow, L. Cosmides, and J. Tooby, Eds., The Adapted Mind: Evolutionary Psychology and the Generation of Culture. New York: Oxford University Press.
Daly, M., and M. Wilson. (1995). Discriminative parental solicitude and the relevance of evolutionary models to the analysis of motivational systems. In M. S. Gazzaniga, Ed., The Cognitive Neurosciences. Cambridge, MA: MIT Press.
Daly, M., and M. Wilson. (1988). Homicide. New York: Aldine.
Ekman, P. (1993). Facial expression and emotion. American Psychologist 48: 384–392.
Gigerenzer, G., and K. Hug. (1992). Domain specific reasoning: social contracts, cheating, and perspective change. Cognition 43: 127–171.
Krebs, J. R., and N. B. Davies. (1997). Behavioural Ecology: An Evolutionary Approach. 4th ed. Sunderland, Mass.: Sinauer Associates.
Maynard Smith, J. (1982). Evolution and the Theory of Games. Cambridge: Cambridge University Press.
Pinker, S. (1997). How the Mind Works. New York: W. W. Norton.
Rozin, P. (1976). The evolution of intelligence and access to the cognitive unconscious. In J. M. Sprague and A. N. Epstein, Eds., Progress in Psychobiology and Physiological Psychology. New York: Academic Press.
Shepard, R. N. (1987). Toward a universal law of generalization for psychological science. Science 237: 1317–1323.
Sherry, D., and D. Schacter. (1987). The evolution of multiple memory systems. Psychological Review 94: 439–454.
Spelke, E. (1990). Principles of object perception. Cognitive Science 14: 29–56.
Sperber, D. (1994). The modularity of thought and the epidemiology of representations. In L. Hirschfeld and S. Gelman, Eds., Mapping the Mind: Domain-Specificity in Cognition and Culture. Cambridge: Cambridge University Press.
Sperber, D. (1996). Explaining Culture: A Naturalistic Approach. Cambridge: Blackwell.
Staddon, J. E. R. (1988). Learning as inference. In R. C. Bolles and M. D. Beecher, Eds., Evolution and Learning. Hillsdale, NJ: Erlbaum.
Stephens, D., and J. Krebs. (1986). Foraging Theory. Princeton, NJ: Princeton University Press.
Symons, D. (1979). The Evolution of Human Sexuality. New York: Oxford University Press.
Tooby, J., and L. Cosmides. (1990b). The past explains the present: emotional adaptations and the structure of ancestral environments. Ethology and Sociobiology 11: 375–424.

Expert Systems

See EXPERTISE; KNOWLEDGE-BASED SYSTEMS

Expertise

Expertise refers to the mechanisms underlying the superior achievement of an expert, that is, “one who has acquired special skill in or knowledge of a particular subject through professional training and practical experience” (Webster’s, 1976: 800). The term expert is used to describe highly experienced professionals such as medical doctors, accountants, teachers, and scientists, but has been expanded to include individuals who attained their superior performance by instruction and extended practice: highly skilled performers in the arts, such as music, painting, and writing; sports, such as swimming, running, and golf; and games, such as bridge, chess, and billiards.

When experts exhibit their superior performance in public their behavior looks so effortless and natural that we are tempted to attribute it to special talents. Although a certain amount of knowledge and training seems necessary, the role of acquired skill for the highest levels of achievement has traditionally been minimized. However, when scientists began measuring the experts’ supposedly superior powers of speed, memory and intelligence with psychometric tests, no general superiority was found—the demonstrated superiority was domain-specific. For example, the superiority of the CHESS experts’ memory was constrained to regular chess positions and did not generalize to other types of materials (Djakow, Petrowski and Rudik 1927). Not even IQ could distinguish the best among chess-players (Doll and Mayr 1987) nor the most successful and creative among artists and scientists (Taylor 1975). Ericsson and Lehmann (1996) found that

1. Measures of general basic capacities do not predict success in a domain.
2. The superior performance of experts is often very domain-specific, and transfer outside their narrow area of expertise is surprisingly limited.
3. Systematic differences between experts and less proficient individuals nearly always reflect attributes acquired by the experts during their lengthy training.

In a pioneering empirical study of the thought processes mediating the highest levels of performance, de Groot (1978) instructed expert and world-class chess players to think aloud while they selected their next move for an unfamiliar chess position. The world-class players did not differ in the speed of their thoughts or the size of their basic memory capacity, and their ability to recognize promising potential moves was based on their extensive experience and knowledge of patterns in chess. In their influential theory of expertise, Chase and Simon (1973; Simon and Chase 1973) proposed that experts with extended experience acquire a larger number of more complex patterns and use these new patterns to store knowledge about which actions should be taken in similar situations. According to this influential theory, expert performance is viewed as an extreme case of skill acquisition (Proctor and Dutta 1995; Richman et al. 1996; VanLehn 1996) and as the final result of the gradual improvement of performance during extended experience in a domain. Furthermore, the postulated central role of acquired knowledge has encouraged efforts to extract experts’ knowledge so that computer scientists can build expert systems that would allow a computer to act as an expert (Hoffman 1992).

Among investigators of expertise, it has generally been assumed that the performance of experts improves as a direct function of increases in their knowledge through training and extended experience. However, recent studies show that there are, at least, some domains where “experts” perform no better than less trained individuals (cf. outcomes of therapy by clinical psychologists; Dawes 1994) and that sometimes experts’ decisions are no more accurate than beginners’ decisions and simple decision aids (Camerer and Johnson 1991; Bolger and Wright 1992). Most individuals who start as active professionals or as beginners in a domain change their behavior and increase their performance for a limited time until they reach an acceptable level.

physics, sports, and medicine (Chi, Glaser, and Farr 1988; Ericsson and Smith 1991; Starkes and Allard 1993). For appropriate challenging problems experts do not just automatically extract patterns and retrieve their response directly from memory. Instead, they select the relevant information and encode it in special representations in WORKING MEMORY that allow PLANNING, evaluation and reasoning about alternative courses of action (Ericsson and Lehmann 1996). Hence, the difference between experts and less skilled subjects is not merely a matter of the amount and complexity of the accumulated knowledge; it also reflects qualitative differences in the organization of knowledge and its representation (Chi, Glaser, and Rees 1982). Experts’ knowledge is encoded around key domain-related concepts and solution procedures that allow rapid and reliable retrieval whenever stored information is relevant. Less skilled subjects’ knowledge, in contrast, is encoded using everyday concepts that make the retrieval of even their limited relevant knowledge difficult and unreliable. Furthermore, experts have acquired domain-specific memory skills that allow them to rely on long-term memory (Ericsson and Kintsch 1995) to dramatically expand the amount of information that can be kept accessible during planning and during reasoning about alternative courses of action. The superior quality of the experts’ mental representations allows them to adapt rapidly to changing circumstances and anticipate future events in advance. The same acquired representations appear to be essential for experts’ ability to monitor and evaluate their own performance (Ericsson 1996; Glaser 1996) so that they can keep improving their own performance by designing their own
Beyond training and assimilating new knowledge. this point, however, further improvements appear to be See also DOMAIN SPECIFICITY; EXPERT SYSTEMS; unpredictable, and the number of years of work and leisure KNOWLEDGE-BASED SYSTEMS; KNOWLEDGE REPRESENTA- experience in a domain is a poor predictor of attained per- TION; PROBLEM SOLVING formance (Ericsson and Lehmann 1996). Hence, continued —Anders Ericsson improvements (changes) in achievement are not automatic consequences of more experience, and in those domains References where performance consistently increases, aspiring experts seek out particular kinds of experience, that is, deliberate Bolger, F., and G. Wright. (1992). Reliability and validity in expert practice (Ericsson, Krampe, and Tesch-Römer 1993)— judgment. In G. Wright and F. Bolger, Eds., Expertise and Decision Support. New York: Plenum, pp. 47–76. activities designed, typically by a teacher, for the sole pur- Camerer, C. F., and E. J. Johnson. (1991). The process-perfor- pose of effectively improving specific aspects of an individ- mance paradox in expert judgment: how can the experts know ual’s performance. For example, the critical difference so much and predict so badly? In K. A. Ericsson and J. Smith between expert musicians differing in the level of attained (Eds.), Towards a General Theory of Expertise: Prospects and solo performance concerned the amounts of time they had Limits. Cambridge: Cambridge University Press, pp. 195–217. spent in solitary practice during their music development, Charness, N., R. T. Krampe, and U. Mayr. (1996). The role of prac- which totaled around ten thousand hours by age twenty for tice and coaching in entrepreneurial skill domains: an interna- the best experts, around five thousand hours for the least- tional comparison of life-span chess skill acquisition. In K. A. accomplished expert musicians, and only two thousand Ericsson, Ed., The Road to Excellence: The Acquisition of hours for serious amateur pianists. 
More generally, the accu- Expert Performance in the Arts and Sciences, Sports, and Games. Mahwah, NJ: Erlbaum, pp. 51–80. mulated amount of deliberate practice is closely related to Chase, W. G., and H. A. Simon. (1973). The mind’s eye in chess. the attained level of performance of many types of experts, In W. G. Chase, Ed., Visual Information Processing. New York: such as musicians (Ericsson, Krampe, and Tesch-Römer Academic Press, pp. 215–281. 1993; Sloboda et al. 1996), chess players (Charness, Chi, M. T. H., R. Glaser, and M. J. Farr, Eds. (1988). The Nature of Krampe, and Mayr 1996) and athletes (Starkes et al. 1996). Expertise. Hillsdale, NJ: Erlbaum. The recent advances in our understanding of the complex Chi, M. T. H., R. Glaser, and E. Rees. (1982). Expertise in problem representations, knowledge and skills that mediate the supe- solving. In R. S. Sternberg, Ed., Advances in the Psychology of rior performance of experts derive primarily from studies Human Intelligence, vol. 1. Hillsdale, NJ: Erlbaum, pp. 1–75. where experts are instructed to think aloud while completing Dawes, R. M. (1994). House of Cards: Psychology and Psycho- representative tasks in their domains, such as chess, music, therapy Built on Myth. New York: Free Press. 300 Explanation explanation and its role in thinking has been addressed by Djakow, I. N., N. W. Petrowski, and P. Rudik. (1927). Psychologie des Schachspiels [Psychology of Chess]. Berlin: Walter de philosophers, psychologists, and artificial intelligence Gruyter. researchers. Doll, J. and U. Mayr. (1987). Intelligenz und Schachleistung—eine The main philosophical concern has been to characterize Untersuchung an Schachexperten. [Intelligence and achieve- the nature of explanations in science. In 1948, Hempel and ment in chess—a study of chess masters]. Psychologische Oppenheim proposed the deductive-nomological (D-N) Beitrge 29: 270–289. model of explanation, according to which an explanation is de Groot, A. (1978). 
Thought and Choice in Chess. The Hague: an argument that deduces a description of a fact to be Mouton. (Original work published 1946.) explained from general laws and descriptions of observed Ericsson, K. A. (1996). The acquisition of expert performance: an facts (Hempel 1965). For example, to explain an eclipse of introduction to some of the issues. In K. A. Ericsson, Ed., The the sun, scientists use laws of planetary motion to deduce that Road to Excellence: The Acquisition of Expert Performance in the Arts and Sciences, Sports, and Games. Mahwah, NJ: at a particular time the moon will pass between the earth and Erlbaum, pp. 1–50. the sun, producing an eclipse. Many artificial intelligence Ericsson, K. A., and W. Kintsch. (1995). Long-term working mem- researchers also assume that explanations consist of deductive ory. Psychological Review 102: 211–245. proofs (e.g., Mitchell, Keller, and Kedar-Cabelli 1986). Ericsson, K. A., R. T. Krampe, and C. Tesch-Römer. (1993). The Although the D-N model gives a good approximate role of deliberate practice in the acquisition of expert perfor- account of explanation in some areas of science, particularly mance. Psychological Review 100: 363–406. mathematical physics, it does not provide an adequate gen- Ericsson, K. A., and A. C. Lehmann. (1996). Expert and excep- eral account of explanation. Some explanations are induc- tional performance: evidence on maximal adaptations on task tive and statistical rather than deductive, showing only that constraints. Annual Review of Psychology 47: 273–305. an event to be explained is likely or falls under some proba- Ericsson, K. A., and J. Smith, Eds. (1991). Toward a General The- ory of Expertise: Prospects and Limits. Cambridge, England: bilistic law rather than that it follows deductively from laws Cambridge University Press. (see DEDUCTIVE REASONING and INDUCTION). For example, Glaser, R. (1996). 
Changing the agency for learning: acquiring we explain why people get influenza in terms of their expo- expert performance. In K. A. Ericsson, Ed., The Road to Excel- sure to the influenza virus, but many people exposed to the lence: The Acquisition of Expert Performance in the Arts and Sci- virus do not get sick. In areas of science such as evolution- ences, Sports, and Games. Mahwah, NJ: Erlbaum, pp. 303–311. ary biology, scientists cannot predict how different species Hoffman, R. R., Ed. (1992). The Psychology of Expertise: Cogni- will evolve, but they can use the theory of evolution by natu- tive Research and Empirical AI. New York: Springer. ral selection and the fossil record to explain how a given Proctor, R. W., and A. Dutta. (1995). Skill Acquisition and Human species has evolved. Often, the main concern of explanation Performance. Thousand Oaks, CA: Sage. is not so much to deduce what is to be explained from gen- Richman, H. B., F. Gobet, J. J. Staszewski, and H. A. Simon. (1996). Perceptual and memory processes in the acquisition of eral laws as it is to display causes (Salmon 1984). Under- expert performance: the EPAM model. In K. A. Ericsson, Ed., standing an event or class of events then consists of The Road to Excellence: The Acquisition of Expert Perfor- describing the relevant causes and causal mechanisms that mance in the Arts and Sciences, Sports, and Games. Mahwah, produce such events. Salmon (1989) and Kitcher (1989) NJ: Erlbaum, pp. 167–187. provide good reviews of philosophical discussions of the Simon, H. A., and W. G. Chase. (1973). Skill in chess. American nature of scientific explanation. According to Friedman Scientist 61: 394–403. (1974) and Kitcher, explanations yield understanding by Sloboda, J. A., J. W. Davidson, M. J. A. Howe, and D. G. Moore. unifying facts using common patterns. (1996). The role of practice in the development of performing Deduction from laws is just one of the ways that facts can musicians. 
British Journal of Psychology 87: 287–309. be explained by fitting them into a more general, unifying Starkes, J. L., and F. Allard, Eds. (1993). Cognitive Issues in Motor Expertise. Amsterdam: North Holland. framework. More generally, explanation is a process of Starkes, J. L., J. Deakin, F. Allard, N. J. Hodges, and A. Hayes. applying a schema that fits what is to be explained into a sys- (1996). Deliberate practice in sports: what is it anyway? In K. tem of information. An explanation schema consists of an A. Ericsson, Ed., The Road to Excellence: The Acquisition of explanation target, which is a question to be answered, and Expert Performance in the Arts and Sciences, Sports, and an explanatory pattern, which provides a general way of Games. Mahwah, NJ: Erlbaum, pp. 81–106. answering the question. For example, when you want to Taylor, I. A. (1975). A retrospective view of creativity investiga- explain why a person is doing an action such as working long tion. In I. A. Taylor and J. W. Getzels, Eds., Perspectives in hours, you may employ a rough explanation schema like: Creativity. Chicago: Aldine. pp. 1–36. VanLehn, K. (1996). Cognitive skill acquisition. Annual Review of Explanation target: Psychology 47: 513–539. Why does a person with a set of beliefs and desires perform Webster’s Third New International Dictionary. (1976). Springfield, a particular action? MA: Merriam. Explanatory pattern: The person has the belief that the action will help fulfil the Explanation desires. This belief causes the person to pursue the action. An explanation is a structure or process that provides under- To apply this schema to a particular case, we replace the standing. 
Furnishing explanations is one of the most impor- italicized terms with specific examples, as in explaining tant activities in high-level cognition, and the nature of Explanation-Based Learning 301 Mary’s action of working long hours in terms of her belief that References this will help her to fulfill her desire to finish her Ph.D. disser- Chi, M. T. H., M. Bassok, M. W. Lewis, P. Reimann, and R. Glaser. tation. Many writers in philosophy of science and cognitive (1989). Self-explanations: how students study and use examples science have described explanations and theories in terms of in learning to solve problems. Cognitive Science 13: 145–182. schemas, patterns, or similar abstractions (Kitcher 1989, 1993; Craik, K. (1943). The Nature of Explanation. Cambridge: Cam- Kelley 1972; Leake 1992; Schank 1986; Thagard 1992). bridge University Press. One kind of explanation pattern that is common in biol- Friedman, M. (1974). Explanation and scientific understanding. ogy, psychology, and sociology explains the presence of a Journal of Philosophy 71: 5–19. structure or behavior in a system by reference to how the Harman, G. (1986). Change in View: Principles of Reasoning. structure or behavior contributes to the goals of the system. Cambridge, MA: MIT Press/Bradford Books. Hempel, C. G. (1965). Aspects of Scientific Explanation. New For example, people have hearts because this organ func- York: Free Press. tions to pump blood through the body, and democracies Kelley, H. H. (1972). Causal schemata and the attribution process. conduct elections in order to allow people to choose their In E. E. Jones, D. E. Kanouse, H. H. Kelley, R. E. Nisbett, S. leaders. These functional (teleological) explanations are not Valins, and B. Weiner, Eds., Attribution: Perceiving the Causes incompatible with causal/mechanical ones: in a biological of Behavior. Morristown, NJ: General Learning Press. organism, for example, the explanation of an organ in terms Kitcher, P. (1989). 
Explanatory unification and the causal structure of its function goes hand in hand with a causal explanation of the world. In P. Kitcher and W. C. Salmon (Eds.), Scientific that the organ developed as the result of natural selection. Explanation. Minneapolis: University of Minnesota Press, pp. Craik (1943) originated the important idea that an explana- 410–505. tion of events can be accomplished by mental models that Kitcher, P. (1993). The Advancement of Science. Oxford: Oxford University Press. parallel the events in the same way that a calculating Kitcher, P. and W. Salmon. (1989). Scientific Explanation. Minne- machine can parallel physical changes. apolis: University of Minnesota Press. Analogies can contribute to explanation at a more spe- Leake, D. B. (1992). Evaluating Explanations: A Content Theory. cific level, without requiring explicit use of laws or sche- Hillsdale, NJ: Erlbaum. mas. For example, DARWIN’s use of his theory of natural Lipton, P. (1991). Inference to the Best Explanation. London: Rou- selection to explain evolution frequently invoked the famil- tledge. iar effects of artificial selection by breeders of domesticated Mitchell, T., R. Keller, and S. Kedar-Cabelli. (1986). Explanation- animals. Pasteur formed the germ theory of disease by anal- based generalization: a unifying view. Machine Learning 1: ogy with his earlier explanation that fermentation is caused 47–80. by bacteria. In analogical explanations, something puzzling Read, S. J., and A. Marcus-Newhall. (1993). Explanatory coher- ence in social explanations: a parallel distributed processing is compared to a familiar phenomenon whose causes are account. Journal of Personality and Social Psychology 65: known (see ANALOGY). 429–447. In both scientific and everyday understanding, there is Salmon, W. (1984). Scientific Explanation and the Causal Struc- often more than one possible explanation. Perhaps Mary is ture of the World. Princeton: Princeton University Press. 
working long hours merely because she is a workaholic and Salmon, W. C. (1989). Four decades of scientific explanation. In P. prefers working to other activities. One explanation of why Kitcher and W. C. Salmon, Eds., Scientific Explanation (Minne- the dinosaurs became extinct is that they were killed when an sota Studies in the History of Science, vol. 13. Minneapolis: asteroid hit the earth, but acceptance of this hypothesis must University of Minnesota Press. compare it with alternative explanations. The term inference Schank, R. C. (1986). Explanation Patterns: Understanding to the best explanation refers to acceptance of a hypothesis on Mechanically and Creatively. Hillsdale, NJ: Erlbaum. Swartout, W. (1983). XPLAIN: a system for creating and explaining the grounds that it provides a better explanation of the evi- expert consulting systems. Artificial Intelligence 21: 285–325. dence than alternative hypotheses (Harman 1986; Lipton Thagard, P. (1992). Conceptual Revolutions. Princeton: Princeton 1991; Thagard 1992). Examples of inference to the best University Press. explanation include theory choice in science and inferences we make about the mental states of other people. What social Further Readings psychologists call attribution is inference to the best explana- tion of a person’s behavior (Read and Marcus-Newhall 1993). Keil, F., and R. Wilson, Eds. (1997). Minds and Machines 8: 1– Explanations are often useful for improving the perfor- 159. mance of human and machine systems. Automated expert systems are sometimes enhanced by giving them the ability Explanation-Based Learning to produce computer-generated descriptions of their own operation so that people will be able to understand the infer- ences underlying their conclusions (Swartout 1983). Chi et Explanation-based learning (EBL) systems attempt to al. 
(1989) found that students learn better when they use improve the performance of a problem solver (PS) by first “self-explanations” that monitor progress or lack of pro- examining how the PS solved previous problems, then mod- gress in understanding problems. ifying the PS to enable it to solve similar problems better (typically, more efficiently) in the future. See also CONCEPTUAL CHANGE; EXPLANATION-BASED Many problem-solving tasks—which here include diag- LEARNING; REALISM AND ANTIREALISM; UNITY OF SCIENCE nosis, classification, PLANNING, scheduling and parsing (see —Paul Thagard also KNOWLEDGE-BASED SYSTEMS; NATURAL LANGUAGE 302 Explanation-Based Learning combinato- be the first rule considered the next time a lender was PROCESSING; CONSTRAINT SATISFACTION)—are rially difficult, inasmuch as they require finding a (possibly sought. This would allow the modified solver—call it PS´— very long) sequence of rules to reduce a given goal to a set to handle houseowning uncles in a single backward- of operational actions, or to a set of facts, and so forth. chaining step, without any backtracking. Unfortunately, this can force a problem solving system (PS) Such EBL modules first “explain” why u1 was a lender to take a long time to solve a problem. Fortunately, many by examining the derivation structure obtained for this problems are similar, which means that information query; hence the name “explanation-based learning.” Many obtained from solving one problem may be useful for solv- such systems then “collapse” the derivation structure of the ing subsequent, related problems. Some PSs therefore motivating problem: directly connecting a variablized form include an “explanation-based learning” module that learns of the conclusion to the atomic facts used in its derivation. 
from each solution: after the basic problem-solving module In general, the antecedents of the new rule are the weakest has solved a specific problem, the EBL module then modi- set of conditions required to establish the conclusion (here fies the solver (perhaps by changing its underlying rule lender(X, Y)) in the context of the given instance. The base, or by adding new control information), to produce a example itself was used to specify what information—in new solver that is able to solve this problem, and related particular, which of the facts about me and u1—were problems, more efficiently. required. As a simple example, given the information in the Pro- Although the new rnew rule is useful for queries that deal log-style logic program with houseowning uncles, it will not help when dealing with house-owning aunts or with CEO uncles. In fact, rnew will   lender(X,Y) :- relative(X,Y), rich(Y). be counterproductive for such queries, as the associated   solver PS′ will have to first consider rnew before going on to   relative(X,Y) :- aunt(X,Y)   find the appropriate derivation. If only a trivial percentage relative(X,Y) :- uncle(X,Y)   of the queries deal with houseowning uncles, then the per-   formance of PS′ will be worse than the original problem     rich(X,Y) :- ceo(Y,B), bank(B). solver, as PS′ will take longer, on average, to reach an  :- own(Y,H), house(H).  answer. This degradation is called the “utility problem” rich(Y)   (Minton 1988; Tambe, Newell, and Rosenbloom 1990).   One way to address this problem is first to estimate the  uncle(me,u 1 ) own(u 1 ,h 2 ). house(h 2 )    distribution of queries that will be posed, then evaluate the efficiency of a PS against that distribution (Greiner and (where the first rule states that a person Y may lend money Orponen 1996; Segre, Gordon, and Elkan 1996; Cohen to X if Y is a relative of X and Y is rich, etc., as well as 1990; Zweben et al. 1992). 
(Note that this estimation pro- information about a house-owning uncle u1), the PS would cess may require the EBL module to examine more than a correctly classify u1 as a lender—that is, return yes to the single example before modifying the solver.) The EBL mod- query lender(me, u1). ule can then decide whether to include a new rule, and if so, Most PSs would have had to backtrack here; first asking where it should insert that rule. (Storing the rule in front of if u1 was an aunt, and only when this subquery failed, then the rule set may not be optimal, especially after other EBL- asking whether u1 was an uncle; similarly, to establish generated rules have already been added.) Because the latter rich(u1), it would first see whether u1 was the CEO of a task is unfortunately intractable (Greiner 1991), many EBL bank before backtracking to see if he owns a house. In gen- systems involve hill-climbing to a local optimum (Greiner eral, PSs may have to backtrack many times as they search 1996; Gratch and DeJong 1996; see GREEDY LOCAL through a combinatorial space of rules until finding a SEARCH). sequence of rules that successfully reduces the given query There are, however, some implemented systems that have to a set of known facts. successfully addressed these challenges. As examples, the Although there may be no way to avoid such searches the Samuelsson and Rayner EBL module improved the perfor- first time a query is posed (note that many of these tasks are mance of a natural language parsing system by a factor of NP-hard; see COMPUTATIONAL COMPLEXITY), an EBL mod- three (Samuelsson and Rayner 1991); Zweben et al. (1992) ule will try to modify the PS to help it avoid this search the improved the performance of their constraint-based sched- second time it encounters the same query, or one similar to uling system by about 34 percent on realistic data, and it. As simply “caching” or “memorizing” (Michie 1967) the Gratch and Chien (1996), by about 50 percent. 
particular conclusion—here, lender(me, u1)—would only Explanation-based learning differs from typical MA- help the solver if it happens to encounter this exact query a CHINE LEARNING tasks in several respects. First, standard second time, most EBL modules instead incorporate more learners try to acquire new domain-level information, general information, perhaps by adding in a new rule that which the solver can then use to solve problems that it directly encodes the fact that any uncle who owns a house is could not solve previously; for example, many INDUCTION a lender; that is, learners learn a previously unknown classification func- tion, which can then be used to classify currently unclassi- lender(X, Y) :– uncle(X, Y), owns(Y, H), house(H). fied instances. By contrast, a typical EBL module does not Many EBL systems would then store this rule—call it extend the set of problems that the underlying solver could rnew—in the front of the PS’s rule set, which means it will solve (Dietterich 1986); instead, its goal is to help the Explanation-Based Learning 303 solver to solve problems more efficiently. Stated another DeJong, G. (1997). Explanation-base learning. In A. Tucker, Ed., The Computer Science and Engineering Handbook. Boca way, explanation-based learning does not extend the Raton, FL: CRC Press, pp. 499–520. deductive closure of the information already known by the DeJong, G., and R. Mooney. (1986). Explanation-based learning: solver. Such knowledge-preserving transformations can be an alternative view. Machine Learning 1(2): 145–76. critical, as they can turn correct, but inefficient-to-use Dietterich, T. G. (1986). Learning at the knowledge level. Machine information (e.g., first-principles knowledge) into useful, Learning 1(3): 287–315. (Reprinted in Readings in Machine efficient, special-purpose expertise. Learning.) Of course, the solver must know a great deal initially. Ellman, T. (1989). 
Explanation-based learning: a survey of pro- There are other learning systems that similarly exploit a grams and perspectives. Computing Surveys 21(2): 163–221. body of known information, including work in INDUCTIVE Fikes, R., P. E. Hart, and N. J. Nilsson. (1972). Learning and exe- LOGIC PROGRAMMING (attempting to build an accurate cuting generalized robot plans. Artificial Intelligence 3: 251– 288. deductive knowledge base from examples; Muggleton Gratch, J., and S. Chien. (1996). Adaptive problem-solving for 1992) and theory revision (modifying a given initial knowl- large-scale scheduling problems: a case study. Journal of Artifi- edge base, to be more accurate over a set of examples; Our- cial Intelligence Research 4: 365–396. ston and Mooney 1994; Wogulis and Pazzani 1993; Greiner Gratch, J., and G. DeJong. (1996). A decision-theoretic approach 1999; Craw and Sleeman 1990). However, these other learn- to adaptive problem solving. Artificial Intelligence 88(1–2): ing systems differ from EBL (and resemble standard learning 365–396. algorithms) by changing the deductive closure of the initial Greiner, R. (1991). Finding the optimal derivation strategy in a theory (see DEDUCTIVE REASONING). redundant knowledge base. Artificial Intelligence 50(1): 95–116. Finally, most learners require a great number of train- Greiner, R. (1996). PALO: A probabilistic hill-climbing algorithm. ing examples to be guaranteed to learn effectively; by con- Artificial Intelligence 83(1–2). Greiner, R. (1999). The complexity of theory revision. Artificial trast, many EBL modules attempt to learn from a single Intelligence. solved problem. As we saw above, this single “solved Greiner, R., and P. Orponen. (1996). Probably approximately opti- problem” is in general very structured, and moreover, mal satisficing strategies. Artificial Intelligence 82(1–2): 21– most recent EBL modules use many samples to avoid the 44. utility problem. Laird, J. E., P. S. Rosenbloom, and A. Newell. (1986). 
Universal Subgoaling and Chunking: The Automatic Generation and Learning of Goal Hierarchies. Hingham, MA: Kluwer Academic Press.

This article has focused on EBL modules that add new (entailed) base-level rules to a rule base. Other EBL modules instead try to speed up a performance task by adding new control information, for example, information that helps the solver select the appropriate operator when performing a state space search (Minton 1988). In general, EBL modules first detect characteristics that make the search inefficient, and then modify the solver to avoid poor performance in future problems. Also, although our description assumes the background theory to be “perfect,” there have been extensions to deal with theories that are incomplete, intractable, or inconsistent (Cohen 1992; Ourston and Mooney 1994; DeJong 1997).

The rules produced by an EBL module resemble the “macro-operators” built by the Abstrips planning system (Fikes, Hart, and Nilsson 1972), as well as the “chunks” built by the Soar system (Laird and Rosenbloom 1986). (Note that Rosenbloom showed that this “chunking” can model the practice effects in humans.)

See also EXPLANATION; KNOWLEDGE REPRESENTATION; PROBLEM SOLVING

—Russell Greiner

References

Cohen, W. W. (1990). Using distribution-free learning theory to analyze chunking. Proceeding of CSCSI–90 177–83.

Cohen, W. W. (1992). Abductive explanation-based learning: a solution to the multiple inconsistent explanation problems. Machine Learning 8(2): 167–219.

Craw, S., and D. Sleeman. (1990). Automating the refinement of knowledge-based systems. In L. C. Aiello, Ed., Proceedings of ECAI 90. Pitman.

Michie, D. (1967). Memo functions: a language facility with “rote learning” properties. Research Memorandum MIP–r–29, Edinburgh: Department of Machine Intelligence and Perception.

Minton, S. (1988). Learning Search Control Knowledge: An Explanation-Based Approach. Hingham, MA: Kluwer Academic Publishers.

Minton, S., J. Carbonell, C. A. Knoblock, D. R. Kuokka, O. Etzioni, and Y. Gil. (1989). Explanation-based learning: a problem solving perspective. Artificial Intelligence 40(1–3): 63–119.

Mitchell, T. M., R. M. Keller, and S. T. Kedar-Cabelli. (1986). Explanation-based generalization: a unifying view. Machine Learning 1(1): 47–80.

Muggleton, S. H. (1992). Inductive Logic Programming. Orlando, FL: Academic Press.

Ourston, D. and R. J. Mooney. (1994). Theory refinement combining analytical and empirical methods. Artificial Intelligence 66(2): 273–310.

Samuelsson, C., and M. Rayner. (1991). Quantitative evaluation of explanation-based learning as an optimization tool for a large-scale natural language system. In Proceedings of the 12th International Joint Conference on Artificial Intelligence. Los Angeles, CA: Morgan Kaufmann, pp. 609–615.

Segre, A. M., G. J. Gordon, and C. P. Elkan. (1996). Exploratory analysis of speedup learning data using expectation maximization. Artificial Intelligence 85(1–2): 301–319.

Tambe, M., A. Newell, and P. Rosenbloom. (1990). The problem of expensive chunks and its solution by restricting expressiveness. Machine Learning 5(3): 299–348.

Wogulis, J., and M. J. Pazzani. (1993). A methodology for evaluating theory revision systems: results with Audrey II. Proceedings of IJCAI–93: 1128–1134.

Zweben, M., E. Davis, B. Daun, E. Drascher, M. Deale, and M. Eskey. (1992). Learning to improve constraint-based scheduling. Artificial Intelligence 58: 271–296.

Explanatory Gap

The MIND-BODY PROBLEM—the problem of understanding the relation between physical and mental phenomena—has both a metaphysical side and an epistemological side. On the metaphysical side, there are arguments that purport to show that mental states could not be (or be realized in) physical states, and therefore some version of dualism must be true. On the epistemological side, there are arguments to the effect that even if in fact mental states are (or are realized in) physical states, there is still a deep problem about how we can explain the distinctive features of mental states in terms of their physical properties. In other words, there seems to be an “explanatory gap” between the physical and the mental (see Levine 1983 and 1993).

The distinctive features that seem to give rise to the explanatory gap are the qualitative characters of conscious experiences (or QUALIA), such as the smell of a rose or the way the blue sky looks on a clear day. With conscious creatures it seems sensible to ask, with regard to their conscious mental states, WHAT-IT’S-LIKE to have them. The answers to these questions refer to properties that seem quite unlike the sorts of properties described by neurophysiologists or computational psychologists. It is very hard to see how the qualitative character of seeing blue can be explained by reference to neural firings, or even to the information flow that encodes the spectral reflectance properties of distal objects. It always seems reasonable to ask, but why should a surface with this specific spectral reflectance look like that (as one internally points at one’s experience of blue)? For that matter, it seems reasonable to ask why there should be anything it is like at all to see blue, inasmuch as detecting and encoding information about the external world does not automatically entail having a genuine experience. After all, thermometers and desktop computers detect and encode information, but few people are tempted to ascribe experience to them.

Traditionally, dualists have employed conceivability arguments to demonstrate a metaphysical distinction between the mental and the physical. Whether or not these arguments work to establish the metaphysical thesis of dualism (for arguments pro and con see Chalmers 1996; Jackson 1993; Block and Stalnaker forthcoming), they can be employed to support the existence of the explanatory gap. We start with the assumption that adequate explanations reveal a necessary relation between the factors cited in the explanation (the explanans) and the phenomenon to be explained (the explanandum). For example, suppose we want to know why water boils at 212° F. at sea level. Given the molecular analysis of water, boiling, and temperature, together with the relevant physical and chemical laws, it becomes apparent that under these conditions water just has to boil. The point is, we can see why we should not expect anything else.

Contrast this example with what it is like to see blue. After an exhaustive specification of both the neurological and the computational details, we really do not know why blue should look the way it does, as opposed, say, to the way red or yellow looks. That is, we can still conceive of a situation in which the very same neurological and computational facts obtain, but we have an experience that is like what seeing red or yellow is like. Because we can imagine a device that processed the same information as our visual systems but was not conscious at all, it is also clear that we do not really know why our systems give rise to conscious experience of any sort, much less of this specific sort. Again, the contrast with the boiling point of water is instructive. Once we fill in the appropriate microphysical details, it does not seem conceivable that water should not boil at 212° F. at sea level.

One final example is quite helpful to make the point. Frank Jackson (1982) imagines a neuroscientist, Mary, who knows all there is to know about the physical mechanisms underlying color vision. However, she has been confined all her life to a black and white environment. One day she is allowed to emerge into the world of color and sees a ripe tomato for the first time. Does she learn anything new? Jackson claims that obviously she does; she learns what red looks like. But if all the information she had before really explained what red looked like, it seems as if she should have been able to predict what it would look like. Thus her revelation on emerging from the colorless world supports the existence of the explanatory gap.

There are various responses to the explanatory gap. One view (see McGinn 1991) is that it reflects a limitation on our cognitive capacities. We just do not have, and are constitutionally incapable of forming, the requisite concepts to bridge the gap. Others argue that the gap is real but that it is to be expected given certain peculiarities associated with our first-person access to experience (see Lycan 1996; Rey 1996; Tye 1995). Just as one cannot derive statements involving indexicals from those that do not—e.g., “I am here now” from “Joe Levine is at home on Saturday, June 27, 1997”—so one cannot derive statements containing terms like “the way blue looks” from those that contain only neurological or computational terms. Still others (see Churchland 1985) argue that advocates of the explanatory gap just do not appreciate how much one could explain given a sufficient amount of neurological detail. Finally, some see in the explanatory gap evidence that the very notion of qualitative character at issue is confused and probably inapplicable to any real phenomenon (see Dennett 1991 and Rey 1996). On this view qualia literally do not exist. Although we have experiences, they do not actually possess the features we naively take to be definitive of them.

There are complex and subtle issues involved with each of these responses. For instance, the precise role of identity statements in explanations, and the degree to which identities themselves are susceptible of explanation, must be explored more fully. (For a lengthy discussion of the conceivability argument that deals with these issues, see Levine forthcoming.) Suffice it to say that no consensus yet exists on the best way to respond to the explanatory gap.

See also CONSCIOUSNESS; EXPLANATION; INTENTIONALITY; PHYSICALISM

—Joseph Levine

References

Block, N. J., and R. Stalnaker. (Forthcoming). Conceptual Analysis and the Explanatory Gap.

Chalmers, D. (1996). The Conscious Mind. Oxford: Oxford University Press.

Churchland, P. (1985). Reduction, qualia, and the direct introspection of brain states. Journal of Philosophy 82: 8–28.

Dennett, D. C. (1991). Consciousness Explained. Boston: Little, Brown.

Jackson, F. (1982). Epiphenomenal qualia. Philosophical Quarterly 32: 127–136.

Jackson, F. (1993). Armchair metaphysics. In J. O’Leary-Hawthorne and M. Michael, Eds., Philosophy in Mind. Dordrecht: Kluwer.

Levine, J. (1983). Materialism and qualia: the explanatory gap. Pacific Philosophical Quarterly 64: 354–361.

Levine, J. (1993). On leaving out what it’s like. In M. Davies and G. Humphreys, Eds., Consciousness: Psychological and Philosophical Essays. Oxford: Blackwell, pp. 121–136.

Levine, J. (Forthcoming). Conceivability and the metaphysics of mind.

Lycan, W. G. (1996). Consciousness and Experience. Cambridge, MA: Bradford Books/MIT Press.

McGinn, C. (1991). The Problem of Consciousness. Oxford: Blackwell.

Rey, G. (1996). Contemporary Philosophy of Mind: A Contentiously Classical Approach. Oxford: Blackwell.

Tye, M. (1995). Ten Problems of Consciousness: A Representational Theory of the Phenomenal Mind. Cambridge, MA: Bradford Books/MIT Press.

Further Readings

Clark, A. (1993). Sensory Qualities. Oxford: Oxford University Press.

Dretske, F. (1995). Naturalizing the Mind. Cambridge, MA: Bradford Books/MIT Press.

Flanagan, O. (1992). Consciousness Reconsidered. Cambridge, MA: Bradford Books/MIT Press.

Hardcastle, V. G. (1995). Locating Consciousness. Amsterdam/Philadelphia: John Benjamins.

Hardin, C. L. (1987). Qualia and materialism: closing the explanatory gap. Philosophy and Phenomenological Research 47(2).

Levin, J. (1983). Functionalism and the argument from conceivability. Canadian Journal of Philosophy, sup. vol. 11.

Loar, B. (1990). Phenomenal states. In J. Tomberlin, Ed., Action Theory and Philosophy of Mind; Philosophical Perspectives 4. Atascadero, CA: Ridgeview Publishing Co., pp. 81–108.

Metzinger, T., Ed. (1995). Conscious Experience. Paderborn: Ferdinand Schöningh/Imprint Academic.

Yablo, S. (1993). Is conceivability a guide to possibility? Philosophy and Phenomenological Research 53(1): 1–42.

Explicit Memory

See IMPLICIT VS. EXPLICIT MEMORY

Extensionality, Thesis of

The thesis of extensionality says that every meaningful declarative sentence is equivalent to some extensional sentence. Understanding this thesis requires understanding the terms in italics.

Two sentences are equivalent if and only if they have the same truth-value in all possible circumstances. The following are equivalent because no matter how far Jody actually ran, the sentences are either both true or both false:

Jody just jogged 3.1 miles.
Jody just jogged 5 kilometers.

When the replacement of a component of a sentence with another component always results in a sentence that has the same truth-value as the original (true if the original is true and false if the original is false), this is a replacement that preserves truth-value.

Terms of different kinds have extensions of different kinds. The extension of a name or description is the individual or individuals to which the name or description applies. The extension of a one-place predicate such as “is a synapse” is the class of all the individuals, in this case all the synapses, to which the predicate applies. The extension of an n-place predicate is a set of ordered n-tuples to which the predicate applies. The extension of a declarative sentence, which one can regard as a zero-place predicate, is its truth-value.

Two terms are coextensive only when they have the same extension: two sentences that have the same truth-value, two names that refer to exactly the same thing (or things), and so on.

A sentence is extensional only when each and every replacement of a component of a sentence with a coextensive term preserves truth-value.

If “Stan’s car” and “the oldest Volvo in North Carolina” are coextensive, then replacing the first with the second in (a) “Stan’s car is in the driveway” preserves truth-value. So far as this replacement shows, then, (a) is extensional. A similar replacement in (b) “Hillis thinks that Stan’s car is a new Jaguar” does not preserve truth-value. Hillis does not think that the oldest Volvo in North Carolina is a new Jaguar. Statements about PROPOSITIONAL ATTITUDES such as thinking, believing, fearing, and hoping are typically not extensional.

If “The Mercury Track Team” and “The Vanguard Video Club” are coextensive because each expression names a group with exactly the same members, then replacing the first with the second in (c) “Nobody in the Mercury Track Team smokes cigars” preserves truth-value. A similar replacement in (d) “The Captain of the Mercury Track Team is Lou Silver” will not preserve truth-value if (d) is true and the Vanguard Video Club does not have a captain. Examples of this sort show at least that clubs are not sets.

“Vicki will discover the greatest prime number” and “Vicki will win the New Jersey Lottery” are coextensive because they are both false. Replacing the first with the second in (e) “It isn’t so that Vicki will discover the greatest prime number” preserves truth-value. A similar replacement in (f), “It is absolutely impossible that Vicki will discover the greatest prime number,” does not preserve truth-value. However unlikely, it is still possible that Vicki will hit the jackpot. The existence of a largest prime number, in contrast, is not merely unlikely; it is impossible. Modal statements about what is possible or necessary, treated by MODAL LOGIC, are often not extensional. The connectives of standard propositional LOGIC such as not, and, or, if, and if and only if are truth-functional. Replacement in a truth-functional sentence of any component with another with the same truth-value preserves the truth-value of the original.

The thesis of extensionality, a version of REDUCTIONISM, says that every meaningful, declarative sentence is equivalent to some extensional sentence. This does not require that every sentence be extensional but rather that every nonextensional sentence about psychological attitudes, modality, laws, counterfactual conditionals, and so forth, have an extensional equivalent.

An historically important statement of the thesis of extensionality appears in Wittgenstein (1922): “A proposition is a truth-function of elementary propositions (Proposition 5).”

Wittgenstein’s later philosophy abandons both elementary propositions and the thesis of extensionality. Russell expresses sympathy for the thesis in several places including “Truth-Functions and Others,” Appendix C of Whitehead and Russell (1927).

Rudolf Carnap formulates the thesis of extensionality (hereafter abbreviated TOE) as a relation between extensional and nonextensional languages (see especially Carnap 1937, sect. 67). The truth of TOE promises a greater intelligibility of the world. Extensional languages have “radically simpler structures and hence simpler constitutive rules” than nonextensional languages (Carnap 1958: 42). If “the universal language of science” (Carnap 1937: 245) is extensional, therefore, we can discuss exhaustively every scientific phenomenon in a language that has a radically simple structure.

Carnap defends TOE by finding an extensional sentence about language that he thinks is equivalent to a given nonextensional sentence. The modal, nonextensional sentence “Necessary, if you steal this book, then you steal this book” is equivalent to the extensional sentence “‘If you steal this book, then you steal this book’ is true in a certain formal metalanguage L” (cf. Carnap 1958: 42). The psychological, nonextensional sentence “John believes that raccoons have knocked over the garbage cans” is equivalent, perhaps, to the extensional sentence “John is disposed to an affirmative response to some sentence in some language which expresses the proposition that raccoons have knocked over the garbage cans” (cf. Carnap 1947: 55).

Although he devotes much time and effort to defend TOE, Carnap does not claim to establish it. He regards it as a likely conjecture. The project, however, now appears to be more difficult than Carnap predicted. A successful translation of a nonextensional sentence must satisfy two requirements: (1) the new sentence must really be equivalent to the original, and (2) the new sentence must really be extensional. In the example above, if someone can have such a belief about raccoons without being disposed to respond to any sentences, (1) is violated. If one cannot understand affirmative response except as a nonextensional notion, then (2) is violated.

In terms of INTENTIONALITY, Chisholm (1955–56) formulates a thesis resembling that of Brentano (1874) about the distinctiveness of the psychological: (A) TOE is true for all nonpsychological sentences, and (B) TOE is false for all psychological sentences. In the 1950s, Chisholm defended (B) by attacking translations of sentences about believing that Carnap and others proposed. (See Hahn 1998 for a Chisholm bibliography.)

Clause (A), however, cannot be taken for granted. The project of finding extensional equivalents of nonextensional modal sentences also faces difficulties. Quine (1953), like Carnap, attempts syntactic translations that are about language. (See chapter 10, section 3, “The Problems of Intensionality,” in Kneale and Kneale 1962.) Montague (1963) derives significant negative results that are beyond the scope of this article.

See also UNITY OF SCIENCE; INDEXICALS AND DEMONSTRATIVES; FREGE, GOTTLOB

—David H. Sanford

References

Brentano, F. (1874). Psychologie vom empirischen Standpunkt. Vienna.

Carnap, R. (1937). The Logical Syntax of Language. London: Routledge and Kegan Paul. First published in German in 1934.

Carnap, R. (1947). Meaning and Necessity. Chicago: University of Chicago Press.

Carnap, R. (1958). Introduction to Symbolic Logic and Its Applications. New York: Dover. First published in German in 1954.

Chisholm, R. M. (1955–56). Sentences about believing. Proceedings of the Aristotelian Society 56: 125–148.

Hahn, L. E., Ed. (1998). The Philosophy of Roderick M. Chisholm: The Library of Living Philosophers. Peru, IL: Open Court.

Kneale, W., and M. Kneale. (1962). The Development of Logic. Oxford: Oxford University Press.

Montague, R. (1963). Syntactical treatments of modality, with corollaries on reflexion principles and finite axiomatizability. Reprinted in R. H. Thomason, Ed., 1974, Formal Philosophy; Selected Papers of Richard Montague. New Haven: Yale University Press.

Quine, W. (1953). Three grades of modal involvement. Reprinted in Quine (1966), The Ways of Paradox. New York: Random House.

Whitehead, A. N., and B. Russell. (1927). Principia Mathematica. 2nd ed. Cambridge: Cambridge University Press.

Wittgenstein, L. (1922). Tractatus Logico-Philosophicus. First published in German in 1921. Recent English translation by D. F. Pears and B. F. McGuinness (1961), London: Routledge and Kegan Paul.

Externalism

See INDIVIDUALISM; MENTAL CAUSATION; NARROW CONTENT

Eye Movements and Visual Attention

Visual scenes typically contain more objects than can ever be recognized or remembered in a single glance. Some kind of sequential selection of objects for detailed processing is essential if we are to cope with this wealth of information. Built into the earliest levels of vision is a powerful means of accomplishing the selection, namely, the heterogeneous RETINA. Fine grain visual resolution is possible only within the central retinal region known as the fovea, whose diameter is approximately 2 degrees of visual angle (about the size of eight letters on a typical page of text). Eye movements are important because they bring selected images to the fovea, and also keep them there for as long as needed to recognize the object.

Eye movements fall into two broad classes. Saccadic eye movements (saccades) are rapid jumps of the eye used to shift gaze to any chosen object. In READING, for example, saccades typically occur about three times each second and are generally made to look from one word to the next (see figure 1). Smooth eye movements keep the line of sight on the selected object during the intervals between saccades, compensating for motion on the retina that might be caused either by motion of the object or by motion of the head or body. Intervals between saccades can be as long as several seconds during steady fixation of stationary or moving objects. Saccades can be made in any chosen direction, even in total darkness, whereas directed smooth eye movements cannot be initiated or maintained without some kind of motion signal.

Figure 1. Sequence of saccadic eye movements during reading. The graph on top shows horizontal (top trace) and vertical (bottom trace) eye movements over time. The abrupt changes in eye position are the saccades. The figure shows the sequences of rightward saccades (upward deflections in the trace) made to read a line of text, followed by large leftward resetting saccades made to the beginning of each successive line of text. The locations of the saccadic endpoints are shown in numbered sequence, superimposed on the text, at the bottom of the figure. The figure was made by J. Epelboim from recordings made with R. Steinman’s Revolving Magnetic Field Sensor Coil monitor at the University of Maryland (see Epelboim et al. 1995, for a description of the instrument).

There are two natural links between eye movements and visual attention. One is the role played by ATTENTION in OCULOMOTOR CONTROL. The other is the way in which eye movements provide overt indicators of the locus of attention during performance of complex cognitive tasks, such as reading or visual search.

Consider first the role of attention in programming smooth eye movements. When we walk forward in an otherwise stationary scene, trying to keep our gaze fixed on our goal ahead, the flow of image motion generated on the retina by our own forward motion creates a large array of motion signals that could potentially drag the line of sight away from its intended goal (a problem described originally by Ernst Mach in 1906). Laboratory simulations of this common situation show that smooth eye movements can maintain a stable line of sight on a small, attended, stationary target superimposed on a large, vivid moving background. Similarly, smooth eye movements can accurately track a target moving across a stationary background. With more complex stimuli (letters, for example), perceptual identification of tracked targets is better than identification of untracked backgrounds (a result that holds after any differences in identification due to different retinal velocities of target and background are taken into account). The greater perceptibility of the target compared to the background implies that the same attentional mechanism serves both perception and eye movements (Khurana and Kowler 1987).
Attention contributes to the control of saccades in an analogous way, namely, attention is allocated to the chosen target shortly before the saccade is made to look at it (Hoffman and Subramaniam 1995; Kowler et al. 1995). Some attention can be transferred to nontargets, with no harmful effect on the latency or accuracy of the eye movements, showing that the attentional demands of eye movements are modest (Kowler et al. 1995; Khurana and Kowler 1987).

On the whole, the arrangement is very efficient. By allowing oculomotor and perceptual systems to share a common attentional filter, the eye will be directed to the object we are most interested in without the need for a separate selective attentional decision. At the same time, the modest attentional requirements of effective oculomotor control mean that it is very likely that we can look wherever we choose with little danger of the eye’s being drawn to background objects, regardless of how large, bright, or vivid they may be. Modest attentional requirements also imply that there will be ample cognitive resources left over for identification and recognition; all our efforts need not be devoted to targeting eye movements.

The close link between attention and eye movements is supported by neurophysiology. Cortical centers containing neurons that are active before eye movements also contain neurons (sometimes the same ones) that are active before shifts of attention while the eye is stationary (Colby and Duhamel 1996; Andersen and Gnadt 1989). Some have gone so far as to consider whether shifting attention to an eccentric location while the eye remains stationary is equivalent to planning a saccadic eye movement (Kustov and Robinson 1997; Rizzolatti et al. 1987; Klein 1980).

Attention is involved in the programming of eye movements, and at the same time observations of eye movements provide a record of where someone chooses to attend during performance of complex cognitive tasks. Yarbus’s (1967) well-known recordings of eye movements made while inspecting various paintings show systematic preferences to repeatedly look at those elements that would seem to be most relevant to evaluating the content of the picture. Despite the detailed record of preferences that eye movements provide, it has nevertheless proven to be surprisingly difficult to develop valid models of underlying cognitive processing based on eye movements alone (Viviani 1990). More recent work has taken a different tack by using highly constrained and novel tasks. Sequences of fixations have been used to study the modularity of syntactic processing during reading (Tanenhaus et al. 1995), the role of WORKING MEMORY during visual problem-solving tasks (Ballard, Hayhoe, and Pelz 1995; Epelboim and Suppes 1996), the coordination of eye and arm movements (Epelboim et al. 1995), and the size of the effective processing region during reading or search (McConkie and Rayner 1975; Motter and Belky 1998; O’Regan 1990).

This article has emphasized the importance of eye movements for selecting a subset of the available information for detailed processing. The price paid for having this valuable tool is that the visual system must cope with the continual shifts of the retinal image that eye movements will produce. Remarkably, despite the retinal perturbations, the visual scene appears stable and unimpaired. Evidence from studies in which subjects look at or point to targets presented briefly during saccades suggests that stored representations of oculomotor commands (“efferent copies”) are used to take the effect of eye movements into account and create a representation of target location with respect to the head or body (Hansen and Skavenski 1977). Other evidence suggests that shifts of the retinal image are effectively ignored. According to these views, visual analysis begins anew each time the line of sight arrives at a target, with attended visual information converted rapidly to a high-level semantic code that can be remembered across sequences of saccades (e.g., O’Regan 1992).

The advantages to visual and cognitive systems of having a fovea are evidently so profound that it has been worth the cost of developing both the capacity for accurate control of eye movements and a tolerance for the retinal perturbations that eye movements produce. Visual attention is crucial for accomplishing both.

See also OBJECT RECOGNITION, ANIMAL STUDIES; OBJECT RECOGNITION, HUMAN NEUROPSYCHOLOGY; ATTENTION IN THE HUMAN BRAIN; VISUAL WORD RECOGNITION; TOP-DOWN PROCESSING IN VISION

—Eileen Kowler

References

Andersen, R. A., and J. W. Gnadt. (1989). Posterior parietal cortex. In R. H. Wurtz and M. E. Goldberg, Eds., Reviews of Oculomotor Research, vol. 3: The neurobiology of saccadic eye movements. Amsterdam: Elsevier, pp. 315–336.

Ballard, D. H., M. M. Hayhoe, and J. B. Pelz. (1995). Memory representation in natural tasks. Journal of Cognitive Neuroscience 7: 66–80.

Colby, C. L., and J.-R. Duhamel. (1996). Spatial representations for action in parietal cortex. Cognitive Brain Research 5: 105–115.

Epelboim, J., R. M. Steinman, E. Kowler, M. Edwards, Z. Pizlo, C. J. Erkelens, and H. Collewijn. (1995). The function of visual search and memory in sequential looking tasks. Vision Research 35: 3401–3422.

Epelboim, J., and P. Suppes. (1996). Window on the mind? What eye movements reveal about geometrical reasoning. Proceedings of the Cognitive Science Society 18: 59.

Hansen, R. H., and A. A. Skavenski. (1977). Accuracy of eye position information for motor control. Vision Research 17: 919–926.

Hoffman, J., and B. Subramaniam. (1995). Saccadic eye movements and visual selective attention. Perception and Psychophysics 57: 787–795.

Khurana, B., and E. Kowler. (1987). Shared attentional control of smooth eye movements and perception. Vision Research 27: 1603–1618.

Klein, R. (1980). Does oculomotor readiness mediate cognitive control of visual attention? In R. Nickerson, Ed., Attention and Performance III. Hillsdale, NJ: Erlbaum, pp. 259–276.

Kowler, E. (1990). The role of visual and cognitive processes in the control of eye movement. In E. Kowler, Ed., Reviews of Oculomotor Research, vol. 4: Eye Movements and Their Role in Visual and Cognitive Processes. Amsterdam: Elsevier, pp. 1–70.

Kowler, E., E. Anderson, B. Dosher and E. Blaser. (1995). The role of attention in the programming of saccades. Vision Research 1897–1916.

Kustov, A. A., and D. L. Robinson. (1997). Shared neural control of attentional shifts and eye movements. Nature 384: 74–77.

Mach, E. (1906/1959). Analysis of Sensation. New York: Dover.

McConkie, G. W., and K. Rayner. (1975). The span of the effective stimulus during a fixation in reading. Perception and Psychophysics 17: 578–586.

Motter, B. C., and E. J. Belky. (1998). The zone of focal attention during active visual search. Vision Research 38: 1007–1022.

O’Regan, J. K. (1990). Eye movements and reading. In E. Kowler, Ed., Reviews of Oculomotor Research, vol. 4: Eye Movements and Their Role in Visual and Cognitive Processes. Amsterdam: Elsevier, pp. 395–453.

O’Regan, J. K. (1992). Solving the “real” mysteries of visual perception: the world as an outside memory. Canadian Journal of Psychology 46: 461–488.

Rizzolatti, G., L. Riggio, I. Dascola, and C. Umita. (1987). Reorienting attention across the horizontal and vertical meridians: evidence in favor of a premotor theory of attention. Neuropsychologia 25: 31–40.

Tanenhaus, M. K., M. J. Spivey-Knowlton, K. M. Eberhard, and J. C. Sedivy. (1995). Integration of visual and linguistic information in spoken language comprehension. Science 268: 1632–1634.

Viviani, P. (1990). Eye movements in visual search: cognitive, perceptual and motor control aspects. In E. Kowler, Ed., Reviews of Oculomotor Research, vol. 4: Eye Movements and Their Role in Visual and Cognitive Processes. Amsterdam: Elsevier, pp. 353–393.

Yarbus, A. L. (1967). Eye Movements and Vision. New York: Plenum Press.

Further Readings

Hoffman, J. E. (1997). Visual attention and eye movements. In H. Pashler, Ed., Attention. London: University College London Press.

Kowler, E. (1995). Eye movement. In S. Kosslyn, Ed., Invitation to Cognitive Science, vol. 2. Cambridge, MA: MIT Press, pp. 215–265.

Rayner, K., Ed. (1992). Eye Movements and Visual Cognition: Scene Perception and Reading. New York: Springer.

Steinman, R. M., and J. Z. Levinson. (1990). The role of eye movement in the detection of contrast and spatial detail. In E. Kowler, Ed., Reviews of Oculomotor Research, vol. 4: Eye Movements and Their Role in Visual and Cognitive Processes. Amsterdam: Elsevier, pp. 115–212.

Suppes, P. (1990). Eye movement models for arithmetic and reading performance. In E. Kowler, Ed., Reviews of Oculomotor Research, vol. 4: Eye Movements and Their Role in Visual and Cognitive Processes. Amsterdam: Elsevier, pp. 455–477.

Face Recognition

Analysis and retention of facial images is a crucial skill for primates. The survival value of this skill is reflected in our extraordinary MEMORY for faces, in the visual preferences for face stimuli shown by infants, and in our remarkable sensitivity to subtle differences among faces. Striking parallels have emerged between the results of perceptual, developmental, neuropsychological, neurophysiological, and functional neuroimaging studies of face recognition. These indicate that face recognition in primates is a specialized capacity consisting of a discrete set of component processes with neural substrates in ventral portions of occipito-temporal and frontal cortices and in the medial temporal lobes.

Appreciation of the specialness of the face for social organisms can be traced back at least to Darwin’s The Expression of Emotion in Man and Animals (1872). Moreover, face agnosia—a selective deficit in recognizing faces—has been inferred from the clinical literature since the turn of the century. Intensive research on face perception of normal individuals has a more recent history, linked to a growing interest in INFANT COGNITION and perception. One early milestone was Yin’s (1969) demonstration of the inversion effect, the tendency for recognition of faces to be differentially impaired (relative to that of other “mono-oriented” stimuli such as houses) by turning the stimulus upside down. This finding has been widely interpreted to mean that face recognition depends on specialized mechanisms for configural processing (i.e., analysis of small differences in details and spatial relations of features within a prototypical organization). Face “specialness” is also supported by data showing that infants preferentially look at or track facelike arrangements of features relative to jumbles of features or control stimuli. Nonetheless, the protracted development of adult levels of performance indicates that face recognition additionally involves either a long period of NEURAL DEVELOPMENT and/or cognitive processing capacity, specific experience with faces, or both. Exactly what improves with maturation or experience remains unclear (Chung and Thomson 1995).

The fascinating syndrome of face agnosia (prosopagnosia) has spurred considerable controversy regarding the “specialness” of faces. The degree of DOMAIN SPECIFICITY present in prosopagnosia is relevant to whether face recognition is best viewed as a unique capacity, or merely as an example of general mechanisms of OBJECT RECOGNITION (Farah 1996). Although prosopagnosics are aware that faces are faces—that is, they know the basic level category—they fail to identify reliably or achieve a sense of familiarity from faces of family members, famous persons, and other individuals they previously knew well. Typically, they also have trouble forming memories of new faces, even if other new objects are learned. However, prosopagnosics may identify individuals by salient details such as clothing and hairstyle, or by nonvisual features such as voice.

Varying patterns of deficits in processing faces occur in brain-damaged individuals, and these differing patterns provide evidence for dissociable component operations in face recognition. Some patients show sparing of ability to judge the age, gender, and even emotional expression of faces whose identity they cannot grasp. Others have difficulty with all aspects of face processing; such patients are unable even to analyze facial features normally, a necessary precondition to identification. Finally, some brain-damaged patients can perceive structural attributes of faces adequately for judgments about emotion and gender, and even judge if a face is familiar, but show a specific inability to recall the associated name.

In both humans and monkeys, faces are analyzed in subregions of the visual-cortical object recognition pathway. In particular, the temporal neocortex in nonhuman primates (notably inferior temporal cortex or “area TE”) contains neurons that fire selectively to face stimuli (Gross and Sergent 1992). The question arises as to whether such cells truly respond to visual information unique to faces, or whether their selectivity is more parsimoniously explained as responsiveness to features shared by faces and other object classes; the bulk of the evidence supports the former description. For example, although face-selective neurons vary in the degree of their preference for face stimuli, many respond to both real faces and pictures of faces, but give nearly no response to any other stimuli tested, including other complex objects and pictures in which features making up the face are rearranged or “scrambled.” Moreover, for many such neurons, specificity of response is maintained over transformations in size, stimulus position, angle of lighting of the face, blurring, and so forth. Thus, at least some face-selective neurons are sensitive to the global aspects of a face, such as prototypical configuration of stimulus features. Finally, face-selective neurons are present very early in life (Rodman, Gross, and Ó Scalaidhe 1993), consistent with the idea that they represent inborn “prototypes” for faces.

Subsets of face-selective neurons appear to participate in specific aspects of face coding. For example, some respond

jects either made discriminations of the gender of faces (perceptual task) or judged their identity. In the perceptual task, selective activation was found in the right ventral occipito-temporal cortex, to a lesser extent in the same area on the left, and in a more lateral left focus as well. These areas overlap, but are generally anterior to, domains activated by other categories of objects. Judgments of face identity, requiring reactivation of stored information about individuals, also activated the right parahippocampal gyrus, anterior temporal cortex, and temporal pole on both sides. These studies and those of Haxby and colleagues have additionally implicated lateralized portions of frontal cortex in face encoding, perceptual judgments of faces, and subsequent recognition of faces. In particular, a right frontal focus appears to be involved in recognizing facial emotion. Finally, the HIPPOCAMPUS appears to participate, along with parahippocampal cortices, primarily at the time of encoding new faces.

Neuropathological data are generally consistent with results of imaging and evoked potential studies regarding anatomical substrates for face recognition.
Initially, selectively to distinctive features, such as eyes per se, dis- prosopagnosia was associated clinically with right posterior tance between the eyes, or extent of the forehead. Others do cortical damage. In the 1980s, a number of cases came to not respond to isolated features but instead are selective for autopsy. The damage in these lay very ventrally and medially orientation of a face (e.g., profile or frontal view). Still oth- at the occipitotemporal junction, in roughly the same region ers have responses specific for particular expressions; a final activated in recent PET studies. However, in all such cases subset are particularly sensitive to eye gaze direction (look- this area (or underlying white matter) was damaged bilater- ing back or looking to the side), an important social signal ally, and consequently bilateral damage became thought of in both monkeys and humans. Cells selectively responsive as a necessary precondition to prosopagnosia. Recently, to faces have a localized distribution in several senses. First, cases with proposagnosia and right cortical damage alone, although face cells make up only a tiny fraction of neurons along with the results of imaging and evoked potential stud- (1–5%) within TE and adjacent areas as a whole, their con- ies reviewed above, have reaffirmed the critical role of the centration is much higher in irregular localized clumps. Sec- right hemisphere in face recognition in humans. ond, different types of face-selective cells are found in Many explanations have been given for the apparent “spe- different regions, such that cells sensitive to facial expres- cialness” of faces. Face perception and recognition may sion and gaze direction tend to be found within the superior indeed be unique behavioral capacities and reflect dedicated temporal sulcus, whereas cells more generally selective for neural circuits that can be selectively damaged. 
Such unique- faces and, purportedly, for individuals tend to be located in ness of faces might result both from their behavioral signifi- TE on the inferior temporal gyrus. cance and from the fact that they differ structurally from most Electrophysiological correlates of face recognition have other object classes, necessitating different perceptual strate- also been obtained from humans (Allison et al. 1994). A gies (such as encoding on the basis of prototypical configura- large evoked potential (called the N200) is generated by tion), strategies selectively lost in prosopagnosia. An faces (but not other stimulus types) at small sites in the ven- alternative explanation is that faces are processed and stored tral occipitotemporal cortex. These sites or “modules” may in a manner similar to that for other objects, but faces are be comparable to the clumps of face neurons found in mon- simply harder to tell apart than other kinds of objects; this keys. Longer-latency face-specific potentials were also view is consistent with observations that prosopagnosia is recorded from the anterior portions of ventral temporal cor- often accompanied by some degree of general object agnosia. tex activated by face recognition in POSITRON EMISSION A related account holds that face processing requires subtle TOMOGRAPHY (PET) studies. Moreover, N200s to inverted discriminations between highly similar exemplars within a faces recorded from the right hemisphere were smaller and category, and that it is this capacity, not processing of the longer than for normally oriented faces; the left hemisphere facial configuration, that is disrupted in prosopagnosia. Inter- generated comparable N200s under both conditions. 
These estingly, recent studies show that face processing is still dis- studies thus provide correlates of both the “inversion effect” proportionately impaired when discriminations of face and and of HEMISPHERIC SPECIALIZATION for some aspects of nonface stimuli are equated for difficulty, so this view has face processing noted in the clinical literature and in tachis- lost some force. A final suggestion is that face processing toscopic studies of face recognition in normal humans. represents acquisition of EXPERTISE associated with very pro- Brain imaging studies in normal humans provide con- tracted experience with a category of complex visual stimuli verging evidence for the involvement of the ventral occipito- (Carey 1992). Prosopagnosics with deficits in other object temporal cortices in face recognition and for the existence of recognition domains in which they had previously acquired dissociable component operations. Sergent et al. (1992), for expertise over long periods (e.g., a show dog expert who lost example, used PET to compare brain activation while sub- the ability to differentiate breeds) support this idea. Feature Detectors 311 Growing acceptance of faces as a distinct stimulus type Desimone, R., T. D. Albright, C. G. Gross, and C. Bruce. (1984). Stimulus selective properties of inferior temporal neurons in the has led, along with growing evidence for component macaque. J. Neurosci. 4: 2051–2062. processes, to emergence of theoretical accounts of face Dror, I., F. L. Florer, D. Rios, and M. Zagaeski. (1996). Using arti- recognition tied to central ideas in COMPUTATIONAL NEURO- ficial bat sonar neural networks for complex pattern recogni- SCIENCE. For example, drawing on the notion of a tion: recognizing faces and the speed of a moving target. Biol. computational model advanced by David MARR, Bruce and Cybern. 74: 331–338. Young (1986) analyzed face processing as a set of seven types Farah, M. J. (1991). 
Patterns of co-occurrence among the associa- of information code (or representation). In their scheme, tive agnosias: implications for visual object representations. which fits well with neuropsychological dissociations, Cog. Neuropsychol. 8:1–19. everyday face recognition involves the use of “structural” Farah, M. J., K. L. Levinson, and K. L. Klein. (1994). Face percep- codes to access identity-specific semantic information, and tion and within-category discrimination in prosopagnosia. Neu- ropsychologia 33: 661–674. then finally the attachment of a name to the percept. Other Field, T. M., R. Woodson, R. Greenberg, and D. Cohen. (1982). recent theoretical accounts have modeled face recognition Discrimination and imitation of facial expressions by neonates. using artificial neural network architectures derived from Science 218: 179–181. parallel-distributed processing accounts of complex systems. Flude, B. M., A. W. Ellis, and J. Kay. (1989). Face processing and Future advances in understanding face recognition will likely name retrieval in an anomic aphasic: names are stored sepa- require further incorporation of data on self-organizing rately from semantic information about familiar people. Brain (developmental and environmental) and modulatory and Cog. 11: 60–72. (emotional and motivational) aspects of face processing into George, M. S., T. A. Ketter, D. S. Gill, J. V. Haxby, L. G. Unger- existing models. leider, P. Herscovitch, and R. I. Post. (1993). Brain regions involved in recognizing emotion or identity: an oxygen-15 PET See also AMYGDALA, PRIMATE; COGNITIVE ARCHITEC- study. J. Neuropsychiat. Clin. Neurosci. 5: 384–394. TURE; EMOTION AND THE HUMAN BRAIN; HIGH-LEVEL Gross, C. G., H. R. Rodman, P. M. Gochin, and M. W. Colombo. VISION; MID-LEVEL VISION (1993). Inferior temporal cortex as a pattern recognition device. —Hillary R. Rodman In E. Baum, Ed., Computational Learning and Cognition. Phil- adelphia: SIAM Press. References Haxby, J. V., C. L. Grady, B. 
Horwitz, L. G. Ungerleider, M. Mish- kin, R. E. Carson, P. Herscovitch, M. B. Schapiro, and S. I. Allison, T., C. McCarthy, A. Nobre, A. Puce, and A. Belger. Rapoport. (1991). Dissociation of object and spatial vision pro- (1994). Human extrastriate visual cortex and the perception of cessing pathways in human extrastriate cortex. Proc. Natl. faces, words, numbers and colors. Cereb. Cortex 5: 544–554. Acad. Sci. 88: 1621–1625. Bruce, V., and A. Young. (1986). Understanding face recognition. Heywood, C. A., and A. Cowey. (1992). The role of the “face cell” Br. J. Psych. 77: 305–327. area in the discrimination and recognition of faces by monkeys. Carey, S. (1992). Becoming a face expert. Phil. Trans. R. Soc. Phil. Trans. Roy. Soc. Lond. B 335: 31–38. Lond. B 335: 95–103. Perrett, D. I., P. A. J. Smith, D. D. Potter, A. J. Mistlin, A. S. Head, Chung, M-S., and D. M. Thomson. (1995). Development of face A. D. Milner, and M. A. Jeeves. (1985). Visual cells in the tem- recognition. Br. J. Psych. 86: 55–87. poral cortex sensitive to face view and gaze direction. Proc. Farah, M. J. (1996). Is face recognition “special”? Evidence from Roy. Soc. Lond B 223: 293–317. neuropsychology. Beh. Brain Res. 76: 181–189. Rodman, H. R. (1994). Development of inferior temporal cortex in Gross, C. G., and J. Sergent. (1992). Face recognition. Curr. Opin. the monkey. Cereb. Cortex 5: 484–498. Neurobiol. 2: 156–161. Rolls, E. T., and G. C. Baylis. (1986). Size and contrast have only Haxby, J. V., L. G. Ungerleider, B. Horwitz, J. M. Maisog, S. I. small effects on the responses to faces of neurons in the cortex Rapoport, and C. L. Grady. (1996). Face encoding and recogni- of the superior temporal sulcus of the monkey. Exp. Brain Res. tion in the human brain. Proc. Nat. Acad. Sci. 93: 922–927. 65: 38–48. Rodman, H. R., C. G. Gross, and S. P. Ó Scalaidhe. (1993). Devel- Yamane, S., S. Kaji, and K. Kawano. (1988). 
What facial features opment of brain substrates for pattern recognition in primates: activate face neurons in inferotemporal cortex of the monkey. physiological and connectional studies of inferior temporal cor- Exp. Brain Res. 73: 209–214. tex in infant monkeys. In B. de Boysson-Bardies, S. de Schonen, P. Jusczyk, P. MacNeilage, and J. Morton, Eds., Developmental Feature Detectors Neurocognition: Speech and Face Processing in The First Year of Life. Dordrecht: Kluwer Academic, pp. 63–75. Sergent, J., S. Ohta, and B. MacDonald. (1992). Functional neu- The existence of feature detectors is based on evidence roanatomy of face and object processing: a positron emission tomography study. Brain 115: 15–36. obtained by recording from single neurons in the visual Yin, R. K. (1969). Looking at upside-down faces. J. Exp. Psychol. pathways (Barlow 1953; Lettvin et al. 1959; Hubel and Wie- 81: 141–145. sel 1962; Waterman and Wiersma 1963; see also SINGLE- NEURON RECORDING). It was found that responses from Further Readings many types of NEURON do not correlate well with the straightforward physical parameters of the stimulus, but Benton, A. L. (1980). The neuropsychology of facial recognition. instead require some specific pattern of excitation, often a American Psychologist 35: 176–186. spatio-temporal pattern that involves movement. The rab- Damasio, A. R., D. Tranel, and H. Damasio. (1990). Face agnosia bit retina provides some well-documented examples, and the neural substrates of memory. Ann. Rev. Neurosci. 13: though they were not the first to be described. Direction- 89–109. 312 Feature Detectors ally selective ganglion cells respond to movements of the cumstances. One example is the red dot on a herring gull’s image in one direction, but respond poorly to the reverse beak, which has been shown to cause the chick to open its motion, however bright or contrasty the stimulus; there are bill to receive regurgitated food from its mother. 
Another two classes of these ganglion cells, distinguished by the fast example is the stimulus for eliciting the rather stereotyped or slow speed of movement that each prefers, and within feeding behavior shown by many vertebrates: a small mov- each class there are groups responding to different directions ing object first alerts the animal, then causes it to orient of motion (Barlow, Hill, and Levick 1964). Another class, itself toward the stimulus, next to approach it, and finally to found mainly in the central zone of the RETINA, are local snap at it. It was early suggested that the retinal ganglion edge detectors that respond only to an edge moving very cells in the frog that respond to small moving objects might slowly over precisely the right position in the visual field act as such bug detectors (Barlow 1953; Lettvin et al. 1959). (Levick 1967). These units are often highly specific in their Such feature detectors must be related to the specific stimulus requirements, and it can be a difficult task to find requirements of particular species in particular ecological out what causes such a unit to fire reliably; yet once the niches, but feature detection may have a more general role appropriate trigger feature has been properly defined it will in perception and classification. A clue to their significance work every time. Another class, the fast movement detectors, in object recognition may be found in the early attempts by respond only to very rapid image movements, and yet computer scientists to recognize alphanumeric characters another, the uniformity detectors, fire continuously at a high (Grimsdale et al. 1959; Selfridge and Neisser 1960; rate except when patterned stimulation is delivered to the Kamentsky and Liu 1963). It was found that fixed tem- part of the retina they are connected to. 
plates, one for each letter, perform very badly because the All classes have a restricted retinal region where the representation of the same character varies in different appropriate feature has to be positioned, and this is fonts. Performance could be much improved by detecting described as the unit’s receptive field, even though the oper- the features (bars, loops, intersections etc.) that make up the ations being performed on the input are very different from characters, for latitude could then be allowed in the posi- simple linear summation of excitatory and inhibitory influ- tioning of these relative to each other. This is the germ of an ences, which the term receptive field is sometimes thought idea that seems to provide a good qualitative explanation for to imply. Units often show considerable invariance of the feature detectors found at successive levels in the visual response for changes of the luminance, contrast, and even system: operations that restrict response or increase selec- polarity of the light stimulus, while maintaining selectivity tivity for one aspect of the stimulus are combined with oper- for their particular patterned spatio-temporal feature. There ations that generalize or relax selectivity for another aspect. is also evidence for feature detectors in auditory (Evans In the preceding example the components of letters vary less 1974; Suga 1994; see also DISTINCTIVE FEATURES) and tac- from font to font than the overall pattern of the letters, so the initial feature detectors can be rather selective. But having tile pathways. found that certain features are present, the system can be Feature detection in the retina makes it clear that com- less demanding about how they are positioned relative to plicated logical operations can be achieved in simple neural each other, and this achieves some degree of font-invariant circuits, and that these processes need to be described in character recognition. computational, rather than linear, terms. 
For a time there In the primate visual system some retinal ganglion cells was some rivalry between feature creatures, who espoused are excited by a single foveal cone, so they are very selec- this logical, computational view of the operation of visual tive for position. Several of these connect, through the lat- neurons, and frequency freaks, who were devoted to the use eral geniculate nucleus, to a single neuron in primary visual of sine wave spatial stimuli and Fourier interpretations. The cortex, but the groups that so connect are arranged along latter had genuine successes that yielded valid new insights lines: the cortical neuron thus maintains selectivity for posi- (Braddick, Campbell, and Atkinson 1978), but their tion orthogonal to the line, but relaxes selectivity and sum- approach works best for systems that operate nearly lin- mates along the line (Hubel and Wiesel 1962). This makes early. Object recognition is certainly not a linear process, each unit selectively responsive to lines of a particular ori- and the importance of feature detectors lies in the insight entation, and they may be combined together at later stages they give into how the brain achieves this very difficult to generalize in various ways. task. But first, glance back at the history of feature detec- The best-described examples of units that generalize for tion before single-neuron feature-detectors were discov- position are provided by the cortical neurons of area MT or ered. V5 that specialize in the analysis of image motion: these Sherrington found that to elicit the scratch reflex—the collect together information from neurons in cortical area rhythmical scratching movements made by a dog’s hind V1 that come from a patch several degrees in diameter in leg—a tactile stimulus had to be applied to a particular the visual field (Newsome et al. 1990; Raiguel et al. 
1995), region of the flank, and it was most effective if it was but all the neurons converging on one MT neuron signal applied to several neighboring cutaneous regions in succes- movements of similar direction and velocity. Thus all the sion. This must require a tactile feature detector not unlike information about motion with a particular direction and some of those discovered in visual pathways. velocity occurring in a patch of the visual field is pooled Some years later the ethologists Lorenz (1961) and Tin- onto a single MT neuron, and such neurons have been bergen (1953) popularized the notion of innate releasers: shown to be as sensitive to weak motion cues as the intact, these are special sensory stimuli that trigger specific behav- behaving animal (Newsome, Britten, and Movshon 1989). ioral responses when delivered under the appropriate cir- Feature Detectors 313 Possibly the whole sensory cortex should be viewed as an References immense bank of tuned filters, each collecting the informa- Barlow, H. B. (1953). Summation and inhibition in the frog’s ret- tion that enables it to detect with high sensitivity the occur- ina. Journal of Physiology, London 119: 69–88. rence of a patterned feature having characteristics lying Barlow, H. B. (1972). Single units and sensation: a neuron doctrine within a specific range (Barlow and Tripathy 1997). The for perceptual psychology? Perception 1: 371–394. existence of this enormous array of near-optimal detectors, Barlow, H. B. (1990). A theory about the functional role and syn- all matched to stimuli of different characteristics, would aptic mechanism of visual after-effects. In C. B. Blakemore, explain why the mammalian visual system can perform Ed., Vision: Coding and Efficiency. Cambridge: Cambridge detection and discrimination tasks with a sensitivity and University Press. speed that computer vision finds hard to emulate. Barlow, H. B., R. M. Hill, and W. R. Levick. (1964). 
Retinal gan- glion cells responding selectively to direction and speed of Another aspect of the problem is currently arousing motion in the rabbit. Journal of Physiology, London 173: 377– interest: Why do we have detectors for some features, but 407. not others? What property of a spatio-temporal pattern Barlow, H. B., and S. P. Tripathy. (1997). Correspondence noise and makes it desirable as a feature? A suggestion (Barlow 1972; signal pooling as factors determining the detectability of coher- Field 1994) currently receiving some support (Bell and ent visual motion. Journal of Neuroscience 17 7954–7966. Sejnowski 1995; Ohlshausen and Field 1996) is that the fea- Bell, A. J., and T. J. Sejnowski. (1995). An information maximisa- ture detectors we possess are able to create a rather com- tion approach to blind separation and blind deconvolution. Neu- plete representation of the current sensory scene using the ral Computation 7: 1129–1159. principle of sparse coding; this means that at any one time Braddick, O. J., F. W. Campbell, and J. Atkinson. (1978). Channels only a small selection of all the units is active, yet this small in vision: basic aspects. In R. Held, H. W. Leibowicz, and H. L. Teuber, Eds., Handbook of Sensory Physiology, New York: number firing in combination suffices to represent the scene Springer, pp. 1–38. effectively. The types of feature that will achieve this double Evans, E. F. (1974). Feature- and call-specific neurons in auditory criterion, sparsity with completeness, can be described as pathways. In F. C. S. and F. G. Worden, Eds., The Neuro- suspicious coincidences: they are local patterns in the image sciences: Third Study Program. Cambridge, MA: MIT Press. that would be expected, from the probabilities of their con- Field, D. J. (1994). What is the goal of sensory coding? Neural stituent elements, to occur rarely, but in fact occur more Computation 6: 559–601. commonly. Grimsdale, R. L., F. H. Sumner, C. J. Tunis, and T. Kilburn. 
(1959). Sparse coding goes some way toward preventing acci- A system for the automatic recognition of patterns. Proceed- dental conjunctions of attributes, which is the basis for the ings of the Institute of Electrical Engineers, B 106: 210–221. so-called BINDING PROBLEM. Although sparsely coded fea- Hubel, D. H., and T. N. Wiesel. (1962). Receptive fields, binocular interaction, and functional architecture in the cat’s visual cor- tures are not mutually exclusive, they nonetheless occur tex. Journal of Physiology, London 195: 215–243. infrequently: hence accidental conjunctions of them will Hubel, D. H., and T. N. Wiesel. (1970). The period of susceptibil- only occur very infrequently, possibly no more often than ity to the physiological effects of unilateral eye closure in kit- they do in fact occur. tens. Journal of Physiology, London 206: 419–436. Like the basis functions that are used for image com- Kamentsky, L. A., and C. N. Liu. (1963). Computer-automated pression, those suitable for sparse coding achieve their design of multifont print recognition logic. IBM Journal of result through being adapted to the statistical properties of Research and Development 7: 2–13. natural images. This adaptation must be done primarily Lettvin, J. Y., H. R. Maturana, W. S. McCulloch, and W. H. Pitts. through evolutionary selection molding their pattern selec- (1959). What the frog’s eye tells the frog’s brain. Proceedings tive mechanisms, though it is known that they are also of the Institute of Radio Engineers 47: 1940–1951. Levick, W. R. (1967). Receptive fields and trigger features of gan- modified by experience during the critical period of devel- glion cells in the visual streak of the rabbit’s retina. Journal of opment of the visual system (Hubel and Wiesel 1970; Physiology, London 188: 285–307. Movshon and Van Sluyters 1981), and perhaps also through Lorenz, K. (1961). King Solomon’s Ring. Trans. M. K. Wilson. 
short term processes of contingent adaptation (Barlow Cambridge: Cambridge University Press. 1990). Feature detectors that exploit statistical properties of Movshon, J. A., and R. C. Van Sluyters. (1981). Visual neural natural images in this way could provide a representation development. Annual Review of Psychology 32: 477–522. that is optimally up-to-date, minimizes the effects of delays Newsome, W. T., K. H. Britten, and J. A. Movshon. (1989). Neu- in afferent and efferent pathways, and perhaps also ronal correlates of a perceptual decision. Nature 341: 52–54. achieves some degree of prediction (see also CEREBRAL Newsome, W. T., K. H. Britten, C. D. Salzman, and J. A. Movshon. (1990). Neuronal mechanisms of motion perception. Cold CORTEX). Spring Harbor Symposia on Quantitative Biology 55: 697–705. Although we are far from being able to give a complete Olshausen, B. A., and D. J. Field. (1996). Emergence of simple- account of the physiological mechanisms that underlie even cell receptive-field properties by learning a sparse code for nat- the simplest examples of object recognition, the existence of ural images. Nature 381: 607–609. feature detecting neurons, and these theories about their Raiguel, S., M. M. Van Hulle, D.-K. Xiao, V. L. Marcar, and G. A. functional role, provide grounds for optimism. Orban. (1995). Shape and spatial distribution of receptive fields See also OBJECT RECOGNITION, ANIMAL STUDIES; OBJECT and antagonistic motion surrounds in the middle temporal area RECOGNITION, HUMAN NEUROPSYCHOLOGY; VISUAL ANAT- (V5) of the macaque. European Journal of Neuroscience 7: OMY AND PHYSIOLOGY; VISUAL PROCESSING STREAMS 2064–2082. Selfridge, O., and U. Neisser. (1960). Pattern recognition by —Horace Barlow machine. Scientific American 203(2): 60–68. 314 Features you pass the salt?”), or stating a simple fact (e.g., “It seems Suga, N. (1994). 
Multi-function theory for cortical processing of auditory information: implications of single-unit and lesion data for future research. Journal of Comparative Physiology A 175: 135–144.
Tinbergen, N. (1953). The Herring Gull’s World. London: Collins.
Waterman, T. H., and C. A. G. Wiersma. (1963). Electrical responses in decapod crustacean visual systems. Journal of Cellular and Comparative Physiology 61: 1–16.

Further Readings

Ballard, D. H. (1997). An Introduction to Natural Computation. Cambridge, MA: MIT Press.
Barlow, H. B. (1995). The neuron doctrine in perception. In M. Gazzaniga, Ed., The Cognitive Neurosciences. Cambridge, MA: MIT Press, pp. 415–435.

Features

See DISTINCTIVE FEATURES; FEATURE DETECTORS

Feedforward Networks

See PATTERN RECOGNITION AND FEEDFORWARD NETWORKS; SUPERVISED LEARNING IN MULTILAYER NEURAL NETWORKS

Figurative Language

Figurative language allows speakers/writers to communicate meanings that differ in various ways from what they literally say. People speak figuratively for reasons of politeness, to avoid responsibility for the import of what is communicated, to express ideas that are difficult to communicate using literal language, and to express thoughts in a compact and vivid manner. Among the most common forms of figurative language, often referred to as “tropes” or “figures of speech,” are metaphor, where ideas from dissimilar knowledge domains are either explicitly, in the case of simile (e.g., “My love is like a red, red rose”), or implicitly (e.g., “Our marriage is a roller-coaster ride”) compared; metonymy, where a salient part of a single knowledge domain is used to represent or stand for the entire domain (e.g., “The White House issued a statement”); idioms, where a speaker’s meaning cannot be derived from an analysis of the words’ typical meanings (e.g., “John let the cat out of the bag about Mary’s divorce”); proverbs, where speakers express widely held moral beliefs or social norms (e.g., “The early bird captures the worm”); irony, where a speaker’s meaning is usually, but not always, the opposite of what is said (e.g., “What lovely weather we’re having” stated in the midst of a rainstorm); hyperbole, where a speaker exaggerates the reality of some situation (e.g., “I have ten thousand papers to grade by the morning”); understatement, where a speaker says less than is actually the case (e.g., “John seems a bit tipsy” when John is clearly very drunk); oxymora, where two contradictory ideas/concepts are fused together (e.g., “When parting is such sweet sorrow”); and indirect requests, where speakers make requests of others in indirect ways by asking questions (e.g., “Can [...] cold in here” meaning “Go close the window”).

One traditional assumption, still held in some areas of cognitive science, is that figurative language is deviant and requires special cognitive processes to be understood. Whereas literal language can be understood via normal cognitive mechanisms, listeners must recognize the deviant nature of a figurative utterance before determining its nonliteral meaning (Grice 1989; Searle 1979). For instance, understanding a metaphorical comment, such as “Criticism is a branding iron,” requires that listeners first analyze what is stated literally, then recognize that the literal meaning (i.e., that criticism is literally a tool to mark livestock) is contextually inappropriate, and then infer some meaning consistent with the context and the idea that the speaker must be acting cooperatively and rationally (i.e., criticism can psychologically hurt the person who receives it, often with long-lasting consequences). This traditional view suggests, then, that figurative language should always be more difficult to process than roughly equivalent literal speech.

But the results of many psycholinguistic experiments have shown this idea to be false (see Gibbs 1994 for a review). Listeners/readers can often understand the figurative interpretations of metaphors, irony/sarcasm, idioms, proverbs, and indirect speech acts without having to first analyze and reject their literal meanings when these expressions are seen in realistic social contexts. People can read figurative utterances as quickly as, sometimes even more quickly than, they read literal uses of the same expressions in different contexts, or equivalent nonfigurative expressions. These experimental findings demonstrate that the traditional view of figurative language as deviant and ornamental, requiring additional cognitive effort to be understood, has little psychological validity. Although people may not always process the complete literal meanings of different figurative expressions before inferring their nonliteral interpretations, people may analyze aspects of word meaning as part of their understanding of what different phrases and expressions figuratively mean as wholes (Blasko and Connine 1993; Cacciari and Tabossi 1988). At the same time, listeners/readers certainly may slowly ponder the potential meanings of a figurative expression, such as the literary metaphor from Shakespeare, “The world is an unweeded garden.” It is this conscious experience that provides much of the basis for the mistaken assumption that figurative language always requires “extra work” to be properly understood.

A great deal of empirical research from all areas of cognitive science has accumulated on how people learn, produce, and understand different kinds of figurative language (see ANALOGY and METAPHOR). Several notable findings have emerged from this work. To give just a few examples, many idioms are analyzable with their individual parts contributing something to what these phrases figuratively mean, contrary to the traditional view (Gibbs 1994). People also learn and make sense of many conventional and idiomatic phrases, not as “frozen” lexical items, but because they tacitly recognize the metaphorical mapping of information between two conceptual domains (e.g., “John spilled the beans” maps our knowledge of someone tipping over a container of beans to a person revealing some previously hidden secret; Gibbs 1994). Ironic and sarcastic expressions are understood when listeners recognize the pretense underlying a speaker’s remark. For instance, a speaker who says “What lovely weather we’re having” in the midst of a rainstorm pretends to be an unseeing person, perhaps a weather forecaster, exclaiming about the beautiful weather to an unknown audience (Clark and Gerrig 1984). In many cases, ironic utterances accomplish their communicative intent by reminding listeners of some antecedent event or statement (Sperber and Wilson 1986), or by reminding listeners of a belief or social norm jointly held by a speaker and listener (Kreuz and Glucksberg 1989).

Some cognitive scientists now argue that metaphor, metonymy, irony, and other tropes are not linguistic distortions of literal, mental thought, but constitute basic schemes by which people conceptualize their experience and the external world (Gibbs 1994; Johnson 1987; Lakoff and Johnson 1980; Lakoff 1987; Lakoff and Turner 1989; Sweetser 1990; Turner 1991). Speakers cannot help but employ figurative language in conversation and writing because they conceptualize much of their experience through the figurative schemes of metaphor, metonymy, irony, and so on. Listeners often find figurative discourse easy to understand precisely because much of their thinking is constrained by figurative processes. For instance, people often talk about the concept of time in terms of the widely shared conceptual metaphor time is money (e.g., “I saved some time,” “I wasted my time,” “I invested time in the relationship,” “We can’t spare you any time”). These conventional expressions are not “dead metaphors,” but reflect metaphorical conceptualizations of experience that are very much alive and part of ordinary cognition, one reason why these same metaphors are frequently seen in novel expressions and poetic language (Lakoff and Turner 1989). There is much debate over whether people’s understanding of various conventional expressions, idioms, proverbs, and metaphors necessarily requires activation of underlying conceptual metaphors that may motivate the existence of these statements in the language (Gibbs 1994; Gibbs et al. 1997; Glucksberg, Brown, and McGlone 1993; McGlone 1996). Nevertheless, there is a growing appreciation from scholars in many fields that metaphors and other tropes not only serve as the foundation for much everyday thinking and reasoning, but also contribute to scholarly theory and practice in a variety of disciplines, as well as providing much of the foundation for our understanding of culture (see Fernandez 1991; Gibbs 1994).

See also CONCEPTS; MEANING; METAPHOR AND CULTURE; SEMANTICS

—Raymond W. Gibbs

References

Blasko, D., and C. Connine. (1993). Effects of familiarity and aptness of metaphor processing. Journal of Experimental Psychology: Learning, Memory and Cognition 19: 295–308.
Cacciari, C., and P. Tabossi. (1988). The comprehension of idioms. Journal of Memory and Language 27: 668–683.
Clark, H., and R. Gerrig. (1984). On the pretense theory of irony. Journal of Experimental Psychology: General 113: 121–126.
Fernandez, J., Ed. (1991). Beyond Metaphor: The Theory of Tropes in Anthropology. Stanford, CA: Stanford University Press.
Gibbs, R. (1994). The Poetics of Mind: Figurative Thought, Language and Understanding. New York: Cambridge University Press.
Gibbs, R., J. Bogdonovich, J. Sykes, and D. Barr. (1997). Metaphor in idiom comprehension. Journal of Memory and Language 36.
Glucksberg, S., and B. Keysar. (1990). Understanding metaphorical comparisons: beyond literal similarity. Psychological Review 97: 3–18.
Glucksberg, S., M. Brown, and M. McGlone. (1993). Conceptual metaphors are not automatically accessed during idiom comprehension. Memory and Cognition 21: 711–719.
Grice, H. P. (1989). Studies in the Way of Words. Cambridge, MA: Harvard University Press.
Happe, F. (1994). Understanding minds and metaphors: insights from the study of figurative language in autism. Metaphor and Symbolic Activity 10: 275–295.
Johnson, M. (1987). The Body in the Mind. Chicago: University of Chicago Press.
Kreuz, R., and S. Glucksberg. (1989). How to be sarcastic: the echoic reminder theory of verbal irony. Journal of Experimental Psychology: General 120: 374–386.
Lakoff, G. (1987). Women, Fire, and Dangerous Things. Chicago: University of Chicago Press.
Lakoff, G., and M. Johnson. (1980). Metaphors We Live By. Chicago: University of Chicago Press.
Lakoff, G., and M. Turner. (1989). More than Cool Reason: A Field Guide to Poetic Metaphor. Chicago: University of Chicago Press.
McGlone, M. (1996). Conceptual metaphors and figurative language interpretation: food for thought. Journal of Memory and Language 35: 544–565.
Searle, J. (1979). Metaphor. In A. Ortony, Ed., Metaphor and Thought. New York: Cambridge University Press, pp. 92–123.
Sperber, D. (1994). Understanding verbal understanding. In J. Khalfa, Ed., What is Intelligence? New York: Cambridge University Press, pp. 179–198.
Sperber, D., and D. Wilson. (1986). Relevance: Communication and Cognition. Oxford: Blackwell.
Sweetser, E. (1990). From Etymology to Pragmatics: The Mind-Body Metaphor in Semantic Structure and Semantic Change. Cambridge: Cambridge University Press.
Turner, M. (1991). Reading Minds: The Study of English in the Age of Cognitive Science. Princeton: Princeton University Press.

fMRI

See MAGNETIC RESONANCE IMAGING

Focus

The term focus is used to refer to the highlighting of parts of utterances for communicative purposes, typically by accent. For example, a question like Who did Mary invite for dinner? is answered by Mary invited BILL for dinner, not by Mary invited Bill for DINner (capitals mark the syllable with main accent, cf. STRESS, LINGUISTIC and PROSODY AND INTONATION). A contrastive statement like Mary didn’t invite BILL for dinner, but JOHN is also fine, whereas Mary didn’t invite Bill for DINner, but JOHN is odd. Finally, notice that Mary only invited BILL for dinner means something different from Mary only invited Bill for DINner. Expressions like only that depend on the choice of focus are said to associate with focus (Jackendoff 1972).

Focus is typically expressed in spoken language by pitch movement, duration, or intensity on a syllable (cf. Ladd 1996). In addition, we often find certain syntactic constructions, like cleft sentences (It was BILL that she invited for dinner). There are languages that make use of specific syntactic positions (e.g., the preverbal focus position in Hungarian), dedicated particles (e.g., Quechua), or syntactic movement of nonfocused expressions from their regular position (e.g., Catalan; cf. É Kiss 1995). In American Sign Language (see SIGN LANGUAGES), focus is marked by a nonmanual gesture, the brow raise (cf. Wilbur 1991).

Focus marking is often ambiguous, which gives rise to misunderstandings and jokes. When the notorious bank robber Willie Sutton was asked by a reporter, Why do you rob banks? he replied: Because that’s where the money is. The answer makes sense with focus on banks, but the intended focus clearly was on rob banks; Sutton was asked why he robs banks in contrast to doing other things. The focus is marked by accent on banks in both cases. In general, accent on a syntactic argument often helps to mark broad focus on predicate + argument (cf. Schmerling 1976; Gussenhoven 1984; Selkirk 1984). Take the difference between (a) John has PLANS to leave and (b) John has plans to LEAVE. (a) is understood as John has to leave plans, with plans as object argument, whereas (b) is understood as John plans to leave, with VP argument to leave. In both cases, plans to leave is in focus.

On the semantic side, one influential line of research has been to analyze focus as expressing what is new in an utterance (DISCOURSE; cf. Halliday 1967; Sgall, Hajicová, and Panenová 1986; Rochemont 1986). The question Who did Mary invite for dinner? can be answered by Mary invited BILL for dinner, inasmuch as it presupposes that Mary invited someone for dinner, and the new information is that this person was Bill. Consequently, Bill is accented, and the other constituents, which are given information, are deaccented. What should count as “given” often requires inferencing, as in the following example: Many tourists visit (a) Israel / (b) Jerusalem. When BILL arrived in the Holy City, all hotels were booked. In the (a) case, Holy City is accented, while in the (b) case, it is deaccented because it is mentioned before, though not literally.

Another influential research program sees focus as indicating the presence of alternatives to the item in focus (cf. Rooth 1992, 1995). For example, a question like Who did Mary invite for dinner? asks for answers of the form Mary invited X for dinner, where X varies over persons. The focus in the answer, Mary invited BILL for dinner, identifies a particular answer of this form. In general, focus on an expression marks the fact that alternatives to this expression are under consideration. This idea naturally also applies to the contrastive use of focus and to association with focus. A sentence like Mary invited BILL for dinner can be used in contrast to sentences of the type Mary invited X for dinner, where X applies to some alternative to Bill. And a sentence like Mary only invited BILL for dinner says that Mary did not invite any alternative to Bill to dinner. Other focus-sensitive operators can be explained similarly. For example, Mary also invited BILL for dinner presupposes that there is an alternative X to Bill such that Mary invited X for dinner is true. And Mary unfortunately invited BILL for dinner presupposes that there is an alternative X to Bill such that it would have been more fortunate for Mary to invite X for dinner. Sedivy et al. (1994) have used eyetracking techniques to observe the construction of such alternative sets during sentence processing.

The two lines of research sometimes lead to different analyses. Consider the following exchange: A: My car broke down. B: What did you do? A can answer with (a) I called a meCHAnic or with (b) I FIXed the car. If focus expresses newness, (a) should have focus on called a mechanic, and (b) should have focus just on fixed, as the car is given. But if focus indicates the presence of alternatives, (b) should have focus on fixed the car, as the question asks for an activity. The lack of accent on the car in (b) shows that even focus theories based on alternatives must allow for givenness as a factor in accentuation. Notice that there are expressions that are never accentuated, for example, the indefinite pronoun something, as in A: What did you do? B: I FIXed something.

Focus is of interest for the study of the SYNTAX-SEMANTICS INTERFACE, as focus-sensitive operators require a liberal understanding of the principle of COMPOSITIONALITY. Take the view that focus indicates the presence of alternatives. As the VPs only invited BILL for dinner and only invited Bill for DINner differ in meaning, the placement of focus must lead to differences in the interpretation of the embedded VP invited Bill for dinner. One proposal assumes that the item in focus is somehow made “visible,” for example by movement on the syntactic level of LOGICAL FORM (cf. MINIMALISM). In this theory, a sentence like Mary only invited BILL for dinner means something like “The only X such that invited X for dinner is true of Mary is Bill” (cf., e.g., von Stechow 1990; Jacobs 1991). A problem is that association with focus seems to disregard syntactic islands (cf. WH-MOVEMENT), as in Mary only invited [BILL’s mother] for dinner. Another proposal assumes that expressions in focus introduce alternatives, which leads to alternatives for the expressions with embedded focus constituents (“Alternative Semantics,” cf. Rooth 1992). Our example is analyzed as “The only predicate of the form invite X for dinner that applies to Mary is invite Bill for dinner.” In general, alternative semantics is more restrictive, but it may not be sufficient for more complex cases in which multiple foci are involved, as in, A: Mary only invited BILL for dinner. She also₁ only₂ invited BILL₂ for LUNCH₁, where the second sentence presupposes that there is another person X besides Bill such that Mary invited only Bill to X.

There is another use of the term focus, unrelated to the one discussed here, in which it refers to discourse referents that are salient at the current point of discourse and are potential antecedents for pronouns (cf. Grosz and Sidner 1986).

See also DYNAMIC SEMANTICS; SEMANTICS; SYNTAX

—Manfred Krifka

References

Grosz, B., and C. Sidner. (1986). Attention, intention and the structure of discourse. Journal of Computational Linguistics 12: 175–204.
Gussenhoven, C. (1984). On the Grammar and Semantics of Sentence Accent. Dordrecht: Foris.
Halliday, M. A. K. (1967). Notes on transitivity and theme in English, part 2. Journal of Linguistics 3: 199–244.
Jacobs, J. (1991). Focus ambiguities. Journal of Semantics 8: 1–36.
Jackendoff, R. (1972). Semantic Interpretation in Generative Grammar. Cambridge, MA: MIT Press.
É Kiss, K., Ed. (1995). Discourse Configurational Languages. Oxford: Oxford University Press.
Ladd, R. (1996). Intonational Phonology. Cambridge: Cambridge University Press.
Rochemont, M. (1986). Focus in Generative Grammar. Amsterdam: John Benjamins.
Rooth, M. (1992). A theory of focus interpretation. Natural Language Semantics 1: 75–116.
Rooth, M. (1995). Focus. In S. Lappin, Ed., Handbook of Contemporary Semantic Theory. London: Blackwell, pp. 271–298.
Schmerling, S. (1976). Aspects of English Sentence Stress. Austin: University of Texas Press.
Sedivy, J., G. Carlson, M. Tanenhaus, M. Spivey-Knowlton, and K. Eberhard. (1994). The cognitive function of contrast sets in processing focus constructions. In P. Bosch and R. van der Sandt, Eds., Focus and Natural Language Processing. IBM Deutschland Informationssysteme GmbH, Institute for Logic and Linguistics, pp. 611–620.
Selkirk, E. (1984). Phonology and Syntax: The Relation Between Sound and Structure. Cambridge, MA: MIT Press.
Sgall, P., E. Hajicová, and J. Panenová. (1986). The Meaning of the Sentence in Its Semantic and Pragmatic Aspects. Dordrecht: Reidel.
Von Stechow, A. (1990). Focusing and backgrounding operators. In W. Abraham, Ed., Discourse Particles. Amsterdam: John Benjamins, pp. 37–84.
Wilbur, R. (1991). Intonation and focus in American Sign Language. In Y. No and M. Libucha, Eds., ESCOL ’90: Proceedings of the Seventh Eastern States Conference on Linguistics. Columbus: Ohio State University Press, pp. 320–331.

Further Readings

Bayer, J. (1995). Directionality and Logical Form. On the Scope of Focussing Particles and Wh-in-Situ. Dordrecht: Kluwer.
König, E. (1991). The Meaning of Focus Particles: A Comparative Perspective. London: Routledge.
Lambrecht, K. (1994). Information Structure and Sentence Form. Topic, Focus and the Mental Representation of Discourse Referents. Cambridge: Cambridge University Press.
Selkirk, E. (1995). Sentence prosody. In J. A. Goldsmith, Ed., Handbook of Phonological Theory. London: Blackwell, pp. 550–569.
Taglicht, J. (1984). Message and Emphasis: On Focus and Scope in English. London: Longman.
Von Stechow, A. (1991). Current issues in the theory of focus. In A. v. Stechow and D. Wunderlich, Eds., Semantik: Ein internationales Handbuch der zeitgenössischen Forschung. Berlin: de Gruyter, pp. 804–825.
Winkler, S. (1997). Focus and Secondary Predication. Berlin: Mouton de Gruyter.

Folk Biology

Folk biology is the cognitive study of how people classify and reason about the organic world. Humans everywhere classify animals and plants into species-like groups as obvious to a modern scientist as to a Maya Indian. Such groups are primary loci for thinking about biological causes and relations (Mayr 1969). Historically, they provided a transtheoretical base for scientific biology in that different theories—including evolutionary theory—have sought to account for the apparent constancy of “common species” and the organic processes centering on them. In addition, these preferred groups have “from the most remote period . . . been classed in groups under groups” (Darwin 1859: 431). This taxonomic array provides a natural framework for inference, and an inductive compendium of information, about organic categories and properties. It is not as conventional or arbitrary in structure and content, nor as variable across cultures, as the assembly of entities into cosmologies, materials, or social groups. From the vantage of EVOLUTIONARY PSYCHOLOGY, such natural systems are arguably routine “habits of mind,” in part a natural selection for grasping relevant and recurrent “habits of the world.”

The relative contributions of mind and world to folk biology are current research topics in COGNITIVE ANTHROPOLOGY and COGNITIVE DEVELOPMENT (Medin and Atran 1998). Ethnobiology is the anthropological study of folk biology; a research focus is folk taxonomy, which describes the hierarchical structure, organic content, and cultural function of folk biological classifications the world over. Naive biology is the psychological study of folk biology in industrialized societies; a research focus is category-based induction, which concerns how children and adults learn about, and reason from, biological categories.

Ethnobiology roughly divides into adherents of cultural universals versus CULTURAL RELATIVISM (debated also as “intellectualism” versus “utilitarianism,” Brown 1995). Universalists highlight folk taxonomic principles that are only marginally influenced by people’s needs and uses to which taxonomies are put (Berlin 1992). Relativists emphasize those structures and contents of folk biological categories that are fashioned by cultural interest, experience, and use (Ellen 1993). Universalists grant that even within a culture there may be different special-purpose classifications (beneficial/noxious, domestic/wild, edible/inedible, etc.). However, there is only one cross-culturally universal kind of general-purpose taxonomy, which supports the widest possible range of inductions about living kinds. This distinction between special- and general-purpose folk biological classifications parallels the distinction in philosophy of science between artificial versus natural classification (Gilmour and Walters 1964).

A culture’s general-purpose folk taxonomy is composed of a stable hierarchy of inclusive groups of organisms, or taxa, which are mutually exclusive at each level of the hierarchy. These absolutely distinct levels, or ranks, are: folk kingdom (e.g., animal, plant), life form (e.g., bug, fish, bird, mammal/animal, tree, herb/grass, bush), generic species (gnat, shark, robin, dog, oak, clover, holly), folk specific (poodle, white oak), and folk varietal (toy poodle; swamp white oak). Ranking is a cognitive mapping that projects living kind categories onto fundamentally different levels of reality. Ranks, not taxa, are universal. Taxa of the same rank tend to display similar linguistic, psychological, and biological characteristics. For example, most generic species are labeled by short, simple words (i.e., unanalyzable lexical stems: “oak,” “dog”). In contrast, subordinate specifics are usually labeled binomially (i.e., attributive + lexical stem: “white oak”) unless culturally very salient (in which case they may also merit simple words: “poodle,” “collie”). Relativists agree there is a preferred taxonomic level roughly corresponding to that of the scientific species (e.g., dog) or genus (e.g., oak). Phenomenally salient species for humans, including most species of large vertebrates and trees, belong to monospecific genera in any given locale, hence the term “generic species” for this preferred taxonomic level (also called “folk generic” or “specieme”). Nevertheless, relativists note that even in seemingly general-purpose taxonomies, categories superordinate or subordinate to generic species can reflect “special-purpose” distinctions of cultural practice and expertise. For example, the Kalam of New Guinea deny that cassowaries fall under the bird life form, not only because flightless cassowaries are physically unlike other birds, but also because they are ritually prized objects of the hunt (Bulmer 1967).

Universalism in folk biology may be further subdivided into tendencies that parallel philosophical and psychological distinctions between RATIONALISM VS. EMPIRICISM (Malt 1995). Empiricists claim that universal structures of folk taxonomy owe primarily to perceived structures of “objective discontinuities” in nature rather than to the mind’s conceptual structure. On this view, the mind/brain merely provides domain-general mechanisms for assessing perceptual similarities, which are recursively applied to produce the embedded similarity-structures represented in folk taxonomy (Hunn 1976). Rationalists contend that higher-order cognitive principles are needed to produce regularities in folk biological structures (Atran 1990). For example, one pair of principles is that every object is either an animal or plant or neither, and that no animal or plant can fail to belong uniquely to a generic species. Thus, the rank of folk kingdom—the level of plant and animal—is a category of people’s intuitive ontology, and conceiving an object as plant or animal entails notions about generic species that are not applied to objects thought to belong to other ontological categories, such as person, substance, or artifact. Although such principles may be culturally universal, cognitively compelling, and adaptive for everyday life, they no longer neatly accord with the known scientific structure of the organic world.

In the study of naive biology, disagreement arises over whether higher-order principles evince strong or weak NATIVISM; that is, whether they reflect the innate modularity and DOMAIN SPECIFICITY of folk biology (Inagaki and Hatano 1996), or are learned on the basis of cognitive principles inherent to other domains, such as NAIVE PHYSICS or FOLK PSYCHOLOGY (Carey 1995). One candidate for a domain-specific principle involves a particular sort of ESSENTIALISM, which carries an invariable presumption that the various members of each generic species share a unique underlying nature, or biological essence. Such an essence may be considered domain-specific insofar as it is an intrinsic (i.e., nonartifactual) teleological agent, which physically (i.e., nonintentionally) causes the biologically relevant parts and properties of a generic species to function and cohere “for the sake of” the generic species itself. Thus, American preschoolers consistently judge that thorns on a rose bush exist for the sake of there being more roses, whereas physically similar depictions of barbs on barbed wire or the protuberances of a jagged rock do not elicit indications of inherent purpose and design (Keil 1994). People everywhere expect the disparate properties of a generic species to be integrated without having to know the precise causal chains linking universally recognized relationships of morpho-behavioral functioning, inheritance and reproduction, disease and death.

This essentialist concept shares features with the broader philosophical notion NATURAL KIND in regard to category-based induction. Thus, on learning that one cow is susceptible to “mad cow” disease, one might reasonably infer that all cows, but not all mammals or animals, are susceptible to the disease. This is presumably because disease is related to “deep” biological properties, and because cow is a generic species with a fairly uniform distribution of such properties. The taxonomic arrangement of generic species systematically extends this inductive power: it is more “natural” to infer a greater probability that all mammals share the disease than that all animals do. Taxonomic stability allows formulation of a general principle of biological induction: a property found in two organisms is most likely found in all organisms belonging to the lowest-ranked taxon containing the two. This powerful inferential principle also underlies systematics, the scientific classification of organic life (Warburton 1967). Still, relativists can point to cultural and historical influences on superordinate and subordinate taxa as suggesting that biologically relevant properties can be weighted differently for induction in different traditions.

See also CONCEPT; COLOR CLASSIFICATION; NAIVE SOCIOLOGY

—Scott Atran

References

Atran, S. (1990). Cognitive Foundations of Natural History. Cambridge: Cambridge University Press.
Berlin, B. (1992). Ethnobiological Classification. Princeton: Princeton University Press.
Brown, C. (1995). Lexical acculturation and ethnobiology: utilitarianism and intellectualism. Journal of Linguistic Anthropology 5: 51–64.
Bulmer, R. (1967). Why is the cassowary not a bird? Man 2: 5–25.
Carey, S. (1995). On the origins of causal understanding. In D. Sperber, D. Premack, and A. Premack, Eds., Causal Cognition. Oxford: Clarendon Press.
Darwin, C. (1859). On the Origin of Species by Means of Natural Selection. London: Murray.
Ellen, R. (1993). The Cultural Relations of Classification. Cambridge: Cambridge University Press.
Gilmour, J., and S. Walters. (1964). Philosophy and classification. In W. Turrill, Ed., Vistas in Botany, vol. 4: Recent Researches in Plant Taxonomy. Oxford: Pergamon Press.
Hunn, E. (1976). Toward a perceptual model of folkbiological classification. American Ethnologist 3: 508–524.
Inagaki, K., and G. Hatano. (1996). Young children’s recognition of commonalities between plants and animals. Child Development 67: 2823–2840.
Keil, F. (1994). The birth and nurturance of concepts by domains. In L. Hirschfeld and S. Gelman, Eds., Mapping the Mind: Domain Specificity in Cognition and Culture. New York: Cambridge University Press.
Malt, B. (1995). Category coherence in crosscultural perspective. Cognitive Psychology 29: 85–148.
Mayr, E. (1969). Principles of Systematic Zoology. New York: McGraw-Hill.
Medin, D., and S. Atran, Eds. (1998). Folk Biology. Cambridge, MA: MIT Press.
Warburton, F. (1967). The purposes of classification. Systematic Zoology 16: 241–245.

Folk Psychology

In recent years, folk psychology has become a topic of debate not just among philosophers, but among developmental psychologists and primatologists as well. Yet there are two different things that “folk psychology” has come to mean, and they are not always distinguished: (1) commonsense psychology that explains human behavior in terms of beliefs, desires, intentions, expectations, preferences, hopes, fears, and so on; (2) an interpretation of such everyday explanations as part of a folk theory, comprising a network of generalizations employing concepts like belief, desire, and so on. The second definition—suggested by Sellars (1963) and dubbed “theory-theory” by Morton (1980)—is a philosophical account of the first.

Folk psychology(1) concerns the conceptual framework of explanations of human behavior: If the explanatory framework of folk psychology(1) is correct, then “because Nan wants the baby to sleep,” which employs the concept of wanting, may be a good (partial) explanation of Nan’s turning the TV off. Folk psychology(2) concerns how folk-psychological(1) explanations are to be interpreted: If folk psychology(2) is correct, then “because Nan wants the baby to sleep” is an hypothesis that Nan had an internal (brain) state of wanting the baby to sleep and that state caused Nan to turn the TV off.

Although the expression folk psychology came to prominence as a term for theory-theory, that is, folk psychology(2), it is now used more generally to refer to commonsense psychology, that is, folk psychology(1). This largely unnoticed broadening of the term has made for confusion in the literature. Folk psychology (in one or the other sense, or sometimes equivocally) has been the focus of two debates.

The first is the so-called use issue: What are people doing when they explain behavior in terms of beliefs, desires, and so on? Some philosophers (Goldman 1993; Gordon 1986) argue that folk psychology, in sense (1), is a matter of simulation. Putting it less precisely than either Goldman or Gordon would, to use commonsense psychology is to exercise a skill; to attribute a belief is to project oneself into the situation of the believer. The dominant view, however, is that users of concepts like believing, desiring, intending—folk psychology(1)—are deploying a theory—folk psychology(2). To attribute a belief is to make an hypothesis about the internal state of the putative believer. Some psychologists (e.g., Astington, Harris, and Olson 1988) as well as philosophers simply assume the theory-theory interpretation, and some, though not all, fail to distinguish between folk psychology(1) and folk psychology(2).

The second is the so-called status issue. To what extent is the commonsense belief/desire framework correct? The “status” issue has turned on this question: To what extent will science vindicate (in some relevant sense) commonsense psychology? The question of scientific vindication arises when commonsense psychology is understood as folk psychology(2). On one side are intentional realists like Fodor (1987) and Dretske (1987), who argue that science will vindicate the conceptual framework of commonsense psychology. On the other side are proponents of ELIMINATIVE MATERIALISM like Churchland (1981) and Stich (1983), who argue that as an empirical theory, commonsense psychology is susceptible to replacement by a better theory with radically different conceptual resources (but see Stich 1996 for a revised view). Just as other folk theories (e.g., FOLK BIOLOGY) have been overthrown by scientific theories, we should be prepared for the overthrow of folk psychology by a scientific theory—scientific psychology or neuroscience. Eliminative materialists make the empirical prediction that science very probably will not vindicate the framework of commonsense psychology.

The question of scientific vindication, however, does not by itself decide the “status” issue. To see this, consider an argument for eliminative materialism (EM):

a. Folk psychology will not be vindicated by a physicalistic theory (scientific psychology or neuroscience).
b. Folk psychology is correct if and only if it is vindicated (in some relevant sense) by a physicalistic theory.
So,
c. Folk psychology is incorrect.

Premise (b), which plays an essential role in the argument, has largely been neglected (but see Baker 1995; Horgan and Graham 1991). If premise (b) refers to folk psychology(2), then premise (b) is plausible; but then the conclusion would establish only that commonsense psychology interpreted as a theory is incorrect. However, if premise (b) refers to folk psychology(1), then premise (b) is very probably false. If folk psychology is not a putative scientific theory in the first place, then there is no reason to think that a physicalistic theory will reveal it to be incorrect. (Similarly, if cooking, say, is not a scientific theory in the first place, then we need not fear that chemistry will reveal that you cannot really bake a cake.) So, the most that (EM) could show would be that if the theory-theory is the correct philosophical account of folk psychology(1), then folk psychology is a false theory. (EM) would not establish the incorrectness of commonsense psychology on other philosophical accounts (as, say, understood in terms of Aristotle’s account of the practical syllogism).

Other positions on the “status” issue include these: commonsense psychology—folk psychology(1)—will be partly confirmed and partly disconfirmed by scientific psychology (von Eckardt 1994, 1997); commonsense psychology is so robust that we should affirm its physical basis regardless of the course of scientific psychology (Heil 1992); commonsense psychology is causal, and hence, though attributions of attitudes are interpretive and normative, explanations of behavior in terms of attitudes are backed by strict laws (Davidson 1980); commonsense psychology is useless as science, but remains useful in everyday life (Dennett 1987; Wilkes 1991). Still others (Baker 1995; Horgan and Graham 1991) take the legitimacy of commonsense psychology to be borne out in everyday cognitive practice—regardless of the outcome of scientific psychology or neuroscience.

See also AUTISM; FUNCTIONALISM; INTENTIONALITY; LANGUAGE OF THOUGHT; PHYSICALISM; PROPOSITIONAL ATTITUDES; SIMULATION VS. THEORY-THEORY; THEORY OF MIND

Further Readings

Baker, L. R. (1988). Saving Belief: A Critique of Physicalism. Princeton: Princeton University Press.
Burge, T. (1979). Individualism and the mental. Studies in Metaphysics: Midwest Studies in Philosophy, vol. 4. Minneapolis: University of Minnesota Press.
Churchland, P. S. (1988). Neurophilosophy: Toward a Unified Science of the Mind/Brain. Cambridge, MA: MIT Press.
Dennett, D. C. (1978). Brainstorms: Philosophical Essays on Mind and Psychology. Montgomery, VT: Bradford Books.
Fodor, J. A. (1990). A Theory of Content and Other Essays. Cambridge, MA: MIT Press.
Goldman, A. (1989). Interpretation psychologized. Mind and Language 4: 161–185.
Graham, G. L., and T. Horgan. (1988). How to be realistic about folk psychology. Philosophical Psychology 1: 69–81.
Greenwood, J. D. (1991).
The Future of Folk Psychology: Inten- —Lynne Rudder Baker tionality and Cognitive Science. Cambridge: Cambridge Uni- versity Press. References Horgan, T., and J. Woodward. (1985). Folk psychology is here to stay. Philosophical Review 94: 197–225. Astington, J. W., P. L. Harris, and D. R. Olson, Eds. (1988). Devel- Kitcher, P. (1984). In defense of intentional psychology. Journal of oping Theories of Mind. Cambridge, MA: Cambridge Univer- Philosophy 71: 89–106. sity Press. Lewis, D. (1972). Psychophysical and theoretical identifications. Baker, L. R. (1995). Explaining Attitudes: A Practical Approach to Australasian Journal of Philosophy 50: 249–258. the Mind. Cambridge, MA: Cambridge University Press. Premack, D., and G. Woodruff. (1978). Does the chimpanzee have Churchland, P. M. (1981). Eliminative Materialism and the Propo- a theory of mind? Behavioral and Brain Sciences 1: 515–526. sitional Attitudes. Journal of Philosophy 78: 67–90. Putnam, H. (1988). Representation and Reality. Cambridge, MA: Churchland, P. M. (1989). Eliminative materialism and the propo- MIT Press. sitional attitudes. In A Neurocomputational Perspective: The Ramsey, W., S. Stich, and J. Garon. (1990). Connectionism, elimi- Nature of Mind and the Structure of Science. Cambridge, MA: nativism, and the future of folk psychology. In J. E. Tomberliin, MIT Press. Ed., Action Theory and Philosophy of Mind—Philosophical Davidson, D. (1980). Essays on Actions and Events. Oxford: Clar- Perspectives 4. Atascadero, CA: Ridgeview. endon Press. Ryle, G. (1949). The Concept of Mind. London: Hutchinson. Dennett, D. C. (1987). The Intentional Stance. Cambridge, MA: Searle, J. (1980). Minds, brains and programs. Behavioral and MIT Press. Brain Sciences 3: 417–424. Dretske, F. (1987). Explaining Behavior: Reasons in a World of Wellman, H. (1990). The Child’s Theory of Mind. Cambridge, Causes. Cambridge, MA: MIT Press. MA: MIT Press. Fodor, J. A. (1987). 
Psychosemantics: The Problem of Meaning in the Philosophy of Mind. Cambridge, MA: MIT Press. Goldman, A. I. (1993). The Psychology of Folk Psychology. Form/Content Behavioral and Brain Sciences 16: 15–28. Gordon, R. M. (1986). Folk Psychology as Simulation. Mind and Language 1: 158–171. See COMPUTATIONAL THEORY OF MIND; INTENTIONALITY; Heil, J. (1992). The Nature of True Minds. Cambridge, MA: Cam- LOGICAL FORM, ORIGINS OF bridge University Press. Horgan, T., and G. Graham. (1991). In defense of southern funda- Formal Grammars mentalism. Philosophical Studies 62: 107–134. Morton, A. (1980). Frames of Mind: Constraints on the Common- sense Conception of the Mental. Oxford: Clarendon Press. A grammar is a definition of a set of linguistic structures, Sellars, W. (1963). Empiricism and the philosophy of mind. In Sci- where the linguistic structures could be sequences of words ence, Perception and Reality. London: Routledge and Kegan (sentences), or “sound-meaning” pairs (that is, pairs Paul. where s is a representation of phonetic properties and m is a Stich, S. P. (1983). From Folk Psychology to Cognitive Science: representation of semantic properties), or pairs where The Case Against Belief. Cambridge, MA: MIT Press. t is a tree and p is a probability of occurrence in a discourse. Stich, S. P. (1996). Deconstructing the Mind. Oxford: Oxford Uni- versity Press. A formal grammar, then, is a grammar that is completely von Eckardt, B. (1994). Folk psychology and scientific psychol- clear and unambiguous. Obviously, this account of what ogy. In S. Guttenplan, Ed., A Companion to the Philosophy of qualifies as “formal” is neither formal nor rigorous, but in Mind. Oxford: Blackwell, pp. 300–307. practice there is little dispute. von Eckardt, B. (1997). The empirical naivete of the current philo- It might seem that formalization would always be desir- sophical conception of folk psychology. In M. Carrier and P. K. 
able in linguistic theory, but there is little point in spelling Machamer, Eds., Mindscapes: Philosophy, Science, and the out the details of informal hypotheses when their weak- Mind. Pittsburgh, PA: University of Pittsburgh Press, pp. 23–51. nesses can readily be ascertained and addressed without Wilkes, K. V. (1991). The relationship between scientific psychol- working out the details. In fact, there is considerable varia- ogy and common-sense psychology. Synthese 89: 15–39. Formal Grammars 321 tion in the degree to which empirical proposals about human become prominent in OPTIMALITY THEORY. Parts of the grammars are formalized, and there are disputes in the liter- transformational grammar tradition called MINIMALISM have ature about how much formalization is appropriate at this a fundamentally similar character, with constraints that can stage of linguistic theory (Pullum 1989; Chomsky 1990). be violated when there is no better option. Given this controversy, and given the preliminary and The various different kinds of formal grammars have all changing nature of linguistic theories, formal studies of been studied formally, particularly with regard to the com- grammar have been most significant when they have plexities of sets they can define, and with regard to the suc- focused not on the details of any particular grammar, but cinctness of the definitions of those sets. rather on the fundamental properties of various kinds of The study of various kinds of formal grammars has led grammars. Taking this abstract, metagrammatical approach, linguists to consider whether we can determine, in advance formal studies have identified a number of basic properties of knowing in full detail what the grammars for human lan- of grammars that raise new questions about human lan- guages are like, whether human languages have one or guages. 
another of the fundamental properties that are well under- One basic division among the various ways of defining stood in the context of artificial languages. Regardless of sets of linguistic structures classifies them as generative or how a set of linguistic structures is defined, whether by a constraint-based. A GENERATIVE GRAMMAR defines a set generative grammar or constraint grammar or over- of structures by providing some basic elements and apply- constrained grammar, we can consider questions like the ing rules to derive new elements. Again, there are two following about the complexity of the defined sets of basic ways of doing this. The first approach, common in “grammatical” structures. (Of course, such questions apply “formal language theory” involves beginning with a “cate- only to formal grammars for which some significant crite- gory” like “sentence,” applying rules that define what parts rion of “grammaticality” can be formulated, which is a con- the sentence has, what parts those parts have, and so on troversial empirical question.) until the sentence has been specified all the way down to Is the set of linguistic structures finite? Although there the level of words. This style of language definition has are obvious practical limitations on the lengths of sentences proven to be very useful, and many fundamental results that any human will ever pronounce, these bounds do not have been established (Harrison 1978; Rozenberg and seem to be linguistic in nature, but rather derive from limita- Salomaa 1997). tions in our life span, requirements for sleep, and so on. 
As A second “bottom-up” approach, more common in CATE- far as the grammars of natural languages go, there seems to GORIAL GRAMMAR and some related traditions, involves be no longest sentence, and consequently no maximally starting with some lexical items (“generators”) and then complex linguistic structure, and we can conclude that applying rules to assemble them into more complex struc- human languages are infinite. This assumption makes all of tures. This style of language definition comes from LOGIC the following questions more difficult, because it means that and algebra, where certain sets are similarly defined by the languages that people speak contain structures that no “closing” a set of basic elements with respect to some gen- human will ever produce or understand. This basic point is erating relations. In these formal grammars, the structures also one of the basic motivations for the competence/perfor- of the defined language are analogous to the theorems of mance distinction. formal logic in that they are derived from some specified Is the set of linguistic structures “recursive”? That is, is basic elements by rigorously specified rules. A natural step there an algorithm that can effectively decide whether a from this idea is to treat a grammar explicitly as a logic given structure is in the set or not? This question is more (Lambek 1958; Moortgat 1997). interesting than it looks, because the mere fact that humans Unlike the generative methods, which define a language use languages does not show they are recursive (Matthews by applying rules to a set of initial elements of some kind, a 1979). However, there seems to be no good reason to constraint grammar specifies a set by saying what proper- assume that languages are not recursive. ties the elements of that set must have. In this sort of defini- Is the set of linguistic structures one that is recognized tion, the structures in the language are not like the by a finite computing device? 
As Chomsky (1956) pointed (generated, enumerable) theorems of a logic, but more like out, as far as the principles of language are concerned, not the sentences that could possibly be true (the “satisfiable” only are there sentences that are too long for a human to sentences of a logic). This approach to grammar is particu- pronounce, but there are sentences that require more larly prominent in linguistic traditions like HEAD-DRIVEN memory to recognize than humans have. To recognize an arbitrary sentence of a human language requires infinite PHRASE STRUCTURE GRAMMAR (Pollard and Sag 1994). memory, just as computing arbitrary multiplication prob- However, most linguistic theories use both generative and lems does. (In a range of important cases, the complexity constraint-based specifications of structure. of grammatical descriptions of formal languages corre- Recently, linguists have also shown interest in a special sponds to the complexity of the machines needed to rec- variety of constraint grammar that is sometimes called ognize those languages, as we see here. The study of “over-constrained.” These grammars impose constraints formal languages overlaps extensively with the theory of that cannot all be met, and the interest then is in the linguis- tic structures that meet the constraints of the grammar to the AUTOMATA.) greatest degree, the structures that are “most economical” or Is the set of linguistic structures generated by a “context “most optimal” in some sense. Systems of this kind have free grammar”? Because “context free grammars” can been studied in a variety of contexts, but have recently define languages that cannot be accepted by a finite machine, 322 Formal Systems, Properties of this more technical question has received a lot of attention, Further Readings particularly in studies of the syntax of human languages. 
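
The contrast between context free grammars and finite machines can be illustrated with a toy case. The sketch below is only an illustration under stated assumptions: the grammar S -> a S b | (empty), which defines the strings a^n b^n, is a standard textbook example rather than anything drawn from a linguistic theory, and the function names `generate` and `recognize` and the `limit` parameter are hypothetical. The recognizer needs a counter that can grow without bound; capping the counter, as any finite machine must, makes it reject well-formed strings beyond the cap.

```python
def generate(n):
    """Derive a^n b^n by applying the rule S -> a S b n times, then S -> (empty)."""
    form = "S"
    for _ in range(n):
        form = form.replace("S", "aSb")
    return form.replace("S", "")


def recognize(s, limit=None):
    """Accept strings of the form a^n b^n (n >= 0).

    With limit=None the counter is unbounded. With an integer limit the
    counter may never exceed that value, which mimics a machine that has
    only finitely many states available for counting.
    """
    depth = 0        # number of a's not yet matched by a b
    seen_b = False   # once a b has been read, no further a is allowed
    for ch in s:
        if ch == "a":
            if seen_b:
                return False
            depth += 1
            if limit is not None and depth > limit:
                return False  # the bounded machine has run out of states
        elif ch == "b":
            seen_b = True
            depth -= 1
            if depth < 0:
                return False
        else:
            return False
    return depth == 0


if __name__ == "__main__":
    for n in range(6):
        s = generate(n)
        print(repr(s), recognize(s), recognize(s, limit=3))
```

Running the script shows the unbounded recognizer accepting every generated string, while the version capped at 3 stops accepting once n exceeds the cap. Whatever cap is chosen, some longer string of the same shape is missed, which is the sense in which this language lies beyond any finite machine.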
The received view is that the sentences of natural languages are not generally definable by context free grammars (Chomsky 1956; Savitch et al. 1987). So the human languages seem to be more complex than context free languages, but still recursive. There have been a number of attempts to pin the matter down more precisely (Joshi 1985; Vijay-Shanker and Weir 1994).

Is the set of linguistic structures “efficiently parsable”? That is, roughly, can the structures be computed efficiently from their spoken or written forms? For human languages, the answer appears to be negative, as argued for example in Barton, Berwick, and Ristad (1987) and in Ristad (1993), but the matter remains controversial. Questions like this one are at the foundations of work in NATURAL LANGUAGE PROCESSING.

Is a set of formal languages “learnable”? Results in formal studies of learning often depend on the particular sets that are learned (in some precise sense of “learned”), and so formal grammars play a fundamental role in LEARNING SYSTEMS; COMPUTATIONAL LEARNING THEORY; SUPERVISED LEARNING; UNSUPERVISED LEARNING; MACHINE LEARNING; STATISTICAL LEARNING THEORY.

Edward Stabler

References

Barton, E., R. C. Berwick, and E. S. Ristad. (1987). Computational Complexity and Natural Language. Cambridge, MA: MIT Press.
Chomsky, N. (1956). Three models for the description of language. IRE Transactions on Information Theory IT-2: 113–124.
Chomsky, N. (1990). On formalization and formal linguistics. Natural Language and Linguistic Theory 8: 143–147.
Harrison, M. A. (1978). Introduction to Formal Language Theory. Reading, MA: Addison-Wesley.
Joshi, A. (1985). How much context-sensitivity is necessary for characterizing structural descriptions? In D. Dowty, L. Karttunen, and A. Zwicky, Eds., Natural Language Processing: Theoretical, Computational and Psychological Perspectives. New York: Cambridge University Press.
Lambek, J. (1958). The mathematics of sentence structure. American Mathematical Monthly 65: 154–170.
Matthews, R. (1979). Do the grammatical sentences of language form a recursive set? Synthese 40: 209–224.
Moortgat, M. (1997). Categorial type logics. In J. van Benthem and A. ter Meulen, Eds., Handbook of Logic and Language. New York: Elsevier.
Pollard, C., and I. Sag. (1994). Head-driven Phrase Structure Grammar. Chicago: University of Chicago Press.
Pullum, G. K. (1989). Formal linguistics meets the boojum. Natural Language and Linguistic Theory 7: 137–143.
Ristad, E. S. (1993). The Language Complexity Game. Cambridge, MA: MIT Press.
Rozenberg, G., and A. Salomaa, Eds. (1997). Handbook of Formal Languages. New York: Springer.
Savitch, W. J., E. Bach, W. Marsh, and G. Safran-Naveh, Eds. (1987). The Formal Complexity of Natural Language. Boston: Reidel.
Vijay-Shanker, K., and D. Weir. (1994). The equivalence of four extensions of context free grammar formalisms. Mathematical Systems Theory 27: 511–545.

Further Readings

Basic Mathematical Properties of Generative Grammars

Harrison, M. A. (1978). Introduction to Formal Language Theory. Reading, MA: Addison-Wesley.
Hopcroft, J. E., and J. D. Ullman. (1979). Introduction to Automata Theory, Languages and Computation. Reading, MA: Addison-Wesley.
Moll, R. N., M. A. Arbib, and A. J. Kfoury. (1988). An Introduction to Formal Language Theory. New York: Springer.

On Formal Properties of Human Languages

Barton, E., R. C. Berwick, and E. S. Ristad. (1987). Computational Complexity and Natural Language. Cambridge, MA: MIT Press.
Savitch, W. J., E. Bach, W. Marsh, and G. Safran-Naveh, Eds. (1987). The Formal Complexity of Natural Language. Boston: Reidel.

Basic Mathematical Properties of Some Linguistic Theories

Perrault, R. C. (1983). On the mathematical properties of linguistic theories. Proceedings of the Association for Computational Linguistics 21: 98–105.

Recent Work on Grammars-as-Logic

Moortgat, M. (1997). Categorial type logics. In J. van Benthem and A. ter Meulen, Eds., Handbook of Logic and Language. New York: Elsevier.

Recent Work on Over-Constrained Systems

Jampel, M., E. Freuder, and M. Maher, Eds. (1996). Over-Constrained Systems. New York: Springer.
Prince, A., and P. Smolensky. (1993). Optimality Theory: Constraint Interaction in Generative Grammar. Technical Report 2. Center for Cognitive Science, Rutgers University.

Formal Systems, Properties of

Formal systems or theories must satisfy requirements that are sharper than those imposed on the structure of theories by the axiomatic-deductive method, which can be traced back to Euclid’s Elements. The crucial additional requirement is the regimentation of inferential steps in proofs: not only axioms have to be given in advance, but also the logical rules representing argumentative steps. To avoid a regress in the definition of proof and to achieve intersubjectivity on a minimal basis, the rules are to be “mechanical” and must take into account only the syntactic form of statements. Thus, to exclude any ambiguity, a precise symbolic language and a logical calculus are needed. Both the concept of a “formula” (i.e., statement in the symbolic language) and that of a “rule” (i.e., inference step in the logical calculus) have to be effective; by the Church-Turing Thesis, this means they have to be recursive.

FREGE (1879) presented a symbolic language (with relations and quantifiers) together with an adequate logical calculus, thus providing the means for the completely formal representation of mathematical proofs. The Fregean framework was basic for the later development of mathematical logic; it influenced the work of Whitehead and Russell that culminated in Principia Mathematica. The next crucial step was taken most vigorously by Hilbert; he built on Whitehead and Russell’s work and used an appropriate framework for the development of parts of mathematics, but took it also as an object of mathematical investigation. The latter metamathematical perspective proved to be extremely important. Clearly, in a less rigorous way it goes back to the investigations concerning non-Euclidean geometry and Hilbert’s own early work (1899) on independence questions in geometry.

Hilbert’s emphasis on the mathematical investigation of formal systems really marked the beginning of mathematical logic. In the lectures (1918), prepared in collaboration with Paul Bernays, he isolated the language of first order logic as the central language (together with an informal semantics) and developed a suitable logical calculus. Central questions were raised and partially answered; they concerned the completeness, consistency, and decidability of such systems and are still central in mathematical logic and other fields, where formal systems are being explored. Some important results will be presented paradigmatically; for a real impression of the richness and depth of the subject readers have to turn to (classical) textbooks or to up-to-date handbooks (see Further Readings).

Completeness has been used in a number of different senses, from the quasi-empirical completeness of Zermelo-Fraenkel set theory (being sufficient for the formal development of mathematics) to the syntactic completeness of formal theories (shown to be impossible by Gödel’s First Theorem for theories containing a modicum of number theory). For logic the central concept is, however, semantic completeness: a calculus is (semantically) complete if it allows one to prove all statements that are true in all interpretations (models) of the system. In sentential logic these statements are the tautologies; for that logic Hilbert and Bernays (1918) and Post (1921) proved the completeness of appropriate calculi; for first order logic completeness was established by Gödel (1930). Completeness expresses obviously the adequacy of a calculus to capture all logical consequences and entails almost immediately the logic’s compactness: if every finite subset of a system has a model, so does the system. Ironically, this immediate consequence of its adequacy is at the root of real inadequacies of first order logic: the existence of nonstandard models for arithmetic and the inexpressibility of important concepts (like “finite,” “well-order”). The relativity of “being countable” (leading to the so-called Skolem paradox) is a direct consequence of the proof of the completeness theorem.

Relative consistency proofs were obtained in geometry by semantic arguments: given a model of Euclidean geometry one can define a Euclidean model of, say, hyperbolic geometry; thus, if an inconsistency could be found in hyperbolic geometry it could also be found in Euclidean geometry. Hilbert formulated as the central goal of his program to establish by elementary, so-called finitist means the consistency of formal systems. This involved a direct examination of formal proofs; the strongest results before 1931 were obtained by Ackermann, VON NEUMANN, and Herbrand: they established the consistency of number theory with a very restricted induction principle. A basic limitation had indeed been reached, as was made clear by Gödel’s Second Theorem; see GÖDEL’S THEOREMS. Modern proof theory, by using stronger than finitist but still “constructive” means, has been able to prove the consistency of significant parts of analysis. In pursuing this generalized consistency program, important insights have been gained into structural properties of proofs in special calculi (“normal form” of proofs in sequent and natural deduction calculi; cf. Gentzen 1934–35, 1936; Prawitz 1966). These structural properties are fundamental not only for modern LOGICAL REASONING SYSTEMS, but also for interesting psychological theories of human reasoning (see DEDUCTIVE REASONING and Rips 1994).

Hilbert’s Entscheidungsproblem, the decision problem for first order logic, was one issue that required a precise characterization of “effective methods”; see CHURCH-TURING THESIS. Though partial positive answers were found during the 1920s, Church and TURING proved in 1936 that the general problem is undecidable. The result and the techniques involved in its proof (not to mention the very mathematical notions) inspired the investigation of the recursion theoretic complexity of sets that led at first to the classification of the arithmetical, hyperarithmetical, and analytical hierarchies, and later to that of the computational complexity classes.

Some general questions and results were described for particular systems; as a matter of fact, questions and results that led to three branches of modern logic: model theory, proof theory, and computability theory. However, to reemphasize the point, from an abstract recursion theoretic point of view any system of “syntactic configurations” whose “formulas” and “proofs” are effectively decidable (by a Turing machine) is a formal system. In a footnote to his 1931 paper added in 1963, Gödel made this point most strongly: “In my opinion the term ‘formal system’ or ‘formalism’ should never be used for anything but this notion. In a lecture at Princeton . . . I suggested certain transfinite generalizations of formalisms; but these are something radically different from formal systems in the proper sense of the term, whose characteristic property is that reasoning in them, in principle, can be completely replaced by mechanical devices.” Thus, formal systems in this sense can in principle be implemented on computers and provide (at least partial) models for a wide variety of mental processes.

See also COMPUTATION; COMPUTATIONAL THEORY OF MIND; LANGUAGE AND THOUGHT; MENTAL MODELS; RULES AND REPRESENTATIONS

Wilfried Sieg

References

Frege, G. (1879). Begriffsschrift, eine der arithmetischen nachgebildete Formelsprache des reinen Denkens. Halle: Nebert.
Frege, G. (1893). Grundgesetze der Arithmetik, begriffsschriftlich abgeleitet, vol. 1. Jena: Pohle.
Gentzen, G. (1934–35). Untersuchungen über das logische Schliessen 1, 2. Math. Zeitschrift 39: 176–210, 405–431. Translated in Gentzen (1969).
Gentzen, G. (1936). Die Widerspruchsfreiheit der reinen Zahlentheorie. Mathematische Annalen 112: 493–565. Translated in Gentzen (1969).
Gentzen, G. (1969). The Collected Papers of Gerhard Gentzen. M. E. Szabo, Ed. Amsterdam: North-Holland.
Gödel, K. (1930). Die Vollständigkeit der Axiome des logischen Funktionenkalküls. Monatshefte für Mathematik und Physik 37: 349–360. Translated in Collected Works 1.
Gödel, K. (1931). Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme 1. Monatshefte für Mathematik und Physik 38: 173–198. Translated in Collected Works 1.
Gödel, K. (1986). Collected Works 1. Oxford: Oxford University Press.
Hilbert, D. (1899). Grundlagen der Geometrie. Leipzig: Teubner.
Hilbert, D. (1918). Die Prinzipien der Mathematik. Lectures given during the winter term 1917–18. Written by Paul Bernays. Mathematical Institute, University of Göttingen.
Hilbert, D., and W. Ackermann. (1928). Grundzüge der theoretischen Logik. Berlin: Springer.
Post, E. (1921). Introduction to a general theory of elementary propositions. Amer. J. Math. 43: 163–185.
Prawitz, D. (1966). Natural Deduction: A Proof-Theoretical Study. Stockholm: Almqvist and Wiksell.
Rips, L. (1994). The Psychology of Proof: Deductive Reasoning in Human Thinking. Cambridge, MA: MIT Press.

Further Readings

Barwise, J., Ed. (1977). Handbook of Mathematical Logic. Amsterdam: North-Holland.
Börger, E., E. Graedel, and Y. Gurevich. (1997). The Classical Decision Problem. New York: Springer.
Kleene, S. C. (1952). Introduction to Metamathematics. Groningen: Wolters-Noordhoff Publishing.
Rogers, H. Jr. (1967). Theory of Recursive Functions and Effective Computability. New York: McGraw Hill.
Shoenfield, J. R. (1967). Mathematical Logic. Reading, MA: Addison-Wesley.
van Dalen, D. (1989). Logic and Structure. New York: Springer.

Formal Theories

See ACQUISITION, FORMAL THEORIES OF; FORMAL GRAMMARS; FORMAL SYSTEMS, PROPERTIES OF; LEARNING SYSTEMS

Frame-Based Systems

Frame-based systems are knowledge representation systems that use frames, a notion originally introduced by Marvin Minsky, as their primary means to represent domain knowledge. A frame is a structure for representing a CONCEPT or situation such as “living room” or “being in a living room.” Attached to a frame are several kinds of information, for instance, definitional and descriptive information and how to use the frame. Based on the original proposal, several knowledge representation systems have been built and the theory of frames has evolved. Important descendants of frame-based representation formalisms are description logics that capture the declarative part of frames using a logic-based semantics. Most of these logics are decidable fragments of first order logic and are very closely related to other formalisms such as modal logics and feature logics.

In the seminal paper “A framework for representing knowledge,” Minsky (1975) proposed a KNOWLEDGE REPRESENTATION scheme that was completely different from formalisms used in those days, namely, rule-based and logic-based formalisms. Minsky proposed organizing knowledge into chunks called frames. These frames are supposed to capture the essence of concepts or stereotypical situations, for example being in a living room or going out for dinner, by clustering all relevant information for these situations together. This includes information about how to use the frame, information about expectations (which may turn out to be wrong), information about what to do if expectations are not confirmed, and so on. This means, in particular, that a great deal of procedurally expressed knowledge should be part of the frames. Collections of such frames are to be organized in frame systems in which the frames are interconnected. The processes working on such frame systems are supposed to match a frame to a specific situation, to use default values to fill unspecified aspects, and so on. If this brief summary sounds vague, it correctly reproduces the paper’s general tone. Despite the fact that this paper was a first approach to the idea of what frames could be, Minsky explicitly argued in favor of staying flexible and nonformal.

Details that had been left out in Minsky’s 1975 paper were later filled in by knowledge representation systems that were inspired by Minsky’s ideas, the most prominent being FRL and KRL (Bobrow and Winograd 1977). KRL was one of the most ambitious projects in this direction. It addressed almost every representational problem discussed in the literature. The net result is a very complex language with a very rich repertoire of representational primitives and almost unlimited flexibility.

Features that are common to FRL, KRL, and later frame-based systems (Fikes and Kehler 1985) are: (1) frames are organized in (tangled) hierarchies; (2) frames are composed out of slots (attributes) for which fillers (scalar values, references to other frames or procedures) have to be specified or computed; and (3) properties (fillers, restriction on fillers, etc.) are inherited from superframes to subframes in the hierarchy according to some inheritance strategy. These organizational principles turned out to be very useful, and, indeed, the now popular object-oriented languages have adopted these organizational principles.

From a formal point of view, it was unsatisfying that the semantics of frames and of inheritance was specified only operationally. So, subsequent research in the area of knowledge representation addressed these problems. In the area of defeasible inheritance, principles based on nonmonotonic logics together with preferences derived from the topology of the inheritance network were applied in order to derive a formal semantics (Touretzky 1986; Selman and Levesque 1993). The task of assigning declarative semantics to frames was addressed by applying methods based on first order LOGIC.

Hayes (1980) argued that “most of frames is just a new syntax for parts of first order logic.” Although this means that frames do not offer anything new in expressiveness, there are two important points in which frame-based systems may have an advantage over systems using first-order logic. Firstly, they offer a concise way to express knowledge in an object-oriented way (Fikes and Kehler 1985). Secondly, by using only a fragment of first order logic, frame-based systems may offer more efficient means for reasoning.

These two points are addressed by the so-called description logics (also called terminological logics, concept languages, and attributive description languages; Nebel and Smolka 1991), which formalize the declarative part of frame-based systems and grew out of the development of the frame-based system KL-ONE (Brachman and Schmolze 1985). In description logics, it is possible to build up a concept hierarchy out of atomic concepts (interpreted as unary predicates and denoted by capitalized words) and attributes, usually called roles (interpreted as binary predicates and denoted by lowercase words). The intended meaning of atomic concepts can be specified by providing concept descriptions made up of other concepts and role restrictions, as in the following informal example:

Woman = Person and Female
Parent = Person with some child
Grandmother = Woman with some child who is a Parent

One of the most important reasoning tasks in this context is the determination of subsumption between two concepts, that is, determining whether all instances of one concept are necessarily instances of the other concept taking into account the definitions. For example, “Grandmother” is subsumed by “Parent” because everything that is a “Grandmother” is, by definition, also a “Parent.” Similar to the [...]

[...] heit et al. 1994). Second, there is a very strong connection to modal logics and to dynamic logics (Schild 1991). In fact, the “standard” description logic ALC (Schmidt-Schauss and Smolka 1991) is simply a notational variant of the MODAL LOGIC K with multiple agents. Third, feature logics, which are the constraint logic part of so-called unification grammars such as HPSG, are very similar to description logics. The only difference is that attributes in feature logics are single-valued, whereas they are multivalued in description logics. Although this seems to be a minor difference, it can make the difference between decidable and undecidable reasoning problems (Nebel and Smolka 1991).

See also KNOWLEDGE-BASED SYSTEMS; KNOWLEDGE ACQUISITION; NONMONOTONIC LOGICS; SCHEMATA

Bernhard Nebel

References

Bobrow, D. G., and T. Winograd. (1977). An overview of KRL-0, a knowledge representation language. Cognitive Science 1(1): 3–46.
Brachman, R. J. (1992). Reducing CLASSIC to practice: knowledge representation theory meets reality. In B. Nebel, W. Swartout, and C. Rich, Eds., Principles of Knowledge Representation and Reasoning: Proceedings of the 3rd International Conference (KR-92). Cambridge, MA, pp. 247–258.
Brachman, R. J., and H. J. Levesque. (1984). The tractability of subsumption in frame-based description languages.
In Proceed- subsumption task is the instance-checking task, where one ings of the 4th National Conference of the American Association wants to know if a given object is an instance of the speci- for Artificial Intelligence (AAAI-84). Austin, TX, pp. 34–37. fied concept. Brachman, R. J., and J. G. Schmolze. (1985). An overview of the KL-ONE knowledge representation system. Cognitive Science Starting with a paper by Brachman and Levesque (1984), 9(2): 171–216. the COMPUTATIONAL COMPLEXITY of subsumption determi- Buchheit, M., M. A. Jeusfeld, W. Nutt, and M. Staudt. (1994). Sub- nation for different variants of description logics has exten- sumption between queries to object-oriented databases. In K. sively been analyzed and a family of algorithms for solving Jeffery, M. Jarke, and J. Bubenko, Eds., Advances in Database the subsumption problem has been developed (Schmidt- Technology—EDBT-94. 4th International Conference on Schauss and Smolka 1991). Extending Database Technology. Cambridge, pp. 15–22. Although in most cases subsumption is decidable, that is, Donini, F. M., M. Lenzerini, D. Nardi, and W. Nutt. (1991). Tracta- easier than inference in full first order logic, there are cases ble concept languages. In Proceedings of the 12th International when subsumption becomes undecidable (Schmidt-Schauss Joint Conference on Artificial Intelligence (IJCAI-91). Sydney, 1989). Aiming for polynomial-time decidability, however, pp. 458–465. Fikes, R. E., and T. Kehler. (1985). The role of frame-based repre- leads to very restricted description logics, as shown by sentation in knowledge representation and reasoning. Commu- Donini et al. (1991). Further, if definitions of concepts as in nications of the ACM 28(9): 904–920. the example above are part of the language, even the weak- Hayes, P. J. (1980). The logic of frames. In D. Metzing, Ed., Frame est possible description logic has an NP-hard subsumption Conceptions and Text Understanding. Berlin: deGruyter, pp. 
problem (Nebel 1990). Although these results seem to sug- 46–61. gest that description logics are not usable because the com- Heinsohn, J., D. Kudenko, B. Nebel, and H.-J. Profitlich. (1994). putational complexity of reasoning is too high, experience An empirical analysis of terminological representation systems. with implemented systems shows that moderately expres- Artificial Intelligence 68(2): 367–397. sive description logics are computationally feasible (Hein- Minsky, M. (1975). A framework for representing knowledge. In P. sohn et al. 1994). In fact, current frame-based systems are Winston, Ed., The Psychology of Computer Vision. New York: McGraw-Hill, pp. 211–277. efficient enough to support large configuration systems that Nebel, B. (1990). Terminological reasoning is inherently intracta- are in everyday use at AT&T (Brachman 1992). ble. Artificial Intelligence 43: 235–249. In the course of analyzing the logical and computational Nebel, B., and G. Smolka. (1991). Attributive description formal- properties of frame-based systems, it turned out that descrip- isms . . . and the rest of the world. In O. Herzog and C.-R. Roll- tion logics are very similar to other formalisms used in com- inger, Eds., Text Understanding in LILOG. Berlin: Springer, puter science and COMPUTATIONAL LINGUISTICS (Nebel and pp. 439–452. Smolka 1991). First of all, the declarative part of object- Schild, K. (1991). A correspondence theory for terminological log- oriented database languages bears a strong resemblance to ics: preliminary report. In Proceedings of the 12th International description logics, and it is possible to apply techniques and Joint Conference on Artificial Intelligence (IJCAI-91). Sydney, methods developed for description logics in this area (Buch- pp. 466–471. 326 Frame Problem enforce? Some favor (2) because the delay occurs before the Schmidt-Schauss, M. (1989). Subsumption in KL-ONE is unde- cidable. In R. Brachman, H. J. Levesque, and R. 
Reiter, Eds., shooting (e.g., Shoham 1988). Others favor (2) because Principles of Knowledge Representation and Reasoning: Pro- there is no represented reason to believe it violated, although ceedings of the 1st International Conference (KR-89). Toronto, the shooting provides some reason for believing (1) violated pp. 421–431. (e.g., Morgenstern 1996; cf. philosophical discussions of Schmidt-Schauss, M., and G. Smolka. (1991). Attributive concept inference to the best EXPLANATION, e.g., Thagard 1988). descriptions with complements. Artificial Intelligence 48: 1– Work continues in this vein, seeking to formalize the rele- 26. vant temporal and rational notions, and to insure that the Selman, B., and H. J. Levesque. (1993). The complexity of path- strategies apply more broadly than the situation calculus. based defeasible inheritance. Artificial Intelligence 62: 303– Another approach to the frame problem seeks to remain 339. within the strictures of classical (monotonic) logic (Reiter Touretzky, D. S. (1986). The Mathematics of Inheritance Systems. Los Altos, CA: Morgan Kaufmann. 1991). In most circumstances, it avoids the use of huge numbers of axioms about nonchanges, but at the cost of using hugely and implausibly bold axioms about non- Frame Problem changes. For example, it is assumed that all the possible causes of a certain kind of effect are known, or that all the From its humble origins labeling a technical annoyance for actual events or actions operating on a given situation are a particular AI formalism, the term frame problem has known. grown to cover issues confronting broader research pro- Some philosophers of mind maintain that the original grams in AI. In philosophy, the term has come to encompass frame problem portends deeper problems for traditional AI, allegedly fundamental, but merely superficially related, or at least for cognitive science more broadly. 
(Unless other- objections to computational models of mind in AI and wise mentioned, the relevant papers of the authors cited may beyond. be found in Pylyshyn 1987.) Daniel Dennett wonders how The original frame problem appears within the SITUA- to ignore information obviously irrelevant to one’s goals, as TION CALCULUS for representing a changing world. In such one ignores many obvious nonchanges. John Haugeland systems there are “axioms” about changes conditional on wonders how to keep track of salient side effects without prior occurrences—that pressing a switch changes the illu- constantly checking for them. This includes the “ramifica- mination of a lamp, that selling the lamp changes who owns tion” and “qualification” problems of AI; see Morgenstern it, and so on. Unfortunately, because inferences are to be (1996) for a survey. Jerry Fodor wonders how to avoid the made solely by deduction, axioms are needed for purported use of “kooky” concepts that render intuitive nonchanges as nonchanges—that pressing the switch does not change the changes —for instance, “fridgeon” which applies to physi- owner, that selling the lamp does not change its illumina- cal particles if and only if Fodor’s fridge is on, so that Fodor tion, and so on. Without such “frame axioms,” a system is can “change” the entire universe simply by unplugging his unable strictly to deduce that any states persist. The result- fridge. AI researchers, including Drew McDermott and Pat ing problem is to do without huge numbers of frame axioms Hayes, protest that these further issues are unconnected to potentially relating each representable occurrence to each the original frame problem. representable nonchange. Nevertheless, the philosophers’ challenges must be met A common response is to handle nonchanges implicitly somehow if human cognition is to be understood in compu- by allowing the system to assume by default that a state per- tational terms (see CAUSAL REASONING). 
Exotic suggestions sists, unless there is an axiom specifying that it is changed involve mental IMAGERY as opposed to a LANGUAGE OF by an occurrence, given surrounding conditions. Because THOUGHT (Haugeland, cf. Janlert in AI), nonrepresenta- such assumptions are not deducible from the axioms of tional practical skills (Dreyfus and Dreyfus), and emotion- change (even given surrounding conditions), and because induced temporary modularity (de Sousa 1987, chap. 7). the licensed conclusions are not cumulative as evidence is The authors of the Yale Shooting Problem argue, as well, added, the frame problem helps motivate the development against the hegemony of logical deduction — whether clas- of special NONMONOTONIC LOGICS intended to minimize the sical or nonmonotonic— in AI simulations of commonsense assumptions that must be retracted given further evidence. reasoning. More conservative proposed solutions appeal to This is related to discussions of defeasibility and ceteris HEURISTIC SEARCH techniques and ideas about MEMORY paribus reasoning in epistemology and philosophy of sci- long familiar in AI and cognitive psychology (Lormand, in ence (e.g., Harman 1986). Ford and Pylyshyn 1996; Morgenstern 1996 provides an A related challenge is to determine which assumptions to especially keen survey of AI proposals). retract when necessary, as in the “Yale Shooting Problem” See also EMOTIONS; FRAME-BASED SYSTEMS; KNOWL- (Hanks and McDermott 1986). Let a system assume by EDGE REPRESENTATION; PROPOSITIONAL ATTITUDES; SCHE- default (1) that live creatures remain alive, and (2) that MATA; SITUATEDNESS/EMBEDDEDNESS loaded guns remain loaded. Confront it with this informa- —Eric Lormand tion: Fred is alive, then a gun is loaded, then, after a delay, the gun is fired at Fred. If assumption (2) is in force through References the delay, Fred probably violates (1). But equally, if assump- tion (1) is in force after the shooting, the gun probably vio- de Sousa, R. (1987). 
The Rationality of Emotion. Cambridge, MA: lates (2). Why is (2) the more natural assumption to MIT Press. Frege, Gottlob 327 important of these is the distinction between sense and ref- Ford, K., and Z. Pylyshyn, Eds. (1996). The Robot’s Dilemma Revisited. Norwood, NJ: Ablex. erence, which occurs in his 1892 paper, “On Sense and Ref- Hanks, S., and D. McDermott. (1986). Default reasoning, non- erence,” the defining article of the analytic tradition in monotonic logic, and the frame problem. Proceedings of the philosophy (see SENSE AND REFERENCE for an extended dis- American Association for Artificial Intelligence 328-333. cussion). In that paper, he also gave the first modern infor- Harman, G. (1986). Change in View. Cambridge, MA: MIT Press. mal semantical analysis of PROPOSITIONAL ATTITUDES and Morgenstern, L. (1996). The problem with solutions to the frame introduced the notion of presupposition into the literature problem. In K. Ford and Z. Pylyshyn, Eds., (1996), pp. 99–133. (though he introduced the negation test for presupposition, Pylyshyn, Z., Ed. (1987). The Robot’s Dilemma. Norwood, NJ: it is clear that he was unaware of the Projection Problem; Ablex. see PRESUPPOSITION). His 1918 essay “Thoughts” contains a Reiter, R. (1991). The frame problem in the situation calculus: a sophisticated discussion of indexicality (cf. INDEXICALS simple solution (sometimes) and a completeness result for goal regression. In V. Lifschitz, Ed., Artificial Intelligence and AND DEMONSTRATIVES). Though some of the students of Mathematical Theory of Computation: Papers in Honor of John BRENTANO also made distinctions like the one between McCarthy. Boston: Academic Press, pp. 359–380. sense and reference, and even had interesting discussions of Shoham, Y. (1988). Reasoning about Change. Cambridge, MA: indexicals and demonstratives (e.g., Husserl 1903: Book MIT Press. VI), none of them achieved the conceptual clarity of Frege Thagard, P. (1988). 
Computational Philosophy of Science. Cam- on these topics. Furthermore, Frege’s conception of bridge, MA: MIT Press. thoughts as structured in a way similar to sentences is a pre- cursor to one aspect of Fodor’s (1975) LANGUAGE OF Frege, Gottlob THOUGHT, though Frege’s conception of the ontology of thoughts as abstract, mind independent entities, much like Gottlob Frege (1848–1925) was a professional mathemati- numbers and sets, is incompatible with a Fodorian construal cian who, together with Bertrand Russell, is considered to of them, and indeed with much of what is said on the matter be one of the two grandfathers of modern analytic philoso- in the philosophy of mind today. phy. However, the importance of his work extends far Though Frege’s ideas and discoveries have clearly had a beyond the field of philosophy. Frege first introduced the profound effect on subsequent research in philosophy, com- concepts of modern quantificational LOGIC (1879). Indeed, puter science, and linguistics, his life’s project, logicism, is with apologies to C. S. Pierce, it is no exaggeration to call usually considered to be a failure (for an influential defense modern quantificational logic Frege’s discovery. Frege was of part of Frege’s version of logicism, see Wright 1983). also the first to present a formal system in the modern sense Logicism is the doctrine that arithmetic is reducible to logic. in which it was possible to carry out complex mathematical Frege announced this project in his Foundations of Arith- investigations (cf. FORMAL SYSTEMS, PROPERTIES OF). metic, which contained the most sophisticated discussion of In addition to contemporary, second-order quantifica- the concept of number in the history of philosophy, together tional logic, a host of logical and semantical techniques with an informal description of how the logicist program occur explicitly for the first time in his work. In Part III of could be carried out. 
In his Magnum Opus, Grundgesetze his 1879 work, he introduced the notion of the ancestral of der Arithmetik (Basic Laws of Arithmetic) (1893, 1903), a relation, which yields a logical characterization of one Frege tried to carry out the logicist program in full detail, important notion of mathematical sequence; for example, attempting to derive the basic laws of arithmetic, and indeed the ancestral can be used to define the notion of natural analysis, within a formal system whose axioms he believed number. Indeed, the ancestral provides a general technique expressed laws of logic. Unfortunately, the theory was for transforming an inductive definition of a concept into inconsistent. This discovery, by Bertrand Russell, devas- an explicit one (Dedekind was the codiscoverer of this tated Frege, and essentially ended his career as a mathemati- notion). Frege’s later work was also of logico-semantical cian. Recent research has shown, however, that there is a significance. The “smooth-breathing” operator of his great deal of interest that is salvageable from his mathemat- Grund-gesetze der Arithmetik, a variable binding device for ical work (Wright 1983; and the essays in Demopoulos the formation of complex names for extensions of func- 1995). tional expressions, is the inspiration for lambda abstrac- Frege is not merely of historical interest for the student tion. The brilliant semantic discussion in Part I, though of cognitive science. Rather than being interested in how we hindered by the lack of an analysis of the consequence rela- in fact reason, he is interested in how we ought to reason, tion, nonetheless anticipated many future developments in and rather than being interested in the biological component logic and semantics. For instance, Frege’s hierarchy of of mentality, he is interested in the abstract structure of functions (see Dummett 1973: chap. 3) could be taken as thought. 
Studying his works provides a useful curative for the catalyst for CATEGORIAL GRAMMAR. Even the influen- those who need to be reminded about the public and norma- tial technique of treating two-place functional expressions tive aspects of the notions that concern cognitive science. as denoting functions from objects to one-place functions —Jason C. Stanley described in Schoenfinkel (1924) is anticipated by Frege in his discussion of the extensions of two-place functional References expressions (1893: §36). Other ideas of Frege have also had a tremendous impact Demopoulos, W., Ed. (1995). Frege’s Philosophy of Mathematics. on research in the cognitive sciences. Perhaps the most Cambridge, MA: Harvard University Press. 328 Freud, Sigmund between the biological and the “human” sciences. Freud’s Dummett, M. (1973). Frege: Philosophy of Language. London: Duckworth. early training in neurophysiology led him to try to ground Fodor, J. (1975). The Language of Thought. New York: Thomas psychological theorizing in the known structures of the Crowell. brain. His incomplete manuscript, “Project for a Scientific Frege, G. (1879). Begriffsschrift: a formula language, modeled Psychology” attempted to relate specific psychological upon that of arithmetic, for pure thought. In Jean Van Heihe- functions, such as learning and memory, to recently dis- noort, Ed., 1967, From Frege to Goedel: A Source Book in covered properties of the neurons. In this respect, his Mathematical Logic, 1879–1931. Cambridge: Harvard Univer- methodological principles exactly paralleled the current sity Press, pp. 5–82. view in cognitive science that psychological theorizing Frege, G. (1884/1980). The Foundations of Arithmetic. Evanston, must be consistent with and informed by the most recent IL: Northwestern University Press. knowledge in neuroscience. Freud was also an avid sup- Frege, G. (1891). On Sense and Reference. In Peter, Geach, and Max Black, Eds., (1993). 
Translations from the Philosophical porter of DARWIN and was explicit in stating that his theory Writings of G Frege. Oxford: Blackwell, pp. 56–78 (there trans- of sexual and self-preservative instincts was firmly rooted lated as “On Sense and Meaning”). in (evolutionary) biology. Prototypical of his synthetic Frege, G. (1893, 1903/1966). Grundgesetze der Arithmetik. approach to knowledge, he made a bold conjecture about Hildesheim: Georg Olms Verlag. an important relation between the findings of neurophysi- Frege, G. (1918). Thoughts. In Brian McGuiness, Ed., 1984, Col- ology and evolutionary biology. If, as nearly all psycholo- lected Papers. Oxford: Blackwell, pp. 351–372. gists agreed, the mind functioned as a reflex, then it Frege, G. (1984). Collected Papers. Brian McGuiness, Ed. Oxford: required constant stimulation, and if, as Darwin argued, Blackwell. the sexual instinct is one of the two most important forces Husserl, E. (1903/1980). Logische Untersuchungen. Tuebingen: governing animal life, then these findings could be Max Niemeyer. Schoenfinkel, M. (1924). On the building blocks of mathematical brought together under a more comprehensive theory that logic. In J. Van Heihenoort, Ed., 1967, From Frege to Goedel: A sexual instincts (libido) drove the nervous system. Source Book in Mathematical Logic, 1879–1931. Cambridge, Although Freud, his disciples, and his critics often present MA: Harvard University Press, pp. 357–366. libido theory as an extrapolation from the sexual difficul- Wright, C. (1983). Frege’s Conception of Numbers as Objects. ties of his patients, its real strength and appeal came from Aberdeen: Aberdeen University Press. its plausible biological premises. Thanks to Darwin’s influence, sexuality also played an Freud, Sigmund important role in the social sciences of the late nineteenth century. 
A methodological imperative of evolutionary A prolific and gifted writer, whose broad learning extended anthropology and sociology was to connect sophisticated from neurophysiology and EVOLUTION to the literature of human achievements to “primitive” conditions shared with six languages, Sigmund Freud (1826–1939) was one of the animals, and sexual behavior was the most obvious point most influential scientists of the late nineteenth and early of connection. Given these trends in social science, Freud twentieth centuries. He was also one of the most controver- was able to make “upward” connections between psycho- sial scientists of any time, so much so that both his critics analysis and the social sciences, as well as “downward” and admirers have occasionally succumbed to the tempta- connections to neurophysiology and biology. In his efforts tion to deny that he was a scientist at all. to find links among all the “mental” sciences, Freud’s Freud’s positive and negative reputations flow from the methodological approach again bears a striking resem- same source—the extraordinary scope of his theories. blance to the interdisciplinary emphasis of current cogni- Although the notions of unconscious ideas and processes tive science. This approach was also the basis of the did not originate with Freud, having philosophical ante- tremendous appeal of psychoanalysis: he believed that he cedents in Gottfried Leibniz’s (1646–1716) theory of had a theory that could provide biologically grounded petites perceptions and psychiatric antecedents in the work explanations, in terms of sexual and self-preservative of, inter alia, Pierre Janet (1859–1947), Freud made them instincts and the various mental processes that operated on the centerpiece of his complex theory of the mind. Unlike them, for everything from psychotic symptoms, dreams, other psychiatrists, Freud took unconscious ideas and pro- and jokes to cultural practices such as art and religion. 
cesses to be critical in explaining the behavior of all peo- Freud’s theories of CONSCIOUSNESS and the EMOTIONS ple in all circumstances and not merely the outré actions of were also the product of an interdisciplinary synthesis psychotics. Unlike Leibniz and his followers, Freud pre- between psychiatry and philosophy. Individuals whose sented unconscious ideas not merely as a theoretical behavior was driven by natural, but unconscious, emo- necessity, but as the key to human action. Through his tional forces needed treatment in order to gain control of spirited defense of the necessity and importance of uncon- their lives by bringing the forces that govern them to con- scious ideas and processes, he gave these concepts theoret- sciousness. This was possible, Freud believed, because ical respectability, almost in spite of their associations affective states were also cognitive and so could be made with him. conscious through their ideational components. Although The explanatory scope of unconscious ideas and pro- the Project offered some speculations about the qualita- cesses was enormous for Freud, because he saw psycho- tive character of consciousness, Freud’s later approach analysis (see PSYCHOANALYSIS, CONTEMPORARY VIEWS OF was functionalist. Conscious ideas differed from the and PSYCHOANALYSIS, HISTORY OF) as bridging the gap unconscious, because they could be expressed verbally, Functional Decomposition 329 because they were subject to rational constraints such as acteristic domain of application. It assumes that there are a consistency, and because they could interact with sensory variety of functionally independent units, with intrinsically evidence. determined functions, that are minimally interactive. Func- Although Freud regarded the consilience of psycho- tional decomposition plays important roles in engineering, analysis with the biological and social sciences as the physiology, biology, and in artificial intelligence. 
Functional strongest argument in its favor, important changes in both morphologists, for example, distinguish the causal or func- the biological and social sciences undermined the plausi- tional roles of structures within organisms, the extent to bility of his basic assumptions about how mental processes which one structure may be altered without changing overall worked. Rather than alter the scientific foundations of psy- function, and the effects of these structures for evolutionary choanalysis, he continued to try to increase its scope and change. Within cognitive science, the assumption is that influence, leading to charges of disingenuousness and even there are a variety of mechanisms underlying our mental life, pseudoscience. Despite Freud’s tarnished reputation, many which are domain specific and functionally independent. The of his central substantive and methodological assumptions classical distinction in DESCARTES between understanding, about studying the mind have reemerged with the rise of imagination and will is a functional decomposition, which cognitive science, in particular, the assumption (from his postulates at least three independent faculties responsible for teacher BRENTANO) that mental states are intentional (see specific mental functions; likewise, the distinction drawn by INTENTIONALITY) and must be understood in terms of their KANT between sensation, judgment, understanding, and rea- contents, but that they are likewise physical and must be son offers a partitioning of our mental faculties based on related to neuroscience, the assumption (which he their cognitive functions, and is equally one that postulates a described as an extension of KANT) that most mental pro- variety of independent faculties responsible for specific men- cesses are unconscious, the view that cognition and emo- tal operations. 
tion are not separate faculties, but deeply intertwined aspects of mentality, and the basic methodological assumption that the biological and “human” sciences must learn from each other, because the ultimate goal is to develop a comprehensive theory of the social, psychological, and physical aspects of mentality. Further, although its emphasis on input-output computation has given cognitive science a synchronic time scale for much of its history, recent work on ARTIFICIAL LIFE and EVOLUTIONARY COMPUTATION reintroduces the sort of diachronic or genetic approach that Freud thought was essential in understanding the complexities of a mentality that was produced via individual development and the evolution of the species.

See also FOLK PSYCHOLOGY; FUNCTIONALISM; UNITY OF SCIENCE

—Patricia Kitcher

References

Ellenberger, H. (1970). The Discovery of the Unconscious. New York: Basic Books.
Erdelyi, M. H. (1985). Psychoanalysis: Freud’s Cognitive Psychology. New York: W. H. Freeman.
Freud, S. (1966). Project for a Scientific Psychology. Preliminary Communication (to Studies in Hysteria, with Josef Breuer). Three Essays on Sexuality. The Unconscious. Instincts and their Vicissitudes. The Ego and the Id. All can be found in The Complete Psychological Works of Sigmund Freud. James Strachey, Ed. 24 vols. London: The Hogarth Press.
Kitcher, P. (1992). Freud’s Dream: A Complete Interdisciplinary Science of Mind. Cambridge, MA: Bradford/MIT Press.
Sulloway, F. (1994). Freud: Biologist of the Mind. New York: Basic Books.

Functional Decomposition

Functional decomposition is the analysis of the activity of a system as the product of a set of subordinate functions performed by independent subsystems, each with its own char-

In more recent psychological work, the distinction between sensory stores, short-term MEMORY, and long-term memory elaborated by Richard Atkinson and Richard Shiffrin (1968) is a functional decomposition of memory, based on their domains of application (for a classic source, see Neisser 1967).

Functional decomposition typically assumes a hierarchical organization, though a hierarchical organization is consistent with different modes of organization. Thus, the mind is conceived as having a modular organization (cf. MODULARITY OF MIND), with a variety of faculties, each with independent, intrinsically determined functions. Each of those modules in turn may have a modular organization, with a variety of independent, intrinsically determined functions. Sensory systems are relatively independent of one another, and independent of memory, language, and cognition. Language in turn may be taken to consist of a variety of relatively independent subsystems (cf. MODULARITY AND LANGUAGE), including modules responsible for PHONOLOGY, PHONETICS, SEMANTICS, and SYNTAX. The extent to which a hierarchical organization, or functional independence, is realistic can be decided only empirically, by seeing the extent to which we can approximate or explain system behavior by assuming it.

Functional decomposition is easily illustrated by appealing to the understanding of language. In the early nineteenth century, Franz Joseph Gall (1758–1828) defended the view that the mind consists of a variety of “organs” or “centers,” each subserving specific intellectual or moral (that is, practical) functions, with dedicated locations in the cerebral hemispheres. These intellectual and moral functions were sharply distinguished at a higher level from the “vital” functions and affections that Gall located in the “lower” portions of the brain, and the specific functions in turn were distinguished from one another. There were differences between Gall and his fellow phrenologists concerning the number and characterization of the specific faculties, but within the intellectual faculties, phrenologists typically distinguished broadly between the external senses, various “perceptive” faculties (including faculties for perceiving weight, color, tune, and language, among others), and the “reflective” faculties constitutive of reason. The primary faculties were the species of intellection, and they were assumed to belong to specific organs in the brain. Gall held an extreme view, assuming that the basic functions were strictly limited in application, invariable in their operation, and wholly independent of the activities of other faculties. That is, he assumed that the mind was simply an aggregate of its independent functions and that there was no overlap or interaction between these organs. Because he recognized no interaction between the faculties, complex abilities became simply the aggregates of simple abilities. Gall assumed, in other words, that the mind was both hierarchical and aggregative, or simply decomposable.

Paul Pierre BROCA (1824–1880) was also a defender of “organology,” retaining both the discrete localizations of the phrenologists and the view that the size of organs was responsible for differing mental abilities. Following Jean Baptiste Bouillaud (1796–1881), Broca emphasized the importance of dysfunction in determining functional organization. By August 1861, Broca had described in some detail the anatomical changes accompanying a disorder of speech that he called “aphemia,” and that we would describe as an APHASIA. The patient, known as “Tan,” lost the ability to speak by the time he was thirty, and over the years his case degenerated. Broca relied on interviews with the hospital staff to discover that Tan’s initial “loss of articulate language” was due to a focal lesion in the frontal lobe. Broca’s conclusion was that there were a variety of “organs” corresponding to discrete mental functions.

Karl Wernicke (1848–1905) subsequently elaborated the basic model, reframing it in terms of an associationistic psychology rather than a faculty psychology and distinguishing sensory and motor aphasias. Wernicke concluded that there was a series of discrete loci mediating the comprehension and production of speech. On the basis of clinical observations, Wernicke concluded there were three distinctive “centers” associated with language use: a center for the acoustic representations of speech, a center for motor representations of speech, and a center for concepts typically mediating between the two. Disruptions of the various associations between these centers resulted in the various aphasias. The resulting functional decomposition for language use thus had at least three components, and a linear organization: the output of one “organ” serves as the input for the next, though the function performed or realized by each module is intrinsically determined. This basic model has since been elaborated by a number of clinical neurologists, including Norman GESCHWIND. The organization is no longer aggregative, but sequential, with relatively independent functional units. This is near decomposability.

A commitment to functional decomposition has continued in a variety of forms in more recent work in cognitive science, including the new “organology” of Noam Chomsky (1980), the “modularity” defended by Jerry Fodor (1983), and the FUNCTIONALISM of William Lycan (1987). Steven Pinker (1994), for example, argues that language is an ability that is relatively independent of cognitive abilities in general. The clear implication of such independence is that it should be possible to disrupt linguistic abilities without impairing cognition, and vice versa. The studies of aphasia exhibit such patterns. A commitment to some form of functional decomposition or modularity might seem inevitable when dealing with a phenomenon as complex as mental life. Herbert Simon (1969) has emphasized the importance of simple decomposability and near decomposability, as well as of hierarchical organization in complex systems. In explaining the behavior of a complex system, it is often possible to establish independent functional characterizations for components, ignoring both the contributions of other components at the same level and the influences operative at higher or lower levels. This is, however, not always true, and the cases are often more complex than they might initially appear.

Numerous examples of functional decomposition are available from recent work in cognitive science. It is common, as noted, to analyze memory into distinctive subsystems. Commonly, it is assumed that there are at least two stages, presumably with discrete physiological mechanisms: the first process is short-lived, lasting from minutes to hours, and the second is of indefinite duration. Conventionally, this distinction between short-term and long-term memory is assayed by recall tests. This is by no means the only decomposition of memory, and is anything but unproblematic; more specifically, the experimental evidence leaves it unclear whether the distinction between short- and long-term memory is a distinction between modules, or modes of processing, and whether short-term memory is a unitary entity. Experimentation on memory typically involves some measure of retention based on recall or recognition of some predetermined material, and more recently has used dual tasks in parallel. This research has led to a variety of ways of understanding the organization of memory, including distinctions between working memory, semantic memory, and declarative memory. There is currently no clear consensus concerning the most appropriate theory, and no model that naturally accommodates the entire range of the phenomena. In a similar way, linguistic competence is generally understood as the product of a set of distinct subsystems. Wernicke’s distinction between comprehension and production has been replaced with distinct processes involved in language use, typically distinguishing between semantic and syntactic functions. Again, this decomposition is not unproblematic, and there is some evidence suggesting that such decompositions do not yield functionally independent subsystems.

See also COGNITIVE ARCHITECTURE; DOMAIN SPECIFICITY; HEMISPHERIC SPECIALIZATION; IMPLICIT VS. EXPLICIT MEMORY; LANGUAGE, NEURAL BASIS OF; MEMORY, HUMAN NEUROPSYCHOLOGY

—Robert C. Richardson

References

Atkinson, R., and R. Shiffrin. (1968). Human memory: A proposed system and its control processes. In K. W. Spence and J. T. Spence, Eds., The Psychology of Learning and Motivation, vol. 2. New York: Academic Press.
Broca, P. (1861a). Perte de la parole. Bulletins de la Société Anthropologie 2: 235–238.
Broca, P. (1861b). Remarques sur le siège de la faculté suivies d’une observation d’aphémie. Bulletin de la Société Anatomique de Paris 6: 343–357.
Chomsky, N. (1980). Rules and Representations. New York: Columbia University Press.
Fodor, J. A. (1983). Modularity of Mind. Cambridge, MA: MIT Press/Bradford Books.
Lycan, W. (1987). Consciousness. Cambridge, MA: MIT Press/Bradford Books.
Neisser, U. (1967). Cognitive Psychology. New York: Appleton-Century-Crofts.
Pinker, S. (1994). The Language Instinct. New York: William Morrow.
Simon, H. A. (1969). The Sciences of the Artificial. Cambridge, MA: MIT Press.
Wernicke, C. (1874). Der Aphasische Symptomcomplex: Eine Psychologische Studie auf Anatomischer Basis. Breslau: Cohen and Weigert.

Further Readings

Amundson, R., and G. V. Lauder. (1994). Function without purpose: the uses of causal role function in evolutionary biology. Biology and Philosophy 9: 443–470.
Bechtel, W., and R. C. Richardson. (1993). Discovering Complexity. Princeton: Princeton University Press.
Bradley, D. C., M. F. Garrett, and E. Zurif. (1980). Syntactic deficits in Broca’s aphasia. In D. Caplan, Ed., Biological Studies of Mental Processes. Cambridge, MA: MIT Press, pp. 269–286.
Cummins, R. (1983). The Nature of Psychological Explanation. Cambridge, MA: MIT Press.
Gregory, R. L. (1961). The brain as an engineering problem. In W. H. Thorpe and O. L. Zangwill, Eds., Current Problems in Animal Behaviour. Cambridge: Cambridge University Press, pp. 307–330.
Gregory, R. L. (1968). Models and the localization of function in the central nervous system. In C. R. Evans and A. D. J. Robertson, Eds., Key Papers: Cybernetics. London: Butterworths.
Gregory, R. L. (1981). Mind in Science. Cambridge: Cambridge University Press.
Johnson-Laird, P. N. (1983). Mental Models. Cambridge, MA: Harvard University Press.
Schacter, D. L. (1993). Memory. In M. I. Posner, Ed., Foundations of Cognitive Science. Cambridge, MA: MIT Press, pp. 683–725.

Functional Explanation

See EXPLANATION; FUNCTIONAL DECOMPOSITION

Functional Grammar

See LEXICAL FUNCTIONAL GRAMMAR

Functional Role Semantics

According to functional role semantics (FRS), the meaning of a MENTAL REPRESENTATION is its role in the cognitive life of the agent, for example in perception, thought and DECISION MAKING. It is an extension of the well-known “use” theory of meaning, according to which the meaning of a word is its use in communication and more generally, in social interaction. FRS supplements external use by including the role of a symbol inside a computer or a brain. The uses appealed to are not just actual, but also counterfactual: not only what effects a thought does have, but also what effects it would have had if stimuli or other states had differed. The view has arisen separately in philosophy (where it is sometimes called “inferential,” or “functional” role semantics) and in cognitive science (where it is sometimes called “procedural semantics”). The view originated with Wittgenstein and Sellars, but the source in contemporary philosophy is a series of papers by Harman (see his 1987) and Field (1977). Other proponents in philosophy have included Block, Horwich, Loar, McGinn, and Peacocke; in cognitive science, they include Woods, Miller, and Johnson-Laird.

FRS is motivated in part by the fact that many terms seem definable only in conjunction with one another, and not in terms outside of the circle they form. For example, in learning the theoretical terms of Newtonian mechanics—force, mass, kinetic energy, momentum, and so on—we do not learn definitions outside the circle. There are no such definitions. We learn the terms by learning how to use them in our thought processes, especially in solving problems. Indeed, FRS explains the fact that modern scientists cannot understand the phlogiston theory without learning elements of an old language that expresses the old concepts. The functional role of, for example, “principle” as used by phlogiston theorists is very different from the functional role of any term or complex of terms of modern physics, and hence we must acquire some approximation of the eighteenth-century functional roles if we want to understand their ideas.

FRS seems to give a plausible account of the meanings of the logical connectives. For example, we could specify the meaning of “and” by noting that certain inferences—for example, the inferences from sentences p and q to p and q, and the inference from p and q to p—have a special status (they are “primitively compelling” in Peacocke’s 1992 terminology). But it may be said that the logical connectives are a poor model for language and for concepts more generally. One of the most important features of our CONCEPTS is that they refer—that is, that they pick out objects in the world. In part for this reason, many theorists prefer a two-factor version of FRS. On this view, meaning consists of an internal, “narrow” aspect of meaning—which is handled by functional roles that are within the body—and an external referential/truth-theoretic aspect of meaning. According to the external factor, “Superman flies” and “Clark Kent flies” are semantically the same because Superman = Clark Kent; the internal factor is what distinguishes them. But the internal factor counts “Water is more greenish than bluish” as semantically the same in my mouth as in the mouth of my twin on TWIN EARTH. In this case, it is the external factor that distinguishes them.

Two-factor theories gain some independent plausibility from the need for them to account for indexical thought and assertions, assertions whose truth depends on facts about when and where they were made and by whom. For example, suppose that you and I say “I am ill.” One aspect of the meaning of “I” is common to us, another aspect is different. What is the same is that our terms are both used according to the rule that they refer to the speaker; what is different is that the speakers are different. White (1982) generalized this distinction to apply to the internal and external factors for all referring expressions, not just INDEXICALS.

In a two-factor account, the functional roles stop at the skin in sense and effector organs; they are “short-arm” roles. But FRS can also be held in a one-factor version in which the functional roles reach out into the world—these roles are “long-arm.” Harman (1987) has advocated a one-factor account that includes in the long-arm roles much of the machinery that a two-factor theorist includes in the referential factor, but without any commitment to a separable narrow aspect of meaning. Harman’s approach and the two-factor theory show that the general approach of FRS is actually compatible with metaphysical accounts of reference such as the causal theory or teleological theories, for they can be taken to be partial specifications of roles.

Actual functional roles involve errors, even dispositions to err. For example, in applying the word dog to candidate dogs, one will make errors, for example in mistaking coyotes for dogs (see Fodor 1987). This problem arises in one form or another for all naturalistic theories of truth and reference, but in the case of FRS it applies to erroneous inferences as well as to erroneous applications of words to things. Among all the conceptual connections of a symbol with other symbols, or (in the case of long-arm roles) with the world, which ones are correct and which ones are errors? One line of reply is to attempt to specify some sort of naturalistic idealization that specifies roles that abstract away from error, in the way that laws of free fall abstract away from friction.

FRS is often viewed as essentially holistic, but the FRS theorist does have the option of regarding some proper subset of the functional roles in which an expression participates as the ones that constitute its meaning. One natural and common view of what distinguishes the meaning-constitutive roles is that they are “analytic.” Proponents of FRS are thus viewed as having to choose between accepting holism and accepting that the distinction between the analytic and the synthetic is scientifically respectable, a claim that has been challenged by Quine. Indeed, Fodor and Lepore (1992) argue that, lacking an analytic/synthetic distinction, FRS is committed to semantic holism, regarding the meaning of any expression as depending on its inferential relations to every other expression in the language. This, they argue, amounts to the denial of a psychologically viable account of meaning.

Proponents of FRS can reply that the view is not committed to regarding what is meaning constitutive as analytic. In terms of our earlier two-factor account, they can, for example, regard the meaning-constitutive roles as those that are primitively compelling, or perhaps as ones that are explanatorily basic: they are the roles that explain other roles (see Horwich 1994). Another approach to accommodating holism with a psychologically viable account of meaning is to substitute close enough similarity of meaning for strict identity of meaning. That may be all we need for making sense of psychological generalizations, interpersonal comparisons, and the processes of reasoning and changing one’s mind.

See also INDIVIDUALISM; NARROW CONTENT; REFERENCE, THEORIES OF; SEMANTICS; SENSE AND REFERENCE

—Ned Block

References

Block, N. (1987). Functional role and truth conditions. Proceedings of the Aristotelian Society LXI: 157–181.
Field, H. (1977). Logic, meaning and conceptual role. Journal of Philosophy 69: 379–408.
Fodor, J., and E. Lepore. (1992). Holism: A Shopper’s Guide. Oxford: Blackwell.
Harman, G. (1987). (Non-solipsistic) Conceptual Role Semantics. In E. Lepore, Ed., New Directions in Semantics. London: Academic Press.
Horwich, P. (1994). What it is like to be a deflationary theory of meaning. In E. Villanueva, Ed., Philosophical Issues 5: Truth and Rationality. Ridgeview, pp. 133–154.
White, S. (1982). Partial character and the language of thought. Pacific Philosophical Quarterly 63: 347–365.

Further Readings

Block, N. (1986). Advertisement for a semantics for psychology. Midwest Studies in Philosophy 10.
Devitt, M. (1996). Coming to Our Senses. New York: Cambridge University Press.
Fodor, J. (1978). Tom Swift and his procedural grandmother. In Representations. Sussex: Harvester.
Fodor, J. (1987). Psychosemantics. Cambridge, MA: MIT Press.
Kripke, S. (1982). Wittgenstein on Rules and Private Language. Oxford: Blackwell.
Johnson-Laird, P. (1977). Procedural Semantics. Cognition 5: 189–214.
Loar, B. (1981). Mind and Meaning. Cambridge: Cambridge University Press.
McGinn, C. (1982). The structure of content. In A. Woodfield, Ed., Thought and Object. Oxford: Clarendon Press.
Miller, G., and P. Johnson-Laird. (1976). Language and Perception. Cambridge, MA: MIT Press.
Peacocke, C. (1992). A Theory of Concepts. Cambridge, MA: MIT Press.
Wittgenstein, L. (1953). The Philosophical Investigations. New York: Macmillan.
Woods, W. (1981). Procedural Semantics as a theory of meaning. In A. Joshi, B. Webber, and I. Sag, Eds., Elements of Discourse Understanding. Cambridge, MA: MIT Press.

Functionalism

Compare neurons and neutrons to planets and pendula. They all cluster into kinds or categories conforming to nomic generalizations and comporting with scientific investigation. However, whereas all neurons and neutrons must be composed of distinctive types of matter structured in ruthlessly precise ways, individual planets and pendula can be made of wildly disparate sorts of differently structured stuff. Neurons and neutrons are examples of physical kinds; planets and pendula exemplify functional kinds. Physical kinds are identified by their material composition, which in turn determines their conformity to the laws of nature. Functional kinds are not identified by their material composition but rather by their activities or tendencies. All planets, no matter the differences in their composition, orbit or tend to. All pendula, no matter the differences in their composition, oscillate or tend to.

What, then, of minds or mental states, kinds that process information and control intelligent activity or behavior? Do they define physical or functional kinds? Naturally occurring minds, at least those most familiar to us, are brains. The human mind is most certainly the human brain; the mammal mind, the mammal brain (Kak 1996). Hence, under the assumption that brains are physical kinds, we might conjecture that all minds must be brains and, therefore, physical kinds. If so, we should study the brain if curious about the mind.

However, perhaps we are misled by our familiar, local and possibly parochial sample of minds. If all the pendula at hand happened to be aluminum, we might, failing to imagine copper ones, mistakenly suppose that pendula must—of necessity—be aluminum. Maybe, then, we should ask whether it is possible that minds occur in structures other than brains. Might there be silicon Martians capable of reading Finnegans Wake and solving differential equations? Such fabled creatures would have minds although, being silicon instead of carbon, they could not have brains. Moving away from fiction and closer toward fact, what should we make of artificially intelligent devices? They can be liberated from human biochemistry while exhibiting talents that appear to demand the kind of cognition that fuels much of what is psychologically distinctive in human activity.

Possibly, then, some minds are not brains. These minds might be made of virtually any sort of material as long as it is so organized as to process information, control behavior and generally support the sort of performances indicative of minds. Minds would then be functional, not physical, kinds. Respectively like planets and pendula, minds might arise naturally or artificially. Their coalescing into a single unified kind would be determined by their proclivity to process information and to control behavior independently of the stuff in which individual minds might happen to reside. Terrestrial evolution may here have settled on brains as the local natural solution to the problem of evolving minds. Still, because differing local pressures and opportunities may induce evolution to offer up alternative solutions to the same problem (say, placental mammals versus marsupials), evolution could develop minds from radically divergent kinds of matter. Should craft follow suit, art might fabricate intelligence in any computational medium. Functionalism, then, is the thesis that minds are functional kinds (Putnam 1960; Armstrong 1968; Lewis 1972; Cummins 1983).

The significance of functionalism for the study of the mind is profound, for it liberates cognitive science from concern with how the mind is embodied or composed. Given functionalism, it may be true that every individual mind is itself a physical structure. Nevertheless, by the lights of functionalism, physical structure is utterly irrelevant to the deep nature of the mind. Consequently, functionalism is foundational to those cognitive sciences that would abstract from details of physical implementation in order to discern principles common to all possible cognizers, thinkers who need not share any physical features immediately relevant to thought. Such a research strategy befriends Artificial Intelligence inasmuch as it attends to algorithms, programs, and computation rather than cortex, ganglia, and neurotransmitters. True, the study of human or mammalian cognition might focus on the physical properties of the brain. But if functionalism is true, the most general features of cognition must be independent of neurology.

According to functionalism, a mind is a physical system or device—with a host of possible internal states—normally situated in an environment itself consisting of an array of possible external states. External states can induce changes in such a device’s internal states, and fluctuations in these internal states can cause subsequent internal changes determining the device’s overt behavior. Standard formulations of functionalism accommodate the mind’s management of information by treating the internal, that is, cognitive, states of the device as its representations, symbols, or signs of its world (Dennett 1978; Fodor 1980; Dretske 1981). Hence, disciplined change in internal state amounts to change in representation or manipulation of information.

Some (Pylyshyn 1985), but not all (Lycan 1981), formulations of functionalism model the mind in terms of a TURING machine (Turing 1950), perhaps in the form of a classical digital computer. A Turing machine possesses a segmented tape with segments corresponding to a cognitive device’s internal states or representations. The machine is designed to read from and write to segments of the tape according to rules that themselves are sensitive to how the tape may be antecedently marked. If the device is an information processor, the marks on the tape can be viewed as semantically disciplined symbols that resonate to the environment and induce the machine appropriately to respond (Haugeland 1981). For functionalism, then, the mind, like a computer, may process information and control behavior simply by implementing a Turing machine.

In allowing that minds are functional kinds, one supposes that mental state types (for example, believing, desiring, willing, hoping, feeling, and sensing) are themselves functionally characterized. Thus belief, as a type of mental state, would be a kind of mental state with characteristic causes and effects (Fodor 1975; Block and Fodor 1972). The idea can be extended to identify or individuate specific beliefs (Harman 1973; Field 1977). The belief, say, that snow is white might be identified by its unique causal position in the mental economy (see FUNCTIONAL ROLE SEMANTICS). On this model, specific mental states are aligned with the unobservable or theoretical states of science generally and identified by their peculiar potential causal relations.

Although functionalism has been the dominant position in the philosophy of mind since at least 1970, it remains an unhappy hostage to several important objections. First, the argument above in favor of functionalism begins with a premise about how it is possible for the mind to be realized or implemented outside of the brain. This premise is dramatized by supposing, for example, that it is possible that carbon-free Martians have minds but lack brains. However, what justifies the crucial assumption of the real possibility of brainless, silicon Martian minds?

It is no answer to reply that anything imaginable is possible. For in that case, one can evidently imagine that it is necessary that minds are brains. If the imaginable is possible, it would follow that it is possible that it is necessary that minds are brains. However, on at least one version of modal logic it is axiomatic that whatever is possibly necessary is simply necessary. Hence, if it is possible that it is necessary that minds are brains, it is simply necessary that minds are brains. This, however, is in flat contradiction to the premise that launches functionalism, namely the premise that it is possible that minds are not brains! Evidently, what is desperately wanting here is a reasonable way of justifying premises about what is genuinely possible or what can be known to be possible. Until the functionalist can certify the possibility of a mind without a brain, the argument from such a possibility to the plausibility of functionalism appears disturbingly inconclusive (Maloney 1987).

Beyond this objection to the functionalist program is the worry that functionalism, if unwittingly in the service of a false psychology, could fly in the face of good scientific practice. To see this, suppose that minds are defined in terms of current (perhaps popular or folk) psychology and that this psychology turns out, unsurprisingly, to be false. In this case, minds—as defined by a false theory—would not be real, and that would be the deep and true reason why minds are not identical with real physical types such as brains. Nevertheless, a misguided functionalism, because it construes the mind as “whatever satisfies the principles of (false current) psychology,” would wrongly bless the discontinuity of mind and brain and insist on the reality of mind disenfranchised from any physical kind. Put differently, our failure to identify phlogiston with any physical kind properly leads us to repudiate phlogiston rather than to elevate it to a functional kind. So too, the objection goes, perhaps our failure to identify the mind with a physical type should lead us to repudiate the mind rather than elevate it to a functional kind (Churchland 1981).

Others object to functionalism, charging that it ignores the (presumed) centrality of CONSCIOUSNESS in cognition (Shoemaker 1975; Block 1978; Lewis 1980). They argue that functionally identical persons could differ in how they feel, that is, in their conscious, qualitative, or affective states. For example, you and I might be functionally isomorphic in the presence of a stimulus while we differ in our consciousness of it. You and I might both see the same apple and treat it much the same. Yet, this functional congruence might mask dramatic differences in our color QUALIA, differences that might have no behavioral or functional manifestation. If these conscious, qualitative differences differentiate our mental states, functionalism would seem unable to recognize them.

Finally, mental states are semantically significant representational states. As you play chess, you are thinking about the game. You realize that your knight is threatened but that its loss shall ensure the success of the trap you have set. But consider a computer programmed perfectly to emulate you at chess. It is your functional equivalent. Hence, according to functionalism it has the same mental states as do you. But does it think the same as you; does it realize, genuinely realize in exactly the manner that you do, that its knight is threatened but that the knight’s loss ensures ultimate success? Or is the computer a semantically impoverished device designed merely to mimic you and your internal mental states without ever representing its world in anything like the manner in which you represent and recognize your world through your mental states (Searle 1980; Dennett and Searle 1982)? If you and the computer differ in how you represent the world, if you represent the world but the computer does not, then functionalism may have obscured a fundamentally important aspect of our cognition.

See also COMPUTATION AND THE BRAIN; FOLK PSYCHOLOGY; FUNCTIONAL DECOMPOSITION; MENTAL REPRESENTATION; MIND-BODY PROBLEM; PHYSICALISM

—J. Christopher Maloney

References

Armstrong, D. (1968). A Materialist Theory of the Mind. London: Routledge and Kegan Paul.
Block, N. (1978). Troubles with functionalism. In C. W. Savage, Ed., Perception and Cognition: Issues in the Philosophy of Science. Minneapolis: University of Minnesota Press, 9: 261–325.
Block, N., and J. Fodor. (1972). What psychological states are not. Philosophical Review 81: 159–181.
Churchland, P. (1981). Eliminative materialism and the propositional attitudes. Journal of Philosophy LXXVIII: 67–90.
Cummins, R. (1983). The Nature of Psychological Explanation. Cambridge, MA: MIT Press/Bradford Books.
Dennett, D. (1978). Brainstorms. Montgomery, VT: Bradford Books.
Dennett, D., and J. Searle. (1982). The myth of the computer: an exchange. New York Review of Books June 24: 56–57.
Dretske, F. I. (1981). Knowledge and the Flow of Information. Cambridge, MA: Bradford Books/MIT Press.
Field, H. (1977). Mental representations. Erkenntnis 13: 9–16.
Fodor, J. (1975). The Language of Thought. New York: Thomas Crowell.
Fodor, J. (1980). Methodological solipsism considered as a research strategy in cognitive psychology. The Behavioral and Brain Sciences 3: 63–109.
Harman, G. (1973). Thought. Princeton: Princeton University Press.
Haugeland, J. (1981). On the nature and plausibility of cognitivism. In J. Haugeland, Ed., Mind Design. Cambridge, MA: MIT Press/Bradford Books, pp. 243–281.
Kak, S. C. (1996). Can we define levels of artificial intelligence? Journal of Intelligent Systems 6: 133–144.
Lewis, D. (1972). Psychophysical and theoretical identifications. Australasian Journal of Philosophy 50: 249–258.
Lewis, D. (1980). Mad pain and martian pain. In N. Block, Ed., Readings in the Philosophy of Psychology, I. Cambridge, MA: MIT Press, pp. 216–222.
Lycan, W. (1981). Form, function and feel. Journal of Philosophy 78: 24–50.
Maloney, J. C. (1987). The Mundane Matter of the Mental Language. Cambridge: Cambridge University Press.
Putnam, H. (1960). Minds and machines. In S. Hook, Ed., Dimensions of Mind. New York: N.Y.U. Press. Reprinted along with other relevant papers in Putnam’s Mind, Language and Reality, Philosophical Papers 2. Cambridge: Cambridge University Press, 1975.
Pylyshyn, Z. (1985). Computation and Cognition: Toward a Foundation for Cognitive Science. Cambridge, MA: MIT Press/Bradford Books.
Searle, J. (1980). Minds, brains, and programs. The Behavioral and Brain Sciences 3: 417–457 (including peer review).
Shoemaker, S. (1975). Functionalism and qualia. Philosophical Studies 27: 291–315.
Turing, A. (1950). Computing machinery and intelligence. Mind 59: 433–460.

Fuzzy Logic

What is fuzzy logic? This question does not have a simple answer because fuzzy logic, or FL for short, has many distinct facets—facets that overlap and have unsharp boundaries (Zadeh 1996a; Dubois, Prade, and Yager 1993).

To a first approximation, fuzzy logic is a body of concepts, constructs, and techniques that relate to modes of reasoning that are approximate rather than exact. Much of—perhaps most—human reasoning is approximate in nature. In this perspective, the role model for fuzzy logic is the human mind. By contrast, classical LOGIC is normative in spirit in the sense that it is aimed at serving as a role model for human reasoning rather than having the human mind as its role model. Fundamentally, fuzzy logic is a generalization of classical logic and rests on the same mathematical foundations. However, as a generalization that reflects the pervasive imprecision of human reasoning, fuzzy logic is much better suited than classical logic to serve as the logic of human cognition.

Among the many facets of fuzzy logic there are four that stand out in importance. They are the following:

1. the logical facet, FL/L;
2. the set-theoretic facet, FL/S;
3. the relational facet, FL/R;
4. the epistemic facet, FL/E (see figure 1).

The logical facet of FL, FL/L, is a logical system or, more accurately, a collection of logical systems that include as a special case both two-valued and multiple-valued systems. As in any logical system, at the core of the logical facet of FL lies a system of rules of inference. In FL/L, however, the rules of inference play the role of rules that govern propagation of various types of fuzzy constraints. Concomitantly, a proposition, p, is viewed as a fuzzy constraint on an

constraint on the proportion of young students among students, with the fuzzy quantifier “most” playing the role of a fuzzy constraint on the proportion. The logical facet of FL plays a pivotal role in the applications of FL to knowledge representation and to inference from information that is imprecise, incomplete, uncertain, or partially true.

The set-theoretic facet of FL, FL/S, is concerned with fuzzy sets, that is, classes or sets whose boundaries are not sharply defined. The initial development of FL was focused on this facet. Most of the applications of FL in mathematics have been and continue to be related to the set-theoretic facet. Among the examples of such applications are: fuzzy topology, fuzzy groups, fuzzy differential equations, and fuzzy arithmetic. Actually, any concept, method or theory can be generalized by fuzzification, that is, by replacing the concept of a set with that of a fuzzy set. Fuzzification serves an important purpose: it provides a way of constructing theories that are more general and more reflective of the imprecision of the real world than theories in which the sets are assumed to be crisp.

The relational facet of FL, FL/R, is concerned in the main with representation and manipulation of imprecisely defined functions and relations. It is this facet of FL that plays a pivotal role in its applications to systems analysis and control. The three basic concepts that lie at the core of this facet of FL are those of a linguistic variable, fuzzy if-then rule, and fuzzy graph. The relational facet of FL provides a foundation for the fuzzy-logic-based methodology of computing with words (CW).

Basically, a linguistic variable is a variable whose values are words drawn from a natural or synthetic language, with words playing the role of labels of fuzzy sets. For example, Height is a linguistic variable if its values are assumed to be: tall, very tall, quite tall, short, not very short, and so on. The concept of a linguistic variable plays a fundamentally important role in fuzzy logic and in particular, in computing with words. The use of words instead of—or in addition to—numbers serves two major purposes: (1) exploitation of the tolerance for imprecision; and (2) reflection of the finite ability of the human mind to resolve detail and store precise information.

The epistemic facet of FL, FL/E, is linked to its logical facet and is focused on the applications of FL to knowledge representation, information systems, fuzzy databases, and the theories of possibility and probability.
A particularly explicitly or implicitly defined variable. For example, the important application area for the epistemic facet of FL proposition “Mary is young” may be viewed as a fuzzy con- relates to the conception and design of information/intelli- straint on the variable Age (Mary), with “young” playing gent systems. the role of a constraining fuzzy relation. Similarly, the prop- At the core of FL lie two basic concepts: (1) fuzziness/ osition “Most students are young” may be viewed as a fuzzy fuzzification; and (2) granularity/granulation. As was al- luded to already, fuzziness is a condition that relates to classes whose boundaries are not sharply defined, whereas fuzzification refers to replacing a crisp set, that is, a set with sharply defined boundaries, with a set whose boundaries are fuzzy. For example, the number 5 is fuzzified when it is transformed into approximately 5. In a similar spirit, granularity relates to clumpiness of structure, whereas granulation refers to partitioning an object into a collection of granules, with a granule being a clump of objects (points) drawn together by indistinguishability, Figure 1. Conceptual structure of fuzzy logic. 336 Game-Playing Systems similarity, proximity, or functionality. For example, the gran- A fuzzy graph is a union of fuzzy points (granules) each ules of an article might be the introduction, section 1, section of which represents a fuzzy if-then rule. A fuzzy graph of a 2, and so forth. Similarly, the granules of a human body function f may be interpreted as a granular approximation to might be the head, neck, chest, stomach, legs, and so on. f. In most of the practical applications of fuzzy logic, fuzzy Granulation may be crisp or fuzzy, dense or sparse, physical graphs are employed in this role as granular approximations or mental. to functions and relations. 
A concept that plays a pivotal role in fuzzy logic is that In computing with words, the initial data set (IDS) and of fuzzy information granulation, or fuzzy IG, for short. In the terminal data set (TDS) are assumed to consist of collec- crisp IG, the granules are crisp, whereas in fuzzy IG the tions of propositions expressed in a natural language. An granules are fuzzy. For example, when the variable Age is input interface transforms IDS into a system of fuzzy con- granulated into the time intervals {0,1}, {1,2}, {2,3}, . . . , straints that are propagated from premises to conclusions the granules {0,1}, {1,2}, {2,3}, . . . are crisp; when Age is through the use of the inference rules in fuzzy logic. The treated as a linguistic variable, the fuzzy sets labeled young, output interface transforms the conclusions into TDS. middle-aged, old, are fuzzy granules that play the role of The machinery for computing with words instead of or in linguistic values of Age. The importance of fuzzy logic— addition to numbers may be viewed as one of the principal especially in the realm of applications—derives in large contributions of fuzzy logic. In a way, computing with words measure from the fact that FL is the only methodology that may be regarded as a step toward a better understanding of provides a machinery for fuzzy information granulation. In the remarkable human ability to perform complex tasks the figure, the core concept of fuzzy granulation is repre- without any measurements and any numerical computations. sented as the conjunction F. G. See also AMBIGUITY; COMPUTATION; DEDUCTIVE REA- The point of departure in fuzzy logic is the concept of a SONING; UNCERTAINTY fuzzy set. A fuzzy set A in a universe U is characterized by —Lotfi A. Zadeh its grade of membership µA, which associates with every point u in U its grade of membership µA(u), with µA(u) tak- References ing values in the unit interval [0,1]. More generally, µA may take values in a partially ordered set. 
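As a rough illustration of the fuzzy-set definition in the Fuzzy Logic entry, the sketch below treats a membership function µA as a Python function from points of U to grades in [0, 1]. The piecewise-linear shape of `mu_young` and its breakpoints (20 and 45) are assumptions, chosen only so that µyoung(25) = 0.8 agrees with the value used in the entry; the entry itself does not specify them. The helper names `characteristic` and `triangular` are likewise hypothetical.

```python
def mu_young(age: float) -> float:
    """Hypothetical membership function for the fuzzy set 'young'.

    Breakpoints 20 and 45 are assumed for illustration only.
    """
    if age <= 20:
        return 1.0
    if age >= 45:
        return 0.0
    return (45 - age) / 25  # linear descent from 1 at 20 to 0 at 45


def characteristic(crisp_set):
    """Membership function of a crisp set: grade 1 inside, 0 outside."""
    return lambda u: 1.0 if u in crisp_set else 0.0


def triangular(a: float, b: float, c: float):
    """Triangular membership function with feet at a and c and peak at b."""
    def mu(u: float) -> float:
        if u <= a or u >= c:
            return 0.0
        if u <= b:
            return (u - a) / (b - a)
        return (c - u) / (c - b)
    return mu


if __name__ == "__main__":
    print(mu_young(25))                # grade of membership of age 25
    print(characteristic({1, 2})(3))   # crisp case: only 0 or 1
    print(triangular(0, 5, 10)(2.5))   # halfway up the left slope
```

Under the possibilistic reading, `mu_young(25)` is the possibility that Mary is 25 given that she is young; under the veristic reading, it is the truth value of "Mary is young" given that she is 25.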
For crisp sets, the con- Dubois, D., H. Prade, and R. Yager. (1993). Readings in Fuzzy Sets cept of a membership function reduces to the familiar con- for Intelligent Systems. San Mateo: Morgan Kaufmann. cept of a characteristic function, with µA(u) being 1 or 0 Zadeh, L. A. (1996a). Fuzzy Sets, Fuzzy Logic and Fuzzy Systems. depending, respectively, on whether u belongs or does not Singapore: World Scientific. Zadeh, L. A. (1996b). Fuzzy logic and the calculi of fuzzy rules belong to A. and fuzzy graphs: a precis. Multiple Valued Logic 1: 1–38. Two interpretations of A play basic roles in fuzzy logic: Zadeh, L. A. (1996c). Fuzzy logic = computing with words. IEEE possibilistic and veristic. More specifically, assume that X is Transactions on Fuzzy Systems 4(2): 103–111. a variable taking values in U, and A is a fuzzy set in U. In Zadeh, L. A. (1997). Toward a theory of fuzzy information granu- the possibilistic interpretation, in the proposition X is A, A lation and its centrality in human reasoning and fuzzy logic. plays the role of the possibility distribution of X, and µA(u) Fuzzy Sets and Systems 90: 111–127. is the possibility that X can take the value u. In the veristic interpretation, µA(u) is the truth value (verity) of the propo- Further Readings sition X = u. As an illustration, in the proposition Mary is young if µyoung(25) = 0.8, then the possibility that Mary is Bandemer, H., and S. Gottwald. (1995). Fuzzy Sets, Fuzzy Logic and Fuzzy Methods with Applications. Chichester: Wiley. twenty-five given that Mary is young is 0.8. Reciprocally, Bouchon-Meunier, B., R. Yager, and L. A. Zadeh, Eds. (1995). given that Mary is 25, the truth value (verity) of the proposi- Fuzzy Logic and Soft Computing. Singapore: World Scientific tion Mary is young is 0.8. Publishing. In addition to the concept of a fuzzy set, the basic con- Dubois, D., H. Prade, and R. R. Yager, Eds. (1997). 
Fuzzy Infor- cepts in fuzzy logic are those of a linguistic variable, fuzzy mation Engineering: A Guided Tour of Applications. New if-then rule, and fuzzy graph. In combination, these con- York: Wiley. cepts provide a foundation for the theory of fuzzy informa- Kruse, R., J. Gebhardt, and F. Klawonn. (1994). Foundations of tion granulation (Zadeh 1997), the calculus of fuzzy if-then Fuzzy Systems. Chichester: Wiley. rules (Zadeh 1996a), and, ultimately, the methodology of computing with words (Zadeh 1996b). Most of the practical Game-Playing Systems applications of fuzzy logic, especially in the realm of con- trol and information/intelligent systems, involve the use of the machinery of computing with words. Games have long been popular in Artificial Intelligence as Fuzzy if-then rules can assume a variety of forms. The idealized domains suitable for research into various aspects simplest rule can be expressed as: if X is A then Y is B, of search, KNOWLEDGE REPRESENTATION, and the interac- where X and Y are variables taking values in universes of tion between the two. CHESS, in particular, has been dubbed discourse U and V, respectively; and A and B are fuzzy sets the “Drosophila of Artificial Intelligence” (McCarthy in U and V. Generally, A and B play the role of linguistic 1990), suggesting that the role of games in Artificial Intel- values of X and Y; for example, if Pressure is high then Vol- ligence is akin to that of the fruit fly in genetic research. In ume is low. In practice, the membership functions of A and each case, certain practical advantages compensate for the B are usually triangular or trapezoidal. lack of intrinsic importance of the given problem. In genetic Game-Playing Systems 337 research. In each case, certain practical advantages compen- In spite of the success of high-performance alpha-beta- sate for the lack of intrinsic importance of the given prob- based game-playing systems, there has been limited trans- lem. 
In genetic research, fruit flies make it easy to maintain ference of ideas generated in this work to other areas of large populations with a short breeding cycle at low cost. In Artificial Intelligence (although see, for example, New- Artificial Intelligence research, games generally have rules born’s work on theorem proving; Newborn 1992). There that are well defined and can be stated in a few sentences, are, however, many alternatives to minimax alpha-beta that thus allowing for a relatively straightforward computer have been examined. Conspiracy numbers search (McAll- implementation. Yet the combinatorial complexity of inter- ester 1988) counts the number of positions whose evalua- esting games can create immensely difficult problems. It has tions must change in order for a different move choice to be taken many decades of research combined with sufficiently made. This idea has led to methods that are capable of solv- powerful computers in order to approximate the level of ing some interesting nontrivial games (Allis 1994). leading human experts in many popular games. And in some Decision-theoretic approaches to game playing, particularly games, human players still reign supreme. under constraints of limited resources (BOUNDED RATIO- Games can be classified according to a number of crite- NALITY), are promising and have broad applicability outside ria, among them number of players, perfect versus hidden the game-playing area. For example, Russell and Wefald information, presence of a stochastic element, zero-sum ver- (1991) reasoned specifically about when to terminate a sus non-zero-sum, average branching factor, and the size of search, based on the expected utility of further search and the state space. Different combinations of characteristics the cost of the time required for the additional work. Statis- emphasize different research issues. 
Much of the early tical methods for search and evaluation (Baum and Smith research in game-playing systems concentrated on two- 1997) also have shown promise, adding uncertainty to a person zero-sum games of perfect information with low or standard evaluation function and then using the inexact sta- moderate branching factors, in particular chess and check- tistical information to approximate the exploration of the ers. Claude Shannon’s 1950 paper on programming a com- most important positions. puter to play chess mapped out much territory for later MACHINE LEARNING has a long history of using games as researchers. Alan TURING wrote a chess program (Turing et domains for experimentation. Samuel’s checkers program al. 1953), which he hand-simulated in the early 1950s. The (Samuel 1959), originally developed in the 1950s, employed earliest fully functioning chess program was described in both a rote-learning scheme and a method for tuning the Bernstein et al. (1958). The first chess program demonstra- coefficients of his evaluation function. Many current chess bly superior to casual human chess players appeared in the and Othello programs use forms of rote learning to avoid los- mid-sixties (Greenblatt et al. 1967), and progress continued ing the same game twice. The Logistello program has also as faster machines became available and algorithms were used an automatically tuned evaluation function with excel- refined (Slate and Atkin 1977; Condon and Thompson lent results. However, the most noteworthy example of learn- 1982; Hyatt, Gower, and Nelson 1990; Berliner and Ebeling ing in game-playing systems is TD-Gammon (Tesauro 1995), 1990; Hsu et al. 1990). But it took until 1997 for a com- a neural network program for playing backgammon. 
For puter, the IBM Deep Blue chess machine, to defeat the games with stochastic elements, for instance the dice in back- human world chess champion, Gary Kasparov, in a regula- gammon, forward searching approaches are less efficient, tion match. which places a premium on the quality of the evaluation Much of the success in game-playing systems has come function. TD-Gammon used a reinforcement learning algo- from approaches based on depth-first minimax search with rithm to train its neural network solely by playing against alpha-beta pruning in two-person zero-sum games of perfect itself and learning from the results. This network, with or information. This is essentially a brute-force search tech- without some limited search, produces world-class play in nique, searching forward as many moves as possible in an backgammon (Tesauro and Galperin 1997). Reinforcement allotted time, assessing positions according to an evaluation learning also has applications in other decision making and function, and choosing the best move based on the minimax scheduling problems. principle. The evaluation function captures essential domain Some games are very difficult for computers at the knowledge. In fact, there is often a trade-off between the present time. Go programs are actively being developed quality of the evaluation function and the depth of search (Chen et al. 1990), but are far from the level of the best required to achieve a given level of play. Minimax search is human players. The alpha-beta search paradigm, which is so made more efficient through the use of alpha-beta pruning successful in chess, checkers, and the like, is not directly (Knuth and Moore 1975), which allows searching roughly applicable to Go because of the large branching factor. More twice as deep as would be possible in a pure minimax search. subtle approaches involving decomposition of a game state Notable examples of this approach include Deep Blue; Chi- into subproblems appear to be necessary. 
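The minimax search with alpha-beta pruning described in the Game-Playing Systems entry can be sketched in a few lines. This is a minimal illustration, not the method of any program named in the entry: the game tree is assumed to be given as nested Python lists whose leaves are evaluation-function values, and the function names are hypothetical.

```python
import math


def minimax(node, maximizing=True):
    """Plain minimax over a nested-list tree; leaves are evaluation values."""
    if not isinstance(node, list):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


def alphabeta(node, maximizing=True, alpha=-math.inf, beta=math.inf):
    """Minimax with alpha-beta pruning; returns the same value as minimax."""
    if not isinstance(node, list):
        return node
    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, best)
            if alpha >= beta:  # the minimizer above will never allow this line
                break
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, True, alpha, beta))
        beta = min(beta, best)
        if alpha >= beta:  # the maximizer above already has a better option
            break
    return best


if __name__ == "__main__":
    tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]  # assumed example tree
    print(minimax(tree), alphabeta(tree))
```

In the example tree, the second subtree is abandoned after its first leaf (2), because the maximizer already has 3 available from the first subtree; skipping such branches is what lets the search go deeper in the same time.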
Hidden informa- nook (Schaeffer et al. 1992), which has defeated the world’s tion games, such as bridge, are also not appropriate for best checkers players; and Logistello (Buro 1995), which has direct application of the alpha-beta ALGORITHM, and have easily beaten the top human Othello players. The methods begun to receive more attention (Ginsberg 1996). Some used in these programs and others of this type have been other games are too difficult for even initial attempts. For constantly improved and refined, and include such tech- example, in the game of Nomic (Suber 1990) a player’s turn niques as iterative deepening, transposition tables, null-move involves an amendment or addition to an existing set of pruning, endgame databases, and singular extensions, as well rules. Initially players vote on proposed changes, but the as increasingly sophisticated evaluation functions. game can evolve into something completely different. There 338 Game Theory is no clear way at present to design a game-playing system Samuel, A. L. (1959). Some studies in machine learning using the game of checkers. IBM Journal of Research and Development to play a game such as Nomic due to the tremendous 3: 210–229. amount of world knowledge required. Schaeffer, J., J. Culberson, N. Treloar, B. Knight, P. Lu, and D. Game-playing systems have helped illustrate the role of Szafron. (1992). A world championship caliber checkers pro- search and knowledge working together in systems for solv- gram. Artificial Intelligence 53: 273–290. ing complex problems, but games have also been useful as Shannon, C. (1950). Programming a computer for playing chess. domains for experimentation with various types of machine Philosophical Magazine 41: 256–275. learning and HEURISTIC SEARCH. The absence of significant Slate, D., and L. Atkin. (1977). Chess 4.5—The Northwestern learning capabilities in most game-playing systems, as well Chess Program. In P. 
Frey, Ed., Chess Skill in Man and as the difficulty in creating high-performance programs for Machine. Berlin: Springer, pp. 82–118. games such as Go, suggest that games are still fertile Suber, P. (1990). The Paradox of Self-Amendment: A Study of Law, Logic, Omnipotence and Change. New York: Peter Lang. domains for Artificial Intelligence research. Tesauro, G. (1995). Temporal difference learning and TD-Gam- See also EXPERTISE; GREEDY LOCAL SEARCH; NEURAL mon. Communications of the ACM 38: 58–68. NETWORKS; PROBLEM SOLVING Tesauro, G., and R. Galperin. (1997). On-line policy improvement using Monte-Carlo search. In M. I. Jordan and M. C. Mozer, Eds., —Murray Campbell Advances in Neural Information Processing Systems 9: Proceed- ings of the 1996 Conference. Cambridge, MA: MIT Press. References Turing, A., C. Strachey, M. Bates, and B. Bowden. (1953). Digital computers applied to games. In B. Bowden, Ed., Faster Than Allis, V. (1994). Searching for Solutions in Games and Artificial Thought. New York: Pitman, pp. 286–310. Intelligence. Maastricht: University of Limburg. Baum, E. B., and W. D. Smith. (1997). A bayesian approach to Game Theory game playing. Artificial Intelligence 97: 195–242. Berliner, H., and C. Ebeling. (1990). Hitech. In T. A. Marsland and J. Schaeffer, Eds., Computers, Chess, and Cognition. New Game theory is a mathematical framework designed for ana- York: Springer, pp. 79–110. lyzing the interaction between several agents whose deci- Bernstein, A., M. deV. Roberts, T. Arbuckle, and M. A. Belsky. sions affect each other. In a game-theoretic analysis, an (1958). A chess-playing program for the IBM 704. Proceedings interactive situation is described as a game: an abstract of the Western Joint Computer Conference. New York: The American Institute of Electrical Engineers. description of the players (agents), the courses of actions Buro, M. (1995). 
Statistical feature combination for the evaluation available to them, and their preferences over the possible out- of game positions. Journal of Artificial Intelligence Research 3: comes. The game-theoretic framework assumes that the play- 373–382. ers employ RATIONAL DECISION MAKING, that is, they act so Chen, K., A. Kierult, M. Muller, and J. Nievergelt. (1990). The as to achieve outcomes that they prefer (VON NEUMANN and design and evolution of Go explorer. In T. A. Marsland and Morgenstern 1944). Typically, preferences are modeled J. Schaeffer, Eds., Computers, Chess, and Cognition. New using numeric utilities, and players are assumed to be York: Springer, pp. 271–286. expected utility maximizers. Condon, J., and K. Thompson. (1982). Belle chess hardware. In M. Unlike decision making for a single agent, in the multi- Clarke, Ed., Advances in Computer Chess 3. New York: Perga- agent case this assumption is not enough to define an “opti- mon, pp. 45–54. Ginsberg, M. (1996). Partition search. Proceedings of AAAI-96. mal decision,” because the agent cannot unilaterally control Greenblatt, R. D., D. E. Eastlake, and S. D. Crocker. (1967). The the outcome. One of the roles of game theory is to define Greenblatt Chess Program. Proceedings of the Fall Joint Com- notions of “optimal solution” for different classes of games. puter Conference. The American Federation of Information These solutions assume that players reason strategically, Processing Societies. basing their decisions on their expectations regarding the Hsu, F., T. Anantharman, M. Campbell, and A. Nowatzyk. (1990). behavior of other players. Typically, players are assumed to Deep Thought. In T. A. Marsland and J. Schaeffer, Eds., Com- have common knowledge that they are all rational (see puters, Chess, and Cognition. New York: Springer, pp. 55–78. MODAL LOGIC). Hyatt, R., A. Gower, and H. Nelson. (1990). Cray Blitz. In T. A. A large part of game theory deals with noncooperative Marsland and J. 
Schaeffer, Eds., Computers, Chess, and Cogni- situations, where each player acts independently. In such a tion. New York: Springer, pp. 111–130. Knuth, D., and R. Moore. (1975). An analysis of alpha-beta prun- game, a strategy si for player i specifies the action player i ing. Artificial Intelligence 6: 293–326. should take in any state of the game. A solution to the game McAllester, D. (1988). Conspiracy numbers for min-max search. is a strategy combination s1, . . . ,sn satisfying certain opti- Artificial Intelligence 35: 287-310 mality conditions. McCarthy, J. (1990). Chess as the Drosophila of AI. In T. A. Most abstractly, a situation is represented as a strategic Marsland and J. Schaeffer, Eds., Computers, Chess, and Cogni- form game, where the possible strategies for the players are tion. New York: Springer, pp. 227–238. simply enumerated. Each strategy combination s1, . . . ,sn Newborn, M. (1992). A theorem-proving program for the IBM PC. leads to some outcome, whose value to player i is a payoff In D. Kopec and R. B. Thompson, Eds., Artificial Intelligence ui(s1, . . . ,sn). A two-player strategic form game is often and Intelligent Tutoring Systems. Upper Saddle River, NJ: represented by a matrix where the rows are player 1 strate- Prentice Hall, pp. 65–92. gies, the columns are player 2 strategies, and the matrix Russell, S., and E. Wefald. (1991). Principles of metareasoning. entries are the associated payoff pairs. Artificial Intelligence 49: 361–395. Game Theory 339 Figure 1. (a) Flipping Pennies. (b) Prisoner’s Dilemma: Two conspirators, in prison, are each given the opportunity to confess in return for a reduced prison sentence; the payoffs correspond to numbers of years in prison. The simplest game is a two-player zero-sum game, where Furthermore, many equilibria are unintuitive. In game the players’ interests are completely opposed (see figure 1). 
(b), the Nash equilibrium has a utility of –8,–8, which is Because any gain for one player corresponds to a loss for the worse for both players than the “desired” outcome –1,–1. other, each player should make a worst-case assumption There have been many attempts to find alternative models about the behavior of the other. Thus, it appears that player 1 where the desired outcome is the solution. Some success has should choose the maximin strategy s* that achieves max s1 been achieved in the case of infinitely repeated play where i 1 ij min s1j u1( s1, s1); player 2 has an analogous rational strategy the players have BOUNDED RATIONALITY (e.g., when their s* . Common knowledge of rationality implies that the “ratio- strategies are restricted to be finite-state AUTOMATA; Abreu 2 nal” strategy of one player will be known to the other. How- and Rubenstein 1992). ever, s* may not be optimal against s* , making it irrational for A more refined representation of a game takes into con- 1 2 player 1 to play as anticipated. This circularity can be avoided sideration the evolution of the game. A game in extensive if we allow the players to use randomness. In game (a), player form (Kuhn 1953) is a tree whose edges represent possible 2 can play the mixed strategy µ* where he chooses heads and actions and whose leaves represent outcomes. Most simply, 2 tails each with probability 1/2. If each player plays the maxi- the players have perfect information—any action taken is min mixed strategy µ*, we avoid the problem: µ* and µ* are revealed to all players. i 1 2 in equilibrium, that is, each is optimal against the other (von The notion of equilibrium often leads to unintuitive Neumann and Morgenstern 1944). results when applied to extensive-form games. For example, The equilibrium concept can be extended to general n- in game (a), the strategy combination (R, a) is one equilib- player games. A strategy combination µ* . . . 
µ* is said to rium: given that player 2 will play a, player 1 prefers R, and 1 n be in Nash equilibrium (Nash 1950) if no player i can bene- given that player 1 plays R, player 2’s choice is irrelevant. fit by deviating from it (playing a strategy µi ≠ µ* ). In game This equilibrium is sustained only by a noncredible “threat” i (b), the (unique) equilibrium is the nonrandomized strategy by player 2 to play the suboptimal move a. The extensive- combination (confess, confess). An equilibrium in mixed form equilibrium concept has been refined to deal with this strategies is always guaranteed to exist. issue, by adding the requirement that, at each point in the The Nash equilibrium is arguably the most fundamental game, the player’s chosen action be optimal (Selten 1965). concept in noncooperative games. However, several problems In our example, the optimal move for player 2 is b, and reduce its intuitive appeal. Many games have several very dif- therefore player 1’s optimal action is L. This process, ferent equilibria. In such games, it is not clear which equilib- whereby optimal moves are selected at the bottom of the rium should be the “recommended solution,” nor how the tree, and these determine optimal moves higher up, is called players can pick a single equilibrium without communication. backward induction. In the context of zero-sum games, the One important mechanism that addresses these concerns is algorithm is called the minimax algorithm, and is the funda- based on the assumption that the game is played repeatedly, mental technique used in GAME-PLAYING SYSTEMS. allowing players to adapt their strategies gradually over time The extensive form also includes imperfect information (Luce and Raiffa 1957; Battigalli, Gilli, and Milinari 1992). games (Kuhn 1953): the tree is augmented with information This process is a variant of multiagent REINFORCEMENT sets, representing sets of nodes among which the player can- LEARNING. 
Some variants are related to BAYESIAN LEARNING (Milgrom and Roberts 1989). Others are related to the evolution of biological populations, and have led to a branch of game theory called evolutionary games (Maynard Smith 1982).

not distinguish. For example, a simultaneous-move game such as Flipping Pennies can be represented as in figure (2). Clearly, imperfect information games require the use of randomization in order to guarantee the existence of a Nash equilibrium. Refinements of the Nash equilibrium concept addressing the temporal sequencing of decisions have also been proposed for imperfect information games (Kreps and Wilson 1982; Selten 1975; Fudenberg and Tirole 1991), largely based on the intuition that the player's actions must be optimal relative to his or her beliefs about the current state.

Figure 2. (a) Unintuitive equilibrium. (b) Flipping Pennies.

Game theory also deals with cooperative games, where players can form coalitions and make binding agreements about their choice of actions (von Neumann and Morgenstern 1944; Luce and Raiffa 1957; Shapley and Shubik 1953). In this case, the outcome of the game is determined by the coalition that forms and the joint action it takes. The game-theoretic models for such situations typically focus on how the payoff resulting from the optimal group action is divided between group members. Various solution concepts have been proposed for coalition games, essentially requiring that the payoff division be such that no subgroup is better off by leaving the coalition and forming another (see Aumann 1989 for a survey).

Game theory is a unified theory for rational decision making in multiagent settings. It encompasses bargaining, negotiation, auctions, voting, deterrence, competition, and more. It lies at the heart of much of economic theory, but it has also been used in political science, government policy, law, military analysis, and biology (Aumann and Hart 1992, 1994, 1997). One of the most exciting prospects for the future is the wide-scale application of game theory in the domain of autonomous computer agents (Rosenschein and Zlotkin 1994; Koller and Pfeffer 1997).

See also ECONOMICS AND COGNITIVE SCIENCE; MULTIAGENT SYSTEMS; RATIONAL CHOICE THEORY

—Daphne Koller

References

Abreu, D., and A. Rubinstein. (1992). The structure of Nash equilibria in repeated games with finite automata. Econometrica 56: 1259–1281.
Aumann, R. J., and S. Hart, Eds. (1992; 1994; 1997). Handbook of Game Theory with Economic Applications, vols. 1–3. Amsterdam: Elsevier.
Aumann, R. J. (1976). Agreeing to disagree. Annals of Statistics 4(6): 1236–1239.
Aumann, R. J. (1989). Lectures on Game Theory. Boulder, CO: Westview Press.
Battigalli, P., M. Gilli, and M. C. Milinari. (1992). Learning and convergence to equilibrium in repeated strategic interactions: An introductory survey. Ricerche Economiche 46: 335–377.
Fudenberg, D., and J. Tirole. (1991). Perfect Bayesian equilibrium and sequential equilibrium. Journal of Economic Theory 53: 236–260.
Koller, D., and A. Pfeffer. (1997). Representations and solutions for game-theoretic problems. Artificial Intelligence.
Kreps, D. M., and R. B. Wilson. (1982). Sequential equilibria. Econometrica 50: 863–894.
Kuhn, H. W. (1953). Extensive games and the problem of information. In H. W. Kuhn and A. W. Tucker, Eds., Contributions to the Theory of Games II. Princeton: Princeton University Press, pp. 193–216.
Luce, R. D., and H. Raiffa. (1957). Games and Decisions—Introduction and Critical Survey. New York: Wiley.
Maynard Smith, J. (1982). Evolution and the Theory of Games. Cambridge: Cambridge University Press.
Milgrom, P., and J. Roberts. (1989). Adaptive and Sophisticated Learning in Repeated Normal Forms. Mimeo, Stanford University.
Nash, J. F. (1950). Equilibrium points in n-person games. Proceedings of the National Academy of Sciences 36: 48–49.
Nash, J. F. (1951). Non-cooperative games. Annals of Mathematics 54: 286–295.
Rosenschein, J. S., and G. Zlotkin. (1994). Consenting agents: designing conventions for automated negotiation. AI Magazine 15(3): 29–46.
Selten, R. (1965). Spieltheoretische Behandlung eines Oligopolmodells mit Nachfrageträgheit. Zeitschrift für die gesamte Staatswissenschaft 121: 301–324.
Selten, R. (1975). Reexamination of the perfectness concept for equilibrium points in extensive games. International Journal of Game Theory 4: 25–55.
Shapley, L. S., and M. Shubik. (1953). Solutions of n-person games with ordinal utilities. Econometrica 21: 348–349.
von Neumann, J., and O. Morgenstern. (1944). Theory of Games and Economic Behavior. Princeton: Princeton University Press.

Further Readings

Aumann, R. J. (1985). What is game theory trying to accomplish? In K. J. Arrow and S. Honkapohja, Eds., Frontiers of Economics. Oxford: Blackwell, pp. 28–76.
Aumann, R. J. (1987). Game theory. In J. Eatwell, M. Milgate, and P. Newman, Eds., The New Palgrave, vol. 2. New York: Macmillan, pp. 460–482.
Fudenberg, D., and J. Tirole. (1991). Game Theory. Cambridge, MA: MIT Press.
Fudenberg, D. (1992). Explaining cooperation and commitment in repeated games. In J.-J. Laffont, Ed., Advances in Economic Theory, vol. 1. Cambridge: Cambridge University Press.
Kreps, D. M. (1990). Game Theory and Economic Modelling. Oxford: Clarendon Press.
McKelvey, R. D., and A. McLennan. (1996). Computation of equilibria in finite games. In H. Amman, D. Kendrick, and J. Rust, Eds., Handbook of Computational Economics, vol. 1. Amsterdam: Elsevier.
Megiddo, N., Ed. (1994). Essays in Game Theory. New York: Springer.
Myerson, R. B., Ed. (1991). Game Theory. Cambridge, MA: Harvard University Press.
Osborne, M. J., and A. Rubinstein. (1994). A Course in Game Theory. Cambridge, MA: MIT Press.
Van Damme, E. (1991). Stability and Perfection of Nash Equilibria. Berlin: Springer.

Gender

See LANGUAGE AND GENDER; METAPHOR AND CULTURE; NAIVE SOCIOLOGY; STEREOTYPING

Generative Grammar

The motivating idea of generative grammar is that we can gain insight into human language through the construction of explicit grammars. A language is taken to be a collection of structured symbolic expressions, and a generative grammar is simply an explicit theoretical account of one such collection.

Simple finite sets of rules can describe infinite languages with interesting structure. For example, the following instruction describes all even-length palindromes over the alphabet {A, B}:

Start with S and recursively replace S by ASA, BSB, or nothing.

We prove that ABBA is a palindrome by constructing the sequence S—ASA—ABSBA—ABBA, which represents a derivation of the string. The instruction, or rule, constitutes a generative grammar for the language comprising such palindromes; for every even-length palindrome over the alphabet {A, B} the grammar provides a derivation. S can be regarded as a grammatical category analogous to a sentence, and A and B are analogous to words in a language.

The following condition on nodes in trees turns out to be equivalent to the above grammar:

Nodes labeled S have left and right daughters either both labeled A or both labeled B; optionally these are separated by a node labeled S; nodes labeled A or B have no daughters.

If a set of trees T satisfies this condition, then its frontier (the sequence of daughterless nodes) is an even-length palindrome of A's and B's; and any such palindrome is the frontier of some such tree. This is a declarative definition of a set of trees, not a procedure for deriving strings. But it provides an alternative way of defining the same language of strings, a different type of generative grammar.

Studying English in generative terms involves trying to determine whether some finite set of rules could generate the entire set of strings of words that a native speaker of (say) English would find acceptable. It presumably is possible, because speakers, despite their finite mental capacities, seem to have a full grasp of what is in their language.

A fundamental early insight was that limitations on the format of rules or constraints set limits on definable languages, and some limits are too tight to allow for description of languages like English. String-rewriting rules having the form "rewrite X as wY," where w is a string of words and X and Y are grammatical categories, are too weak. (This limitation would make grammars exactly equivalent to finite automata, so strictly speaking English cannot be recognized by any finite computer.) However, if limits are too slack, the opposite problem emerges: the theory may be Turing-equivalent, meaning that it provides a grammar for any recursively enumerable set of strings. That means it does not make a formal distinction between natural languages and any other recursively enumerable sets. Building on work by Zellig Harris, Chomsky (1957) argued that a grammar for English needed to go beyond the sort of rules seen so far, and must employ transformational rules that convert one structural representation into another. For example, using transformations, passive sentences might be described in terms of a rearrangement of structural constituents of corresponding active sentences:

Take a subjectless active sentence structure with root label S and move the postverbal noun phrase leftward to become the subject (left branch) of that S.

Several researchers (beginning with Putnam 1961) have noted that transformational grammars introduce the undesirable property of Turing-equivalence. Others, including Chomsky, have argued that this is not a vitiating result.

Generative grammatical study and the investigation of human mental capacities have been related via widely discussed claims, including the following:

1. People tacitly know (and learn in infancy) which sentences are grammatical and meaningful in their language.
2. They possess (and acquire) such knowledge even about novel sentences.
3. Therefore they must be relying not on memorization but on mentally represented rules.
4. Generative grammars can be interpreted as models of mentally represented rule sets.
5. The ability to have (and acquire) such rules must be a significant (probably species-defining) feature of human minds.

Critics have challenged all these claims. (It has been a key contribution of generative grammar to cognitive science to have stimulated enormous amounts of interesting critical discussion on issues of this sort.) Psycholinguistic studies of speakers' reactions to novel strings have been held to undercut (1). One response to (2) is that speakers might simply generalize or analogize from familiar cases. Regarding (3), some philosophers object that one cannot draw conclusions about brain inscriptions (which are concrete objects) from properties of sentences (arguably abstract objects). In response to (4), it has been noted that the most compact and nonredundant set of rules for a language will not necessarily be identical with the sets people actually use, and that the Turing-equivalence results mean that (4) does not imply any distinction between being able to learn a natural language and being able to learn any arbitrary finitely representable subject matter. And primatologists have tried to demonstrate language learning in apes to challenge (5).

Two major rifts appeared in the history of generative grammar, one in the years following 1968 and the other about ten years later. The first rift developed when a group of syntacticians who became known as the generative semanticists suggested that syntactic investigations revealed that transformations must act directly on semantic structures that looked nothing like superficial ones (in particular, they did not respect basic constituent order or even integrity of words), and might represent "They persuaded Mike to fell the tree" thus (a predicate-initial representation of clauses is adopted here, though this is not essential):

[S PAST [S CAUSE they [S AGREE Mike [S CAUSE Mike [S FALL the tree]]]]].

The late 1960s and early 1970s saw much dispute about the generative semantics program (Harris 1993), which ultimately dissolved, though many of its insights have been revived in recent work. Some adherents became interested in the semantic program of logician Richard Montague (1973), bringing the apparatus of model-theoretic semantics into the core of theoretical linguistics; others participated in the development of RELATIONAL GRAMMAR (Perlmutter and Postal 1983).

The second split in generative grammar, which developed in the late 1970s and persists to the present, is between those who continue to employ transformational analyses (especially movement transformations) and hence derivations, and those who by 1980 had completely abandoned them to formulate grammars in constraint-based terms. The transformationalists currently predominate. The constraint-based minority is a heterogenous group exemplified by the proponents of, inter alia, relational grammar (especially the formalized version in Johnson and Postal 1980), HEAD-DRIVEN PHRASE STRUCTURE GRAMMAR (Pollard and Sag 1994), LEXICAL FUNCTIONAL GRAMMAR (Bresnan and Kaplan 1982), and earlier frameworks including tree adjoining grammar (Joshi, Levy, and Takahashi 1975) and generalized phrase structure grammar (Gazdar et al. 1985).

In general, transformationalist grammarians have been less interested, and constraint-based grammarians more interested, in such topics as COMPUTATIONAL LINGUISTICS and the mathematical analysis of properties of languages (sets of expressions) and classes of formalized grammars. Transformationalists formulate grammars procedurally, defining processes for deriving a sentence in a series of steps. The constraint-based minority states them declaratively, as sets of constraints satisfied by the correct structures. The derivational account of passives alluded to above involves a movement transformation: a representation like

[[NP __] PAST made [NP mistakes] ([PP by people])]

is changed by transformation into something more like

[[NP mistakes] were made [NP __] ([PP by people])]

The same facts might be described declaratively by means of constraints ensuring that for every transitive verb V there is a corresponding intransitive verb V* used in such a way that for B to be V*ed by A means the same as for A to V B. This is vague and informal, but may serve to indicate how passive sentences can be described without giving instructions for deriving them from active structures, yet without missing the systematic synonymy of actives and their related passives.

There is little explicit debate between transformationalists and constraint-based theorists, though there have been a few interchanges on such topics as (1) whether a concern with formal exactitude in constructing grammars (much stressed in the constraint-based literature) is premature at the present stage of knowledge; (2) whether claims about grammars can sensibly be claimed to have neurophysiological relevance (as transformationalists have claimed); and (3) whether progress in computational implementation of nontransformational grammars is relevant to providing a theoretical argument in their favor.

At least two topics can be identified that have been of interest to both factions within generative grammar. First, broadening the range of languages from which data are drawn is regarded as most important. The influence of relational grammar during the 1980s helped expand considerably the number of languages studied by generative grammarians, and so did the growth of the community of generative grammarians in European countries after 1975. Second, the theoretical study of how natural languages might be learned by infants—particularly how SYNTAX is learned—has been growing in prominence and deserves some further discussion here.

Transformationalists argue that what makes languages learnable is that grammars differ only in a finite number of parameter settings triggered by crucial pieces of evidence. For example, identifying a clause with verb following object might trigger the "head-final" value for the head-position parameter, as in Japanese (where verbs come at the ends of clauses), rather than the "head-initial" value that characterizes English. (In some recent transformationalist work having the head-final parameter actually corresponds to having a required leftward movement of postverbal constituents into preverbal positions, but the point is that it is conjectured that only finitely many distinct alternatives are made available by universal grammar.) Theories of learning based on this idea face significant computational problems, because even quite simple parameter systems can define local blind alleys from which a learning algorithm cannot escape regardless of further input data (see Gibson and Wexler 1995).

The general goal is seen as that of overcoming the problem of the POVERTY OF THE STIMULUS: the putative lack of an adequate evidential basis to support induction of a grammar via general knowledge acquisition procedures (see LEARNING SYSTEMS and ACQUISITION, FORMAL THEORIES OF).

Nontransformationalists are somewhat more receptive than transformationalists to the view that the infant's experience might not be all that poverty-stricken: once one takes account of the vast amount of statistical information contained in the corpus of observed utterances (see STATISTICAL TECHNIQUES IN NATURAL LANGUAGE PROCESSING), it can be seen that the child's input might be rich enough to account for language acquisition through processes of gradual generalization and statistical approximation. This idea, familiar from pre-1957 structuralism, is reemerging in a form that is cognizant of the past four decades of generative grammatical research without being wholly antagonistic in spirit to such trends as CONNECTIONIST APPROACHES TO LANGUAGE. The research on OPTIMALITY THEORY that has emerged under the influence of connectionism meshes well both with constraint-based approaches to grammar and with work on how exposure to data can facilitate identification of constraint systems.

There is no reason to see such developments as being at odds with the original motivating idea of generative grammar, inasmuch as rigorous and exact description of human languages and what they have in common—that is, a thorough understanding of what is acquired—is surely a prerequisite to any kind of language acquisition research, whatever its conclusions.

See also COGNITIVE LINGUISTICS; LANGUAGE ACQUISITION; LINGUISTIC UNIVERSALS AND UNIVERSAL GRAMMAR; MINIMALISM; PARAMETER-SETTING APPROACHES TO ACQUISITION, CREOLIZATION, AND DIACHRONY

—Geoffrey K. Pullum

References

Bresnan, J., and R. Kaplan. (1982). The Mental Representation of Grammatical Relations. Cambridge, MA: MIT Press.
Chomsky, N. (1957). Syntactic Structures. The Hague: Mouton.
Chomsky, N. (1965). Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
Chomsky, N. (1994). The Minimalist Program. Cambridge, MA: MIT Press.
Gazdar, G., E. Klein, G. K. Pullum, and I. A. Sag. (1985). Generalized Phrase Structure Grammar. Cambridge, MA: Harvard University Press.
Gibson, T., and K. Wexler. (1995). Triggers. Linguistic Inquiry 25: 407–454.
Harris, R. A. (1993). The Linguistic Wars. New York: Oxford University Press.
Johnson, D. E., and P. M. Postal. (1980). Arc Pair Grammar. Princeton: Princeton University Press.
Joshi, A., L. S. Levy, and M. Takahashi. (1975). Tree adjunct grammars. Journal of Computing and System Sciences 19: 136–163.
Montague, R. (1993). Formal Philosophy. New Haven: Yale University Press.
Perlmutter, D. M., and P. M. Postal. (1983). Studies in Relational Grammar 1. Chicago: University of Chicago Press.
Pollard, C., and I. A. Sag. (1994). Head-driven Phrase Structure Grammar. Chicago: University of Chicago Press.
Putnam, H. (1961). Some issues in the theory of grammar. Proceedings of the Twelfth Symposium in Applied Mathematics. Providence: American Mathematical Society. Reprinted in Philosophical Papers, vol. 2: Mind, Language and Reality. New York: Cambridge University Press, 1975, pp. 85–106; and in G. Harman, Ed., On Noam Chomsky. New York: Anchor, 1974, pp. 80–103.

Genetic Algorithms

See EVOLUTIONARY COMPUTATION

Geschwind, Norman

Norman Geschwind (1926–1984) was an eminent American neurologist whose major contribution was to help revive the CORTICAL LOCALIZATION-based anatomophysiological analysis of human behavior and behavioral disorders typical of the approach of the last decades of the nineteenth century. In this way, in the early 1960s and almost single-handedly, he brought the study of behavior back into the framework of neurology and away from purely behavioral explanations characteristic of most of the first half of the twentieth century. Thus, he helped to pave the way to what is now the domain of cognitive neuroscience. His research interests included the study of brain connections as a way of explaining the neural basis and disorders of language, knowledge, and action (Geschwind 1965a, 1965b), the study of HEMISPHERIC SPECIALIZATION and its biological underpinnings (Geschwind and Levitsky 1968; Geschwind and Galaburda 1984), and the study of developmental learning disorders such as DYSLEXIA (Geschwind and Galaburda 1987).

Geschwind was born in New York City on January 8, 1926 (for a more extensive biographical sketch, see Galaburda 1985 and Damasio and Galaburda 1985). His parents had emigrated from Poland at the turn of the century. Geschwind graduated from Boys' High School in Brooklyn, New York, in 1942, and attended Harvard College on a Pulitzer Scholarship from 1942 until 1944, when his studies were interrupted by service in the United States Army in the last years of World War II. After the war, Geschwind finished his undergraduate studies and then attended Harvard Medical School. After graduation in 1951 he carried out an internship at Boston's Beth Israel Hospital (to which he would return at the end of his life as chair of neurology). Afterward Geschwind traveled to England to study muscle physiology with neurologist Ian Simpson at the National Hospital in Queen Square. He returned from London to continue his neurological training under Derek Denny-Brown at the Boston City Hospital. In 1958, Geschwind joined Fred Quadfasel at Boston's Veterans Administration Hospital, where his education and work on the neurology of behavior began. When Quadfasel retired in 1963, Geschwind replaced him as chief of service and remained at that post until 1969. That year Geschwind returned to Harvard as the James Jackson Putnam Professor of Neurology and chief of the neurological unit of Boston City Hospital. He continued to be involved in the APHASIA Research Center, which he had founded while at the Veterans Hospital.

Initially Geschwind was influenced in his thinking about behavior by the holistic views of Hughlings Jackson, Kurt Goldstein, Henry Head, and Carl LASHLEY (Geschwind 1964). In the early 1960s, however, he was seduced by the style of explanation of neurologists BROCA, Wernicke, Bastian, Dejerine, and Charcot, and others, which relied heavily on anatomical relationships among areas of the brain by way of neural connections. From that time on Geschwind became the clearest and most forceful and incisive champion of this localizationist approach to the understanding of behavior and behavioral disorders. Geschwind's analysis of the case of a patient with a brain tumor who could write correct language with his right hand but not with his left showed the power of this approach and launched Geschwind's career as a behavioral neurologist (Geschwind and Kaplan 1962). The explanation was damage to the large bundle of neural connections linking the two hemispheres of the brain, the corpus callosum, whose importance was also being recognized by the neurobiologist Roger SPERRY through his work with monkeys. Additional reports and an impressive review of the world's literature led to Geschwind's famous two-part paper in the journal Brain, 'Disconnexion syndromes in animals and man' (1965a and b). The clarity of exposition and conviction about the power of the anatomical method produced a strong following and established Geschwind as the leading figure in American behavioral neurology, which he remained until his death.

Irked by a statement by the anatomist Gerhard von Bonin that there were no anatomical asymmetries in the human brain to account for the striking hemispheric specialization the brain exhibits, Geschwind undertook his own literature review and laboratory studies and published an important paper, together with Walter Levitsky, which disclosed striking asymmetries in a region of the temporal lobe, in an area important to language, called the planum temporale (Geschwind and Levitsky 1968). Several others confirmed these findings, and the paper stimulated a great deal of additional research on brain asymmetries, some of which are still actively studied using anatomical brain imaging techniques such as computer-assisted tomography (CAT scans) and MAGNETIC RESONANCE IMAGING (MRI scans) in living subjects, as well as functional MRI and POSITRON EMISSION TOMOGRAPHY (PET).

During the last few years of his life Geschwind became interested in developmental learning disabilities. In Geschwind's mind, strict localization theory began to give way to localization with NEURAL PLASTICITY resulting from early extrinsic influences on brain development occurring in utero or soon after birth. These effects were capable of changing standard patterns of brain asymmetry and could lead to developmental learning disorders. His keen clinical acumen led to his noticing that mothers of dyslexic children often reported left-handedness, atopic illnesses such as asthma, and autoimmune diseases such as hypothyroidism. In an epidemiological study carried out with Peter Behan in London, they showed an association between stuttering, dyslexia, colitis, thyroid disease, and myasthenia gravis in left-handers (Geschwind and Behan 1983). This work engendered massive additional research and debate and may constitute Geschwind's most creative contribution to new knowledge. Since his death a slightly larger number of reports have found support for this last of Geschwind's insights than the number finding no support.

Norman Geschwind wrote more than 160 journal articles and books, and his name became a household word among not only neurologists but also psychologists, philosophers of mind, educators, and neurobiologists. He was recognized by many prizes, honorary degrees, and visiting professorships, which led him to travel widely. He spoke several languages and possessed a strong memory, a sharp logical mind, and a broad culture, which turned him into a powerful adversary in debate and discussion. He left a legacy of knowledge and ideas, as well as a long list of students and followers, many of whom became leaders in behavioral neurology in the United States and abroad.

See also HEBB, DONALD O.; LANGUAGE IMPAIRMENT, DEVELOPMENTAL; LANGUAGE, NEURAL BASIS OF; LURIA, ALEXANDER ROMANOVICH

—Albert M. Galaburda

References

Damasio, A. R., and A. M. Galaburda. (1985). Norman Geschwind. Archives of Neurology 42: 500–504.
Galaburda, A. M. (1985). Norman Geschwind. Neuropsychologia 23: 297–304.
Geschwind, N. (1964). The paradoxical position of Kurt Goldstein in the history of aphasia. Cortex 1: 214–224.
Geschwind, N. (1965a). Disconnexion syndromes in animals and man. Brain 88: 237–294.
Geschwind, N. (1965b). Disconnexion syndromes in animals and man. Brain 88: 585–644.
Geschwind, N., and P. Behan. (1983). Left-handedness: association with immune disease, migraine, and developmental learning disorder. Proceedings of the National Academy of Sciences (U.S.A.) 79: 5097–5100.
Geschwind, N., and A. M. Galaburda, Eds. (1984). Cerebral Dominance: The Biological Foundations. Cambridge, MA: Harvard University Press.
Geschwind, N., and A. M. Galaburda. (1987). Cerebral Lateralization: Biological Mechanisms, Associations, and Pathology. Cambridge, MA: MIT Press.
Geschwind, N., and E. Kaplan. (1962). A human reconnection syndrome: a preliminary report. Neurology 12: 675–685.
Geschwind, N., and W. Levitsky. (1968). Human brain: left-right asymmetries in temporal speech region. Science 161: 186–187.

Gestalt Perception

Gestalt perception is the name given to various perceptual phenomena and theoretical principles associated with the school of GESTALT PSYCHOLOGY (Koffka 1935). Its most important contributions concerned perceptual organization: the nature of relations among parts and wholes and how they are determined. Previously, perceptual theory was dominated by the structuralist proposal that complex perceptions were constructed from atoms of elementary color sensations and unified by associations due to spatial and temporal contiguity. Gestalt theorists rejected both assumptions, arguing that perception was holistic and organized due to interactions between stimulus structure and underlying brain processes.

Wertheimer (1923) posed the problem of perceptual organization in terms of how people manage to perceive organized scenes consisting of surfaces, parts, and whole objects coherently arranged in space rather than the chaotic, dynamic juxtaposition of millions of different colors registered by retinal receptors. He attempted to answer this question by identifying stimulus factors that caused simple arrays of elements to be perceived as organized in distinct groups. The factors he identified are usually called the laws (or principles) of grouping, several of which are illustrated in figures A–F: proximity (A), similarity of color (B), similarity of size (C), common fate (D), good continuation (E), and closure (F). More recently other principles have been identified (Palmer and Rock 1994), including common region (G) and element connectedness (H). In each case, elements that have a stronger relation in terms of the specified property (i.e., those that are closer, more similarly colored, etc.) tend to be grouped together. These "laws" are actually ceteris paribus rules: all else being equal, the elements most closely related by the specified factor will be grouped together. They cannot predict the result when two or more factors vary in opposition because the rules fail to specify how multiple factors are integrated. No general theory has yet been formulated that overcomes this problem.

A second important phenomenon of Gestalt perception is figure-ground organization (Rubin 1921). In figure I, for instance, one can perceive either a white object on a black background or a black object on a white background. The crucial feature of figure-ground organization is that the boundary is perceived as belonging to the figural region. As a result, it seems "thing-like," has definite shape, and appears closer, whereas the ground appears farther and to extend behind the figure. Gestalt psychologists identified several factors that govern figure-ground organization, including surroundedness (J), size (K), contrast (L), convexity (M), and symmetry (N). These principles are also ceteris paribus rules: all else being equal, the surrounded, smaller, higher contrast, more convex, or symmetrical region tends to be seen as the figure. They therefore suffer from the same problem as the laws of grouping: they cannot predict the result when two or more factors conflict.

Figure 1.

A third important phenomenon of Gestalt perception is that certain properties of objects are perceived relative to a frame of reference. MOTION and orientation provide two compelling examples. In induced motion (O), a slowly moving larger object surrounds a smaller stationary object in an otherwise dark environment. Surprisingly, observers perceive the frame as still and the dot as moving in the opposite direction (Duncker 1929)—for example, when the moon appears to move through a cloud that appears stationary. In the rod-and-frame effect (P), observers perceive an upright rod as tilted when it is presented inside a large tilted rectangle in an otherwise darkened environment (Asch and Witkin 1948), much as one perceives a vertical chandelier as hanging askew inside a tilted room in a fun house. Larger, surrounding objects or surfaces thus tend to be taken as the frame of reference for the smaller objects they enclose.

Several other organizational phenomena are strongly identified with the Gestalt approach to perception. Amodal completion refers to the perception of partly visible figures as completed behind an occluding object. Figure Q, for example, is invariably perceived as a complete circle behind a square even though only three-fourths of it is actually showing. Illusory contours refer to the perception of a figure defined by edges that are not physically present in the image. As figure R illustrates, an illusory figure is perceived when aligned contours in the inducing elements cause them to be seen as partly occluded by a figure that has the same color as the background (Kanizsa 1979). Color scission (figure S) refers to the splitting of perceived color into one component due to an opaque figure and another component due to a translucent figure through which the farther figure is seen (Metelli 1974).

Although these examples do not exhaust the perceptual contributions of Gestalt psychologists and their followers, they are representative of the phenomena they studied in the visual domain. They also investigated the perceptual organization of sounds, a topic that has been extended significantly by modern researchers (Bregman 1990). Recent studies of PERCEPTUAL DEVELOPMENT demonstrate that, contrary to the nativistic beliefs of Gestalt theorists, most principles of organization are not present at birth, but develop at different times during the first year of life (Kellman and Spelke 1983).

Theoretically, Gestalt psychologists maintained that these phenomena of perceptual organization support holism, the doctrine that the whole is different from the sum of its parts. They attempted to explain such holistic effects in terms of their principle of Prägnanz (or minimum principle), the claim that the percept will be as "good" as the stimulus conditions allow. This means that the preferred organization should be the simplest, most regular possibility compatible with the constraints imposed by the retinal image. Unfortunately, they did not provide an adequate definition of goodness, simplicity, or regularity, so their central claim was untestable in any rigorous way. Later theorists have attempted to ground these concepts in objective analyses, suggesting that simple perceptions correspond to low information content, economy of symbolic representation, and/or minimal transformational distance. Non-Gestalt theorists typically appeal instead to HELMHOLTZ's likelihood principle that the perceptual system is biased toward the most likely (rather than the simplest) interpretation. The difficulty in discriminating between these two alternatives arises in part from the fact that the most likely interpretation is usually the simplest in some plausible sense. (See Pomerantz and Kubovy 1986 for a review of this issue.)

Most phenomena of Gestalt perception have resisted explanation at computational, algorithmic, and physiological levels. Köhler (1940) suggested that the best organization was achieved by electromagnetic fields in the CEREBRAL CORTEX that settled into states of minimum energy, much as soap bubbles stabilize into perfect spheres, which are minimal in both energy and complexity. Although subsequent physiological findings have discredited the brain-field conjecture, the more abstract idea of a physical Gestalt is compatible with modern investigations of RECURRENT NETWORKS (e.g., Grossberg and Mingolla 1985) that converge on a solution by reaching a minimum in an energy-like function. This suggests new theoretical approaches to the many important organizational phenomena Gestalt psychologists discovered more than a half century ago.

See also HIGH-LEVEL VISION; ILLUSIONS; MID-LEVEL VISION; NATIVISM; OBJECT RECOGNITION, ANIMAL STUDIES; OBJECT RECOGNITION, HUMAN; PICTORIAL ART AND VISION; VISION AND LEARNING

—Stephen Palmer

References

Asch, S. E., and H. A. Witkin. (1948). Studies in space orientation: I and II. Journal of Experimental Psychology 38: 325–337, 455–477.
Bregman, A. S. (1990). Auditory Scene Analysis: The Perceptual Organization of Sound. Cambridge, MA: MIT Press.
Duncker, K. (1929). Über induzierte Bewegung. Psychologische Forschung 12: 180–259. Condensed translation published as Induced motion, in W. D. Ellis (1938), A Sourcebook of Gestalt Psychology. New York: Harcourt, Brace, pp. 161–172.
Grossberg, S., and E. Mingolla. (1985). Neural dynamics of form perception: boundary completion, illusory contours, and neon color spreading. Psychological Review 92: 173–211.
Kanizsa, G. (1979). Organization in Vision. New York: Praeger.
Kellman, P. J., and E. S. Spelke. (1983). Perception of partly occluded objects in infancy. Cognitive Psychology 15: 483–524.
Koffka, K. (1935). Principles of Gestalt Psychology. New York: Harcourt Brace.
Köhler, W. (1940). Dynamics in Psychology. New York: Liveright Publishing Corp.
Metelli, F. (1974). The perception of transparency. Scientific American 230(4): 90–98.
Pomerantz, J. R., and M. Kubovy. (1986).

Max Wertheimer, invested much of his energies in other topics, such as epistemology (Wertheimer 1934), ethics (1935), problem solving (1920), and creativity (1959/1978; see also Duncker 1945/1972; Luchins 1942). Koffka, one of Wertheimer's foremost collaborators, devoted more than half of his Principles of Gestalt Psychology (1935) to attitudes, emotion, the will, memory (see also Wulf 1921; Restorff 1933), learning, and the relations between society and personality (see also Lewin 1935; Dembo 1931/1976). Köhler, the third member of the leadership of the Gestalt school, did research on the insightful problem-solving of apes (1921/1976) and wrote about ethics (claiming that value is an emergent property of situations) and requiredness (1938), which anticipates Gibson's notion of AFFORDANCES.

After the dismantling of German psychology—beginning with the coming to power of Hitler in 1933—American psychology, dominated by doctrinaire BEHAVIORISM, was disdainful of cognitive ideas and saw Gestalt psychology as outmoded and suspiciously vitalistic. These suspicions were reinforced by the views of some Gestalt psychologists about the way the brain creates Gestalts, views that seemed to have been spectacularly refuted (for a summary, see Pomerantz and Kubovy 1981). It was perhaps for this reason that during this period, the influence of Gestalt psychologists—many of whom were refugees from Nazi Germany—on American psychology was felt mostly in social psychology (e.g., Lewin 1951; Heider 1958/1982; Krech and Crutchfield 1948) and psychology of art (e.g., Arnheim 1974, 1969), which were not under the control of behaviorists and were not concerned with brain theory (Zajonc 1980).

During the eclipse of Gestalt psychology as cognitive psychology, some of the questions posed by the Gestalt psychologists were kept alive by Attneave (1959), Garner (1974), Goldmeier (1973), and Rock (1973). A more general revival of these questions took place in the early 1980s with the publication of three edited books: Kubovy and Pomerantz (1981), Beck (1982), and Dodwell and Caelli (1984).

Gestalt psychology can be characterized by four main features: (1) its method: phenomenology; (2) its attitude towards REDUCTIONISM: brain-experience isomorphism; (3)
Theoretical approaches its focus of investigation: part-whole relationships; (4) its to perceptual organization. In K. R. Boff, L. Kaufman, and J. P. theoretical principle: Prägnanz (Pomerantz and Kubovy Thomas, Eds., Handbook of Perception and Human Perfor- mance. Vol. 2, Cognitive Processes and Performance. New 1981; Epstein and Hatfield 1994). Let us consider these fea- York: Wiley, pp. 36–1 to 36–46. tures one by one. Palmer, S. E., and I. Rock. (1994). Rethinking perceptual organiza- Phenomenology. The application of phenomenology to tion: the role of uniform connectedness. Psychonomic Bulletin perception involves a descriptive text laced with pictures and Review 1: 29–55. (exemplified by GESTALT PERCEPTION, and Ihde 1977). Rubin, E. (1921). Visuell Wahrgenommene Figuren. Copenhagen: According to Bozzi (1989: chap. 7), a conventional psycho- Glydendalske. logical experiment (in any field of psychology) differs from Wertheimer, M. (1923). Untersuchungen zur Lehre von der a phenomenological experiment in the way summarized in Gestalt, II. Psychologische Forschung 4: 301–350. Condensed Table 1. This comparison shows how close phenomenologi- translation published as Laws of organization in perceptual cal research is to protocol analysis, the use of verbal reports forms, in W. D. Ellis (1938), A Sourcebook of Gestalt Psychol- ogy. New York: Harcourt, Brace, pp. 71–88. as data, often used in research on problem-solving (Simon and Kaplan 1989: 21–29). In recent years some psycholo- gists have confirmed and elaborated results obtained with the Gestalt Psychology phenomenological method by using more conventional experimental methodology (Kubovy, Holcombe, and Wager- The scope of Gestalt psychology goes beyond its origins in mans forthcoming). Phenomenology as practiced by the research on perception. 
The founder of the Gestalt school, Gestalt psychologists should not be confused with introspec- Gestalt Psychology 347 a seminal essay “On ‘Gestalt Qualities’” (1988) written Table 1. Comparison of conventional and phenomenological experiments (after Bozzi 1989: chap. 7) twenty-two years before Wertheimer’s (1912a) first discus- sion of the subject. He is the one who first asked, “Is a mel- Conventional Phenomenological ody (i) a mere Zusammenfassung of elements, or (ii) environment isolated (such as a any (preferably not a something novel in relation to this Zusammenfassung, some- laboratory) laboratory) thing that. . . is distinguishable from the Zusammenfassung participants kept naive about the are told everything of the elements?” (p. 82). Although Zusammenfassung has topic or purpose of the usually been translated as “sum,” this translation may have research (to minimize led to confusion because the notion of Zusammenfassung is demand characteristics) more vague than the word “sum” suggests. The word means task well-defined jointly defined by “combination,” “summing-up,” “summary,” “synopsis,” that participant and is to say, “sum” as in the expression “in sum,” rather than as researcher an arithmetic operation (Grelling and Oppenheim 1939/ participants’ • often the first that • may transcend 1988: 198, propose the translation “totality”). response comes to mind their first impres- sion, and thus pro- Consider the tune Row Your Boat. The melody is different vide information from the following Zusammenfassung: “10 Cs, 3 Ds, 7 Es, 2 about their solution Fs, and 5 Gs,” because the duration of the notes is important. space The melody is also different from a more detailed Zusam- • may not be modified • may be reconsid- menfassung (if the time signature is 6 ): “1 × < C,6>, 2 × 8 ered , 1 × , 6 × , 3 × , . . 
.” where 2 × • are either correct or • all answers are , means two tokens of C whose duration is equivalent incorrect valid to three eighth notes, because the order of the notes is impor- • unambiguous, or are • responses are tant. But even the score—which specifies the notes, their filtered into a set of classified only after duration, and their order—does not capture the melody. The mutually exclusive and all the data have collectively exhaustive a been examined tune could be transposed from the key of C to the key of F#, priori categories and played at a faster tempo so that it would share no pitches and no absolute durations with the original version. It would still be the same tune. What is preserved are the ratios of fre- tionism as practiced by early psychologists such as Titchener quencies (musical intervals) and the ratios of durations (cf. INTROSPECTION). In fact, in its methodological assump- (rhythm). Melody is a property of the whole that depends on tions, introspectionism is closer to contemporary experimen- relationships among the elements, not on the elements them- tal psychology than to phenomenology. Any account of the selves. According to von Ehrenfels, the Gestalt is a quality introspectionists’ methods (Lyons 1986) will confirm that one perceives in addition to perceiving the individual ele- they are well described by Bozzi’s six features of conven- ments. It was Wertheimer who reformulated this idea of the tional psychological experiments (Table 1). Gestalt as a whole embracing perceived elements as parts. Brain Theory. Despite the refutation of the specifics of Prägnanz. The notion of Prägnanz (introduced by Wer- the Gestalt psychologists’ brain theory, their search for spe- theimer 1912b) is of great importance to the understand- cific correspondences between experiences and brain events ing of cognition. 
Textbooks define Prägnanz as the and their idea of brain fields anticipated important current tendency of a process to realize the most regular, ordered, foci of research (e.g., NEURAL NETWORKS and massively stable, balanced state possible in a given situation. This parallel processing). Köhler (1947: 132–133) predicted that notion is illustrated by the behavior of soap films, which future psychological research would study “dynamic self- are laminae of minimal potential energy. Unconstrained, a distribution. . . which Gestalt Psychology believes to be soap film becomes a spherical bubble, but when con- essential in neurological and psychological theory,” whose strained by a wire, the film takes on a graceful and seem- “final result always constitutes an orderly distribution,” that ingly complex shape (Hildebrandt and Tromba 1985: is, a “balance of forces” (p. 130; this is a statement of the chap. 5). But the standard definition of Prägnanz ignores principle of Prägnanz, discussed below). It is impossible to an important difference between physical and cognitive read this prophecy without thinking of Hopfield networks systems. Physical systems are exquisitely sensitive to the and Boltzman machines, that can be thought of as minimiz- exact form of the constraints: small changes in the con- ing a system energy function (Anderson 1995; Kelso 1995). straints can make a big difference to the shape of the soap It should be noted that for all their emphasis on brain the- film. In contrast, cognitive systems are relatively insensi- ory, the Gestalt psychologists did not think of themselves as tive to the details of the input, because they decompose it nativists (Köhler 1947: 113, 117, 215). They were neverthe- into a schema that has Prägnanz, to which a correction is less vigorously opposed to the behaviorists’ empirism (a added to characterize the input (Woodworth 1938: chap. psychological metatheory that gives primacy to learning 4). 
Cognitive systems have several ways to extract Präg- theories; to be carefully distinguished from the epistemo- nanz from inputs (Rausch 1966—summarized by Smith logical theory of empiricism). 1988—proposed seven of them). Here are five of them Part-whole relationships. The problem of part-whole rela- (the first three are exemplified in figure 1): (1) Lawful- tionships, which is currently under vigorous investigation in ness: extract the part of an event or an object that con- perception, was first discussed by Christian von Ehrenfels, in forms to a law or a rule. (2) Originality: extract the part of 348 Gestalt Psychology Figure 1. Three ways to apply Prägnanz to the perception of shape. For each of these ways, the smaller the “correction” to the “schema,” the greater the Prägnanz of the shape. Originality: A shape can be seen as a transformed prototype: a parallelogram can be seen as a skewed rectangle. Integrity: A shape can be seen as an intact shape (“rectangle”) modified by a feature or a flaw: “rectangle with a bump,” “rectangle with one side missing.” Lawfulness: A hard-to-name irregular shape can be said to be roughly rectangular in form. Attneave, F. (1959). Applications of Information Theory to Psy- chology. New York: Holt, Rinehart and Winston. Beck, J., Ed. (1982). Organization and Representation in Percep- tion. Hillsdale, NJ: Erlbaum. Birenbaum, G. (1930). Das Vergessen einer Vornahmen. Isolierte seelische Systeme und dynamische Gesamtbereiche. [Forget- ting a person’s name. Isolated mental systems and dynamical global fields.] Psychologische Forschung 13: 218–284. Bozzi, P. (1989). Fenomenologia Sperimentale. [Experimental Phenomenology]. Bologna, Italy: Il Mulino. Dembo, T. (1976). The dynamics of anger. In J. De Rivera, Ed., Field Theory as Human-Science: Contributions of Lewin’s Ber- lin Group. New York: Gardner Press. pp. 324–422. Original work published 1931. Dodwell, P. C., and T. Caelli, Eds. (1984). Figural Synthesis. 
Hills- dale, NJ: Erlbaum. Duncker, K. (1972). On Problem-Solving. Trans. L. S. Lees. West- Figure 2. A puzzle that is hard to solve because of Prägnanz. port, CT: Greenwood Press. Original work published 1945. Ehrenfels, C. von. (1988). On “Gestalt qualities”. In B. Smith, Ed., an event or an object that is a prototype (in the sense cur- Foundations of Gestalt Theory. Munich: Philosophia Verlag, rently used in theories of the structure of CONCEPTS) with pp. 82–117. Original work published 1890. respect to other events or objects. (3) Integrity: extract the Ellis, W. D., Ed. (1938). A Source Book of Gestalt Psychology. London: Kegan, Paul, Trench, Trubner. part of an event or an object that is whole, complete or Epstein, W., and G. Hatfield. (1994). Gestalt psychology and the intact, rather than partial, incomplete or flawed. (4) Sim- philosophy of mind. Philosophical Psychology 7: 163–181. plicity: extract that part of an event or an object that is Garner, W. R. (1974). The Processing of Information and Struc- simple or “good.” (5) Diversity: extract that part of an ture. Potomac, MD: Erlbaum. event or an object that is “pregnant,” that is, rich, fruitful, Goldmeier, E. (1973). Similarity in visually perceived form. Psy- significant, weighty. chological Issues 8: Whole No.1. Much of the work of Gestalt psychologists on obstacles Grelling, K., and P. Oppenheim. (1988). Logical analysis of to PROBLEM SOLVING has implicated Prägnanz. For exam- “Gestalt” as “Functional whole”. In B. Smith, Ed., Foundations ple, if the six pieces shown in figure 2 are scattered in front of Gestalt Theory. Munich: Philosophia Verlag, pp. 210–226. of you, and you are asked to make a square out of them, you Original work published 1939. Heider, F. (1982). The Psychology of Interpersonal Relations. are likely to start by forming a disk—a good form that Hillsdale, NJ: Erlbaum. Original work published 1958. delays the solution (Kanizsa 1979: chap. 14). Hildebrandt, S., and S. Tromba. (1985). 
Mathematics and Optimal See also EMERGENCE; GIBSON, JAMES JEROME; ILLU- Form. New York: Scientific American Books. SIONS; RECURRENT NETWORKS Ihde, D. (1977). Experimental Phenomenology: An Introduction. New York: Putnam. —Michael Kubovy Kanizsa, G. (1979). Organization in Vision. New York: Praeger. Kelso, J. A. S. (1995). Dynamic Patterns: The Self-Organization of References Brain and Behavior. Cambridge, MA: MIT Press. Köhler, W. (1938). The Place of Value in a World of Facts. New Anderson, J. A. (1995). An Introduction to Neural Networks. Cam- York: Liveright. bridge, MA: MIT Press. Köhler, W. (1947). Gestalt Psychology: An Introduction to New Arnheim, R. (1969). Visual Thinking. Berkeley: University of Cali- Concepts in Modern Psychology. Rev. ed. New York: Liveright. fornia Press. Köhler, W. (1976). The Mentality of Apes. 2nd ed. Trans. by E. Arnheim, R. (1974). Art and Visual Perception: A Psychology of Winter. New York: Liveright. Original work published 1921. the Creative Eye. Berkeley: University of California Press. Gibson, James Jerome 349 Koffka, K. (1963). Principles of Gestalt psychology. New York: Wertheimer, M. (1978). Productive Thinking. Enlarged ed. West- Harcourt, Brace and World. Original work published 1935. port, CT: Greenwood Press. Original work published 1959. Krech, D., and R Crutchfield. (1948). Theory and Problems in Woodworth, R. S. (1938). Experimental Psychology. New York: Social Psychology. New York: McGraw-Hill. Henry Holt. Kubovy, M., and J. R. Pomerantz, Eds. (1981). Perceptual Organi- Wulf, F. (1921). Über die Veränderung von Vorstellungen zation. Hillsdale, NJ: Erlbaum. (Gedächtnis und Gestalt). [On the modification of representa- Kubovy, M., A. O. Holcombe, and J. Wagemans. (1998). On the tions (memory and Gestalt)]. Psychologische Forschung 1: lawfulness of grouping by proximity. Cognitive Psychology 35: 333–373. (Excerpts in Ellis 1938: 136–148). 71–98. Zajonc, R. B. (1980). Cognition and social cognition: a Lewin, K. (1935). 
A Dynamic Theory of Personality. New York: historical perspective. In L. Festinger, Ed., Retrospections McGraw-Hill. on Social Psychology. New York: Oxford University Press, pp. Lewin, K. (1951). Field Theory in Social Science. New York: 180–204. Harper. Luchins, A. S. (1942). Mechanization in problem solving: the Gibson, James Jerome effect of “Einstellung.” Psychological Monographs 54(6): whole no. 248. Lyons, W. (1986). The Disappearance of Introspection. Cam- In his last book, The Ecological Approach to Visual Percep- bridge, MA: MIT Press. tion, James Gibson (1904–1979) concluded with a plea that Metzger, W. (1962). Schöpferische Freiheit. [Creative Freedom]. the terms and concepts of his theory “...never shackle thought Frankfurt-am-Main: Kramer. as the old terms and concepts have!” He was referring to the Pomerantz, J. R., and M. Kubovy. (1981). Perceptual organization: framework of traditional perception, as was reflected, for an overview. In M. Kubovy and J. R. Pomerantz, Eds., Percep- tual Organization. Hillsdale, NJ: Erlbaum, pp. 423–456. example, in the classical problem of space perception Bishop Pomerantz, J. R., and M. Kubovy. (1986). Theoretical approaches Berkeley posed more than three hundred years ago (Berkeley to perceptual organization: Simplicity and likelihood princi- 1963). How is it possible to perceive three-dimensional space ples. In K. R. Boff, L. Kaufman, and J. P. Thomas, Eds., when the input to our senses is a two-dimensional retinal sur- Handbook of Perception and Human Performance. Vol. 2: face in the case of vision, or a skin surface in the case of Cognitive Processes and Performance. New York: Wiley, pp. touch? Logically, it seemed this inadequate stimulation had 36-1–36-46. to be supplemented somehow to account for our ordinary Rausch, E. (1966). Das Eigenschaftsproblem in der Gestalttheorie perception of a three-dimensional world. There have been der Wahrnehmung. 
[The problem of properties in the Gestalt two general proposals for the nature of this supplementation. theory of perception]. In W. Metzger and H. Erke, Eds., Hand- An empiricist proposal, advocated by Berkeley himself, buch der Psychologie [Handbook of Psychology], vol. 1: Wahr- nehmung und Bewusstsein [Perception and Consciousness]). based the supplementation in the prior experience of the indi- Göttingen: Hogrefe, pp. 866–953. vidual. The alternative nativist proposal based the supple- von Restorff, H. (1933). Analyse von Vorgängen in Spurenfeld. I. mentation in the innate functioning of the mental apparatus Über die Wirkung von Bereichtsbildungen im Spurenfeld. which intrinsically imposes a three-dimensional structure on [Analysis of processes in the memory trace. I. On the effect of two-dimensional stimulation. These two alternatives in only region-formation on the memory trace]. Psychologische Fors- slightly modified forms persist to this day. chung 18: 299–342. Gibson challenged Berkeley’s initial assumption, assert- Rock, I. (1973). Orientation and Form. New York: Academic ing that there is indeed sufficient information available to Press. observers for perceiving a three-dimensional world. It does Rosch, E., and C. B. Mervis. (1975). Family resemblances: studies not have to be supplemented from our past experience or in the internal structure of categories. Cognitive Psychology 7: 573–605. from our innate mental operations. Gibson’s refutation of Shipley, T., Ed. (1961). Classics in Psychology. New York: Philo- the traditional formulation depended on confirming the sophical Library. hypothesis that information is sufficient to account for what Simon, H. A., and C. A Kaplan. (1989). Foundations of cognitive we perceive. He argued that the traditional physical analysis science. In M. I. Posner, Ed., Foundations of Cognitive Science. of energy available to our senses (rays of light and sound Cambridge, MA: MIT Press, pp. 1–47. 
waves) is the wrong level of analysis for perceiving organ- Smith, B. (1988). Gestalt theory: an essay in philosophy. In B. isms with mobile eyes in mobile heads who look and walk Smith, Ed., Foundations of Gestalt Theory. Munich: around. Rather, light in ambient arrays (as opposed to radi- Philosophia Verlag, pp. 11–81. ant light) is structured by, and fully specifies, its sources in Wertheimer, M. (1912a). Über das Denken der Naturvölker. I. the objects and events of the world we perceive. He showed Zahlen und Zahlgebilde. [On the thought-processes in preliter- ate groups. I. Numbers and numerical concepts]. Zeitschrift für that if the entire structure of the optic array at any point in Psychologie 60: 321–378. (Excerpts in Ellis 1938: 265–273.) space were examined, rather than punctate stimuli imping- Wertheimer, M. (1912b). Experimentelle Studien über das Sehen ing on the retina, the information available is exceedingly von Bewegung. [Experiments on the perception of motion]. rich. Moreover it specifies important features of the environ- Zeitschrift für Psychologie 61: 161–265. (Excerpts in Shipley ment. Thus textured optic arrays specify surfaces, gradients 1961: 1032–1089.) of texture specify slanted or receding surfaces, changing Wertheimer, M. (1920). Über Schlussprozesse im produktiven Den- patterns in the structure are specific to particular types of ken. Berlin: de Gruyter. object and observer movement, and so on. Wertheimer, M. (1934). On truth. Social Research 1: 135–146. Two implications of Gibson’s reformulation need to be Wertheimer, M. (1935). Some problems in the theory of ethics. emphasized. First, patterns of stimulation change when an Social Research 2: 353–367. 350 Gibson, James Jerome observing organism is active. The very act of moving makes different from the traditional ones; they are not association information available. 
Gibson showed that the transforma- or computation but exploration, detection of invariant rela- tions in the optic array sampled by a moving observer tions, and perceptual learning. simultaneously specify the path of locomotion (perspective Many of Gibson’s empirical discoveries were incorpo- structure) and the stable environment (invariant structure). rated into mainstream theories of perception during his life- The traditional formulation of perception involves a passive time. From early in his career adaptation to prolonged observer with stimulation imposed by the natural physical inspection of curved and tilted lines (e.g., Gibson 1937) world or by a psychological experimenter; either observer or became a prototype of subsequent research concerned with object movement is a complication. Gibson emphasized the perceptual and perceptual-motor adaptation to visual-motor active nature of perceiving, and the idea that movement is rearrangements. TEXTURE gradients have long been essential. Second, in Gibson’s formulation perception is of accepted as one of the “cues” of DEPTH PERCEPTION. The properties that are relevant to an organism’s being in contact investigation of the use of motion transformations for guid- with its environment: things like surfaces and changes in ing locomotion and the analysis of active perception in gen- surface layout, places that enclose, paths that are open for eral, particularly in computer science, are very active mobility, objects approaching or receding, and so on. In the research areas. Gibson’s theoretical influence has been traditional Berkeley perspective perception is of abstract extended by many of his former colleagues and students three-dimensional space. For Gibson abstract three- whose research is motivated by his ecological framework. dimensional space is a conceptual achievement. 
Perception Many groups and research problems illustrate this influ- is concerned with guiding behavior in a populated and clut- ence: Lee (1980), Warren, Morris, and Kalish (1988), and tered environment. others have investigated the geometric nature of motion- Gibson’s emphasis on the functional aspects of percep- generated information available for the guidance of locomo- tion had roots in his work on pilot selection and training in tion. Turvey, Shaw, and others (e.g., Turvey et al. 1981) the Army Air Force during World War II (Gibson 1947). have integrated Gibson’s ideas with those of Nicolai Bern- This functional emphasis was developed most thoroughly in shtein, the Russian action physiologist, in investigations and his last book, where he presented his ecological approach analyses of both visual and haptic perception. Edward Reed (Gibson 1979). The first section of the book included an has extended his views to what he terms ecological philoso- analysis of the physical world at a level ecologically rele- phy, described in three recent books (Reed 1996a, 1996b, vant to the activity of a perceiving organism. This provided 1997). Gibson’s closest and most influential colleague was a taxonomy of the features that are perceived and an analy- his wife, Eleanor Jack Gibson, who has elaborated a theory sis of how the physical world structures light so as to pro- of perceptual learning and development, complementary to vide the information for the meaningful properties to be his theory of perception (E. J. Gibson 1969). Most recently perceived. she and her colleagues (e.g., Adolph, Eppler, and Gibson Gibson’s ecological perspective emphasizes both the 1993; A. D. Pick 1997; Walker-Andrews 1988; and others) environment/organism mutuality of perception and its have been applying the concept of affordance in the study of intrinsic meaningfulness. 
This emphasis has the radical PERCEPTUAL DEVELOPMENT in a way that simultaneously implication of breaking down the subject/object distinction refines the concept itself. Such investigations have shown pervasive in Western philosophy and psychology as well as the promise and utility of Gibson’s radical formulation. solving the psychological riddle of how perception is mean- Such success will really be complete when it permits and ingful. The meaningfulness of perception is reflected in his encourages further development of his theoretical concepts concept of affordance which is currently the source of some without the shackling which he feared. controversy. AFFORDANCES are the properties of the envi- ronment, taken with reference to creatures living in it, that This article was to have been written by Gibson’s close make possible or inhibit various kinds of activity: surfaces friend and younger colleague, Edward Reed. His untimely of a certain height, size, and inclination afford sitting on by death in February, 1997, has deprived the field of a brilliant humans, those of a different height and size afford stepping and humane scholar. up on, objects moving at a certain speed afford catching, See also ECOLOGICAL PSYCHOLOGY; MARR, DAVID; and so forth. For Gibson, perception of these possibilities MOTION, PERCEPTION OF; RATIONALISM VS. EMPIRICISM; for action are primary, and they are specified by information STRUCTURE FROM VISUAL INFORMATION SOURCES; VISION in the optic array. AND LEARNING The concept of affordance is implicated in another con- —Herb Pick, Jr., and Anne Pick troversial concept of Gibson’s formulation, that of direct perception. Gibson argued that perception is direct in the References sense that perceiving a property, for instance an affordance, is based on detection of the information specifying that Adolph, K. E., M. A. Eppler, and E. J. Gibson. (1993). Develop- property. 
(For affordances the meaning is appreciated in the ment of perception of affordances. In C. Rovee-Collier and L. very detection of the information.) Critics of his theory P. Lipsitt, Eds., Advances in Infancy Research, vol. 8. Norwood, interpret direct perception as implying that perception is NJ: Ablex, pp. 51–98. automatic and argue that it is easy to find examples where it Berkeley, G. (1709/1963). An essay towards a new theory of is not. This is a misunderstanding. Direct perception does vision. In C. Turbayne, Ed., Berkeley: Works on Vision. India- not imply automaticity, rather the processes involved are napolis: Bobbs-Merrill. Gödel’s Theorems 351 tice in formal theories; see FORMAL SYSTEMS. One funda- Gibson, E. J. (1969). Principles of Perceptual Learning and Devel- opment. New York: Appleton-Century-Crofts. mental question was: Is there a formal theory such that Gibson, J. J. (1937). Adaptation with negative after-effect. Psycho- mathematical truth is co-extensive with provability in that logical Review 44: 222–244. theory? Russell’s type theory P of Principia Mathematica Gibson, J. J. (1947). Motion picture testing and research. Aviation and axiomatic set theory as formulated by Zermelo seemed Psychology Research Reports No. 7. Washington, DC: U.S. to make a positive answer plausible. A second question Government Printing Office. emerged from the research program that had been initiated Gibson, J. J. (1979). The Ecological Approach to Visual Percep- by Hilbert around 1920 (with roots going back to the turn of tion. Boston: Houghton Mifflin. the century): Is the consistency of mathematics in its for- Lee, D. N. (1980). The optic flow field: the foundation of vision. malized presentation provable by restricted mathematical, Philosophical Transactions of the Royal Society, London Series so-called finitist means? The incompleteness theorems gave B, 290: 169–179. Pick, A. D. (1997). 
Perceptual learning, categorizing, and cogni- negative answers to both questions for the particular theo- tive development. In C. Dent-Read and P. Zukow-Goldring, ries mentioned. To be more precise, a negative answer to the Eds., Evolving Explanations of Development. Washington, DC: second question is provided only if finitist mathematics American Psychological Association, pp. 335–370. itself can be formalized in these theories; that was not Reed, E. S. (1996a). Encountering the World. New York: Oxford claimed by Gödel in 1931, only in his (1933) did he assert it University Press. with great force. Reed, E. S. (1996b). The Necessity of Experience. New Haven: The first incompleteness theorem states (making use of Yale University Press. an improvement due to Rosser): Reed, E. S. (1997). From Soul to Mind. The Emergence of Psychol- ogy, from Erasmus Darwin to William James. New Haven: Yale If P is consistent, then there is a sentence σ in the lan- University Press. guage of P, such that neither σ nor its negation ¬ σ is Turvey, M. T., R. E. Shaw, E. S. Reed, and W. M. Mace. (1981). provable in P. Ecological laws of perceiving and acting: in reply to Fodor and Pylyshyn (1981). Cognition 9: 237–304. σ is thus independent of P. As σ is a number theoretic Walker-Andrews, A. S. (1988). Infant perception of the affor- statement it is either true or false for the natural numbers; dances of expressive behaviors. In L. P. Lipsitt and C. Rovee- Collier, Eds., Advances in Infancy, vol. 5. Norwood, NJ: Ablex, in either case, we have a statement that is true and not pp. 173–221. provable in P. This incompleteness of P cannot be reme- Warren, W. H., M. W. Morris, and M. Kalish. (1988). Perception of died by adding the true statement to P as an axiom: for the translation heading from optical flow. Journal of Experimental theory so expanded, the same incompleteness phenomenon Psychology: Human Perception and Performance 14: 646–660. arises. 
Gödel’s second theorem claims the unprovability of a (meta-)mathematically meaningful statement:

If P is consistent, then cons, the statement in the language of P that expresses the consistency of P, is not provable in P.

Some, for example Church, raised the question whether the proofs in some way depended on special features of P. In his Princeton lectures of 1934, Gödel tried to present matters in a more general way; he succeeded in addressing Church’s concerns, but continued to strive for even greater generality in the formulation of the theorems. To understand in what direction, we first review the very basic ideas underlying the proofs and then discuss why Turing’s work is essential for a general formulation.

Crucial are the effective presentation of P’s syntax and its (internal) representation. Gödel uses a presentation by primitive recursive functions, that is, the basic syntactic objects (strings of letters of P’s alphabet and strings of such strings) are “coded” as natural numbers, and the subsets corresponding to formulas and proofs are given by primitive recursive characteristic functions.

Further Readings

Gibson, J. J. (1950). The Perception of the Visual World. Boston: Houghton Mifflin.
Gibson, J. J. (1968). The Senses Considered as Perceptual Systems. London: Allen and Unwin.
Michaels, C., and C. Carello. (1981). Direct Perception. New York: Prentice-Hall.
Reed, E. S. (1988). James J. Gibson and the Psychology of Perception. New Haven: Yale University Press.
Reed, E. S., and R. K. Jones, Eds. (1982). Reasons for Realism: Selected Essays of James J. Gibson. Hillsdale, NJ: Erlbaum.

Gödel’s Theorems

Kurt Gödel was one of the most influential logicians of the twentieth century. He established a number of absolutely central facts, among them the semantic completeness of first order logic and the relative consistency of the axiom of choice and of the generalized continuum hypothesis. However, the theorems that have been most significant—for the
general discussion concerning the foundations of mathematics—are his two incompleteness theorems published in 1931; they are also referred to simply as Gödel’s theorems or the Gödel theorems.

The early part of the twentieth century saw a dramatic development of logic in the context of deep problems in the foundations of mathematics. This development provided for the first time the basic means to reflect mathematical prac-

results do not establish “any bounds for the powers of human reason, but rather for the potentialities of pure formalism in mathematics.” Indeed, in the Gibbs lecture Gödel reformulated the first disjunct as this dramatic and vague statement: “the human mind (even within the realm of pure mathematics) infinitely surpasses the powers of any finite machine.”

See also COMPUTATION AND THE BRAIN; COMPUTATIONAL THEORY OF MIND; LOGIC

Representability conditions are established for all syntactic notions R, that is, really for all primitive recursive sets (and relations): if R(m) holds then P proves r(m), and if not R(m) holds then P proves ¬r(m), where r is a formula in the language of P and m the numeral for the natural number m. Thus, the metamathematical talk about the theory can be represented within it. Then the self-referential statement σ (in the language of P) is constructed in conscious analogy to the liar sentence; σ expresses that it is not provable in P. An argument similar to that showing the liar sentence not to be true establishes that σ is not provable in P, thus we have part of the first theorem. The second theorem is obtained, very roughly speaking, by formalizing the proof of the first theorem concerning σ, but additional derivability conditions are needed: this yields a proof in P of (cons → σ). Now, clearly, cons cannot be provable in P, otherwise σ were provable, contradicting the part of the first theorem we just established.
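The representability conditions and the two theorems described in this entry can be written compactly in symbols. The following LaTeX fragment is only a notational sketch: the overline for numerals, the symbol Prov_P for a provability predicate of P, and the corner quotes for the code of a sentence are conventions assumed here, not notation taken from the entry.

```latex
% Notational sketch; assumes the amsmath and amssymb packages.
% \overline{m} is the numeral for the number m.
% \mathrm{Prov}_P (provability predicate) and \ulcorner\,\urcorner
% (code of a sentence) are assumed conventions, not the entry's notation.
\begin{align*}
\text{Representability:}\quad
  & R(m) \;\Rightarrow\; P \vdash r(\overline{m}),
  \qquad
  \neg R(m) \;\Rightarrow\; P \vdash \neg r(\overline{m}) \\
\text{Self-reference:}\quad
  & P \vdash \sigma \leftrightarrow
    \neg\,\mathrm{Prov}_P(\ulcorner \sigma \urcorner) \\
\text{First theorem:}\quad
  & P \text{ consistent} \;\Rightarrow\;
    P \nvdash \sigma \ \text{ and } \ P \nvdash \neg\sigma \\
\text{Formalized argument:}\quad
  & P \vdash \mathit{cons} \rightarrow \sigma \\
\text{Second theorem:}\quad
  & P \text{ consistent} \;\Rightarrow\; P \nvdash \mathit{cons}
\end{align*}
```

Reading the last two lines together: if P proved cons, it would also prove σ, contradicting the first theorem. The unprovability of ¬σ under mere consistency depends on the improvement due to Rosser that the entry mentions, which modifies the construction of σ.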
The proof of the second theorem was given in detail only by Hilbert and Bernays (1939). A gem of an informal presentation of this material is (Gödel 1931b); for a good introduction to the mathematical details see Smorynski (1977).

Gödel viewed in (1934) the primitive recursiveness of the syntactic notions as “a precise condition which in practice suffices as a substitute for the unprecise requirement . . . that the class of axioms and relation of immediate consequence be constructive,” that is, have an effectively calculable characteristic function. What was needed, in principle, was a precise concept capturing the informal notion of an effectively calculable function, and that would allow a perfectly general characterization of formal theories. Such a notion emerged from the investigations of Church and TURING; see CHURCH-TURING THESIS. Only then was it possible to state and prove the Incompleteness Theorems for all formal theories satisfying representability (for all recursive relations) and derivability conditions. In the above statement of the theorems, the premise “P is consistent” can now be replaced

—Wilfried Sieg

References

Gödel, K. (1931a). Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme I. Translated in Collected Works I, pp. 144–195.
Gödel, K. (1931b). Über unentscheidbare Sätze. Translated in Collected Works III, pp. 30–35.
Gödel, K. (1933). The present situation in the foundations of mathematics. In Collected Works III, pp. 45–53.
Gödel, K. (1934). On undecidable propositions of formal mathematical systems (Princeton Lectures). In Collected Works I, pp. 346–369.
Gödel, K. (1951). Some basic theorems on the foundations of mathematics and their implications (Gibbs Lecture). In Collected Works III, pp. 304–323.
Gödel, K. (1964). Postscriptum to Gödel (1934). In Collected Works I, pp. 369–371.
Gödel, K. (1986). Collected Works I. Oxford: Oxford University Press.
Gödel, K. (1990). Collected Works II. Oxford: Oxford University Press.
Gödel, K. (1995). Collected Works III. Oxford: Oxford University Press.
Hilbert, D. and P. Bernays. (1939). Grundlagen der Mathematik II. Berlin: Springer.
Lucas, J. R. (1961). Minds, machines and Gödel. Philosophy 36: 112–127.
Penrose, R. (1989). The Emperor’s New Mind. New York: Oxford University Press.
Rosser, B. (1936). Extensions of some theorems of Gödel and Church. J. Symbolic Logic 2: 129–137.
Smorynski, C. (1977). The incompleteness theorem. In J. Barwise, Ed., Handbook of Mathematical Logic. Amsterdam: North-Holland, pp. 821–865.

Further Readings

Dawson, J. W. (1997). Logical Dilemmas: The Life and Work of Kurt Gödel. New York: A. K. Peters.

by “P is any consistent formal theory satisfying the representability conditions,” respectively “P is any consistent formal theory satisfying the representability and derivability conditions.” It is this generality of his results that Gödel emphasized again and again; for example, in his (1964) work: “In consequence of later advances, in particular of the fact that, due to A. M. Turing’s work, a precise and unquestionably adequate definition of the general concept of formal system can now be given, the existence of undecidable arithmetical propositions and the non-demonstrability of the consistency of a system in the same system can now be proved rigorously for every consistent formal system containing a certain amount of finitary number theory.”

Gödel exploited this general formulation of his theorems (based on Turing’s work) and analyzed their broader significance for the philosophy of mathematics and mind most carefully in (1951). The first section is devoted to a discussion of the Incompleteness Theorems, in particular of the
second theorem, and argues for a “mathematically established fact” which is of “great philosophical interest” to Gödel: either the humanly evident axioms of mathematics cannot be comprised by a finite rule given by a Turing machine, or they can be and thus allow the successive development of all of demonstrable mathematics. In the latter case human mathematical abilities are in principle captured by a Turing machine, and thus there will be absolutely undecidable problems. That is what can be strictly inferred from Gödel’s theorems, counter to Lucas, Penrose, and others. Gödel thought that the first disjunct held, as he believed that the second disjunct had to be false; he emphasized repeatedly, for example in (1964), that his

Golgi, Camillo

Camillo Golgi (1843–1926) was one of a generation of great neurohistologists that included Kölliker, Gerlach, Nissl, and CAJAL. For these scientists, the cellular nature of nervous tissue was still enigmatic and controversial, decades after Schleiden and Schwann had promulgated the theory that cells are the basic architectonic units of living tissues. What we now somewhat nonchalantly identify as nerve cells had been visualized as early as 1836 (by Valentin); but, with the techniques then available, the relationship between cell bodies and their protoplasmic extensions could not be clear. A natural interpretation, bizarre as it may now seem, was that nerve cells were nodes, perhaps nutritive in function, embedded within a continuous reticulum of nerve fibers.

Golgi’s unique and enduring contribution is generally cited as the discovery of the silver dichromate stain for

reservations can be seen as an arguably legitimate concern about the neuron doctrine in its most stringent formulation. Thus, he makes the interesting distinction between a “nerve cell,” which corresponds to a distinct histological entity, and a “neuron,” the definition of which, he suggests, should include its functional operation.
In the functional domain, the concept of the “neuron” is indeed elusive and continues to evolve even at the present moment.

Golgi’s reticularist stance may have derived from a strong conviction in the unified or holistic nature of brain function, or at least a preoccupation with how unity (of perception, of consciousness) can result from individualized elements. So stated, this is not necessarily dissimilar from modern discussions of the BINDING PROBLEM. It is interesting to read Golgi’s prose against the backdrop of current work on functional ensembles linked by temporal response properties (“l’action d’ensemble des cellules nerveuses, que j’ai ainsi définie par opposition à la prétendue action individuelle,” Golgi 1908; “the group action of nerve cells which I have defined as being opposite to their alleged individual action,” Golgi 1967: 202).

Golgi is also known for his discovery of the “internal

nerve tissue, which for the first time allowed visualization of nerve cells in their entirety. The actual discovery is surrounded with a certain romanticism, an admixture of luck and perseverance. Golgi, the son of a medical practitioner, had taken his degree in medicine (1865), and spent six years (1865–71) tending patients at the Ospedale di San Matteo in Pavia while also doing research in brain histology in the laboratory of his younger friend and mentor, Giulio Cesare Bizzozero. The actual discovery, however, came while he was first resident physician in the home for incurables at Abbiategrasso. Working in the evenings by candlelight in the kitchen of his hospital apartment (da Fano 1926), he continued the research that led to the new technique. The resulting article, published in 1873 (“On the structure of the gray matter of the brain”), has a refreshing simplicity: “Using the method I have developed for staining brain elements . . .
I was able to discover several facts about the structure of the grey brain matter which I believe worth making known” (in Corsi 1988).

The silver stain, although used to advantage by Golgi himself (who subsequently moved to the faculty at Pavia, as professor of general pathology and histology, where he remained until his retirement in 1918), was at first dismissed by the mainstream school of German histologists. In 1887 the Golgi stain was itself discovered by Ramon y Cajal, who used it to impressive advantage in the first great investigations of functional neuroanatomy. Throughout the twentieth century, the Golgi stain remained important in investigations of normative structure and of changes associated with development, pathology, or plasticity. It has to some extent been superseded by intracellular injection of tracers (such as biocytin or horseradish peroxidase), but remains a valuable method for visualizing larger populations of cells and when experimental injection is not feasible (i.e., in most human material). In tribute to the elegance of the original silver methods, high-quality cellular images,

reticular apparatus” (smooth endoplasmic reticulum or “Golgi apparatus;” see Peters, Palay, and de Webster 1991), and for his distinction between neurons with long or local axons (respectively, Golgi Type I and Type II, also observed early on by Cajal).

In summary, Golgi’s silver stain has become, rightly or not, a favorite illustration of the role of serendipity in scientific discovery. It is an early example of the importance of new techniques for advancing the investigation of brain and cognitive processes. The story is also a lesson in the potential deceptiveness of “concrete” images—whether we read it as a case of missed opportunity and intransigence, or of being right for the wrong reasons.

See also BINDING BY NEURAL SYNCHRONY; CORTICAL LOCALIZATION, HISTORY OF

—Kathleen S. Rockland

References

Clark, E., and C. D. O’Malley. (1968).
The Human Brain and Spinal Cord. Berkeley: University of California Press.
Corsi, P. (1988). Camillo Golgi’s morphological approach to neuroanatomy. In R. L. Masland, A. Portera Sanchez, and G. Toffano, Eds., Neuroplasticity: A New Therapeutic Tool in the CNS Pathology. Padova: Liviana Press—Springer (Fidia Research Series 12), pp. 1–7.
Da Fano, C. (1926). Camillo Golgi. Journal of Pathology and Bacteriology 29: 500–514.
Mazzarello, P. (1996). La Struttura Segreta. Pavia: Edizioni Cisalpino.
Peters, A., S. L. Palay, and H. de Webster. (1991). The Fine Structure of the Nervous System. 3rd ed. New York: Oxford University Press.
Santini, M., Ed. (1975). Golgi Centennial Symposium: Perspectives in Neurobiology. New York: Raven Press.
Shepherd, G. M. (1991). Foundations of the Neuron Doctrine. Oxford: Oxford University Press.
Van der Loos, H. (1967). The history of the neuron. In H. Hyden, Ed., The Neuron. Amsterdam: Elsevier, pp. 1–47.

immunocytochemical or intracellular, are still evaluated as “pseudo-Golgi” or “Golgi-like.”

Golgi further deserves acknowledgment for his role in the early polemics surrounding the NEURON doctrine. This debate was articulated dramatically, almost scandalously from the modern perspective, in the Nobel addresses for 1906, when Golgi and Cajal were jointly awarded the prize in physiology and medicine. Golgi defended the reticularist position, while Cajal championed the neuron doctrine (respectively representing the “continualists” and the “contiguists”; Van der Loos 1967). Golgi’s position, in light of the facts, has come to be viewed as archaic, and an unfortunate example of dogma winning out over observation. In his defense, it is worth remembering that synaptic morphology—and in particular the discontinuity of the pre- and postsynaptic elements—was not definitively demonstrated until electron microscopic studies in the 1950s (see Peters, Palay, and de Webster 1991). Moreover, at least some of Golgi’s
Selected Works by Golgi

Golgi, C. (1903–1923). Opera Omnia. R. Fusati, G. Marenghi, and S. Sala, Eds., 4 vols. Milan: Hoepli.
Golgi, C. (1908). La doctrine du neurone. In Les Prix Nobel en 1906. Stockholm: P.A. Norstedt and Söner.
Golgi, C. (1967). The neuron doctrine—theory and facts. In Nobel Lectures: Physiology or Medicine 1901–1921. Amsterdam: Elsevier, pp. 189–217.

Good Old Fashioned AI (GOFAI)

See INTRODUCTION: COMPUTATIONAL INTELLIGENCE; CONNECTIONISM, PHILOSOPHICAL ISSUES

GPSG

See GENERATIVE GRAMMAR; HEAD-DRIVEN PHRASE STRUCTURE GRAMMAR

Grammar, Neural Basis of

of syntactic processing in sentence comprehension have also been carried out (Tyler 1985; Swinney and Zurif 1995). The original studies of patients with syntactic comprehension disorders also focused on agrammatic Broca’s aphasics (Caramazza and Zurif 1976). However, patients whose lesions lie outside Broca’s area also often show impairments of syntactically-based sentence comprehension (Berndt, Mitchum, and Haendiges 1996; Caplan 1987; Caplan and Hildebrandt 1988; Caplan, Baker, and Dehaut 1985; Caplan, Hildebrandt, and Makris 1996; Tramo, Baynes, and Volpe 1988), and patients with agrammatism often show good syntactic comprehension (Berndt, Mitchum, and Haendiges 1996). This has led some researchers to suggest that a more distributed neural system in the left perisylvian cortex, of which Broca’s area may be a specialized part, is responsible for this function (Mesulam 1990; Damasio and Damasio 1992). One study (Caplan, Hildebrandt, and Makris 1996) reported a small but clear impairment in syntactic processing in comprehension after right hemisphere strokes, suggesting some role of the nondominant hemisphere in this function.

Physiological and metabolic studies in normal subjects have also provided information about the brain regions involved in syntactic processing in comprehension.
Event-related potentials (ERPs) have shown components, such as the P600 or “syntactic positive shift” in the central parietal region and the “left anterior negativity,” that may be associated with syntactic processing (Hagoort, Brown, and Groothusen 1993; Munte, Heinze, and Mangun 1993; Neville et al. 1991; Rosler et al. 1993). Recently, functional neuroimaging with POSITRON EMISSION TOMOGRAPHY (PET) and functional MAGNETIC RESONANCE IMAGING (fMRI) has been used to investigate the regional cerebral blood flow (rCBF) associated with sentence-level language processing. Using PET, Mazoyer et al. (1993) reported inconsistent rCBF increases associated with syntactic processing, but it may be that their experimental conditions did not differ in the minimal ways necessary to isolate the neural correlates of the various components of linguistic processing above the single-word level. Stromswold et al. (1996) reported an isolated increase in rCBF

Grammar refers to the syntactic structure of sentences that allows the meanings of words to be related to each other to form propositions. Linguistics has been concerned with the way humans’ unconscious knowledge of this structure is represented. PSYCHOLINGUISTICS has been concerned with how this knowledge is used in speaking and comprehension. There is no way at present to investigate how the nervous system represents syntactic knowledge, but there are two approaches to the neural basis for syntactic processing. One has been the traditional deficit-lesion correlational approach in patients with brain lesions. The second is the observation of neurophysiological and metabolic activity associated with syntactic processing in normal subjects. Both approaches have made some progress, but there are many gaps in our scientific investigation of the question.

Deficit-lesion correlations are available for patients with disorders of both production and receptive processing of syntactic structures.
With respect to the production of syntactic form, the speech of patients with a symptom known as agrammatism is characterized by short phrases with simple syntactic structures and omission of grammatical markers and function words. These patients tend to have lesions that include Broca’s area (pars triangularis and opercularis of the left third frontal convolution), which has led some researchers to suggest that this region is responsible for syntactic planning in LANGUAGE PRODUCTION (Zurif 1982). Several studies have shown, however, that lesions in other brain areas can produce agrammatism, suggesting that other left hemisphere areas can be responsible for this function in some individuals (Vanier and Caplan 1990; Dronkers et al. 1994).

Syntactic processing can also be impaired in comprehension, as shown by patients’ failure to understand sentences with more complex syntactic structures whose meaning cannot be simply inferred (e.g., The boy was pushed by the girl).

in part of Broca’s area associated with syntactic processing. Using a slightly different experimental paradigm with fMRI, Just et al. (1996) reported an increase in rCBF in both Broca’s area and a second language area—Wernicke’s area in the left first temporal gyrus—as well as smaller increases in rCBF in the right hemisphere homologues of these structures. The Just et al. results are consistent with those of Caplan, Hildebrandt, and Makris (1996), but more research is needed to understand the differences across various studies.

In summary, the dominant perisylvian cortex is the region of the brain most involved in syntactic processing and production. Whether there is any further specialization within this region for these functions remains to be established.

See also APHASIA; LANGUAGE, NEURAL BASIS OF; PHONOLOGY, NEURAL BASIS OF; SENTENCE PROCESSING; SYNTAX; SYNTAX, ACQUISITION OF

—David N. Caplan
More detailed studies of disorders of the time-course

Zurif, E. B. (1982). The use of data from aphasia in constructing a performance model of language. In M. A. Arbib, D. Caplan, and J. C. Marshall, Eds., Neural Models of Language Processes. New York: Academic Press, pp. 203–207.

Grammatical Relations

In its broadest sense, the term grammatical relation (or grammatical role or grammatical function) can be used to refer to almost any relationship within grammar, or at least within SYNTAX and MORPHOLOGY. In its narrowest sense, grammatical relation is a cover term for grammatical subject, object, indirect object, and the like. To understand what grammatical relations are, in this more specific sense, it will help to contrast them with intuitively similar but distinct syntactic concepts such as thematic roles, Cases, and syntactic positions.

We can see the difference between surface grammatical

References

Berndt, R., C. Mitchum, and A. Haendiges. (1996). Comprehension of reversible sentences in “agrammatism”: a meta-analysis. Cognition 58: 289–308.
Caplan, D. (1987). Discrimination of normal and aphasic subjects on a test of syntactic comprehension. Neuropsychologia 25: 173–184.
Caplan, D., and N. Hildebrandt. (1988). Disorders of Syntactic Comprehension. Cambridge, MA: MIT Press/Bradford Books.
Caplan, D., C. Baker, and F. Dehaut. (1985). Syntactic determinants of sentence comprehension in aphasia. Cognition 21: 117–175.
Caplan, D., N. Hildebrandt, and G. S. Waters. (1994). Interaction of verb selectional restrictions, noun animacy and syntactic form in sentence processing. Language and Cognitive Processes 9: 549–585.
Caplan, D., N. Hildebrandt, and N. Makris. (1996). Location of lesions in stroke patients with deficits in syntactic processing in sentence comprehension. Brain 119: 933–949.
Caramazza, A., and E. B. Zurif. (1976).
Dissociation of algorithmic and heuristic processes in language comprehension: evidence from aphasia. Brain and Language 3: 572–582.
Damasio, A. R., and H. Damasio. (1992). Brain and language. Scientific American (September): 89–95.
Dronkers, N. F., D. P. Wilkins, R. D. van Valin, B. B. Redfern, and J. J. Jaeger. (1994). A reconsideration of the brain areas involved in the disruption of morphosyntactic comprehension. Brain and Language 47: 461–463.
Hagoort, P., C. Brown, and J. Groothusen. (1993). The syntactic positive shift (SPS) as an ERP measure of syntactic processing. Language and Cognitive Processes 8(4): 485–532.
Just, M. A., P. A. Carpenter, T. A. Keller, W. F. Eddy, and K. R. Thulborn. (1996). Brain activation modulated by sentence comprehension. Science 274: 114–116.
Mazoyer, B., N. Tzourio, V. Frak, A. Syrota, N. Murayama, O. Levrier, and G. Salamon. (1993). The cortical representation of speech. Journal of Cognitive Neuroscience 5: 467–479.
Mesulam, M.-M. (1990). Large-scale neurocognitive networks and distributed processing for attention, language and memory.

relations and semantic relations or THEMATIC ROLES (e.g., agent, goal, theme, etc.) by examining examples (1) and (2). In these examples, the noun phrase “that story” has the same thematic role (theme or patient) in both the active and the passive versions of the sentence; however, the grammatical relation of “that story” differs in the two sentences. In the active sentence, “that story” has the grammatical relation of object, whereas in the passive sentence, it has the grammatical relation of subject.

(1) active: This girl wrote that story.

(2) passive: That story was written by this girl.

It is not necessary, however, to appeal to derived contexts in order to distinguish surface grammatical relations and thematic roles. Although the subject of an active sentence is very often an agent, there are sentences without an agent
role and in such sentences, some other thematic role such as theme, goal, or experiencer is associated with the grammatical relation of subject:

(3) The ball rolled down the hill.

(4) The woman received the letter.

(5) The boy enjoyed the ice cream.

Grammatical relations are also distinct from Cases (e.g. nominative, accusative, dative, etc.). Although the grammatical relation of subject is often associated with nominative Case, whereas the grammatical relation of object is often associated with accusative Case, there are many examples of other pairings of these grammatical relations and Cases. For example, in Icelandic some verbs take a dative subject and a nominative object:

(6) Barninu batnaði veikin.
the.child-DAT recovered.from the.disease-NOM(*ACC)
‘The child recovered from the disease.’ (Yip, Maling,

Annals of Neurology 28(5): 597–613.
Munte, T. F., H. J. Heinze, and G. R. Mangun. (1993). Dissociation of brain activity related to syntactic and semantic aspects of language. Journal of Cognitive Neuroscience 5: 335–344.
Neville, H., J. L. Nicol, A. Barss, K. I. Forster, and M. F. Garrett. (1991). Syntactically based sentence processing classes: evidence from event-related brain potentials. Journal of Cognitive Neuroscience 3: 151–165.
Rosler, F., P. Putz, A. Friederici, and A. Hahne. (1993). Event-related potentials while encountering semantic and syntactic constraint violations. Journal of Cognitive Neuroscience 5: 345–362.
Stromswold, K., D. Caplan, N. Alpert, and S. Rauch. (1996). Localization of syntactic comprehension by positron emission tomography. Brain and Language 52: 452–473.
Swinney, D., and E. Zurif. (1995). Syntactic processing in aphasia. Brain and Language 50: 225–239.
Tramo, M. J., K. Baynes, and B. T. Volpe. (1988). Impaired syntactic comprehension and production in Broca’s aphasia: CT lesion localization and recovery patterns. Neurology 38: 95–98.
Tyler, L. (1985).
Real-time comprehension processes in agrammatism: a case study. Brain and Language 26: 259–275.
Vanier, M., and D. Caplan. (1990). CT-scan correlates of agrammatism. In L. Menn and L. K. Obler, Eds., Agrammatic Aphasia. Amsterdam: Benjamins, pp. 97–114.

ory of grammar is unnecessary and therefore undesirable (e.g., Chomsky 1981; Hoekstra 1984; Williams 1984; Bhat 1991). Between these poles, there are several middle and variant positions. Some take the position that although grammatical relations are not primitive notions of grammar, they do play an important role in grammar as derived notions (Anderson 1978). Some argue for the need for finer-grained grammatical relations, such as adding “restricted object” (Bresnan and Kanerva 1989), within the theory of LEXICAL FUNCTIONAL GRAMMAR.

Many works whose titles contain the phrase “grammatical relations” are not so much concerned with this theoretical controversy, but rather with a somewhat broader sense of grammatical relations, pertaining to how Case, agreement, and/or word order identify or distinguish subjects and objects. In this broader sense, a “theory of grammatical relations” is assumed to include a theory of Case and agreement systems that can account for all of the cross-linguistic differences that occur (see TYPOLOGY). Such work often focuses on languages or particular constructions in which the familiar associations between subjects and nominative Case or objects and accusative Case do not hold. For example, they address the question of how or why dative or ergative Case is assigned to subjects in constructions such as (6) or (7), instead of nominative Case; and why nominative Case is instead assigned to the objects in those constructions. One approach to this problem of nonprototypic associations between Cases and grammatical relations has been to propose that at some level, the prototypic association (e.g., between nominative Case and subjects) actually does hold, but that grammatical relations (or structural positions) are inverted at some level in the derivation (e.g., Harris 1976; Marantz 1984). Others maintain that assuming such a close association between Case and grammatical relations (or syntactic positions) is not correct

and Jackendoff 1987: 223)

Hindi also has verbs that take dative subjects, but most transitive verbs in Hindi take an ergative subject and a nominative object (in the perfective aspect):

(7) Raam-ne roTii khaayii thii.
Ram(masc.)-ERG bread(fem)-NOM eat(perf, fem) be(past, fem)
“Ram had eaten bread.” (Mahajan 1990: 73)

Thus we see that grammatical relations are distinct from Cases.

Finally, grammatical relations can also be distinguished from syntactic positions. If an object is fronted in a topicalization construction, for example, it remains an object despite the fact that it is located above the subject in the syntactic structure.

(8) a. I read that book.
b. That book, I read.

Similarly, grammatical relations remain constant across many word orders in languages that allow scrambling. For example, in all of the word order variants of the Hindi sentence below, Ram has the grammatical relation of subject and banana has the grammatical relation of object:

(9) a. Raam-ne kelaa khaayaa.
Ram-ERG banana-NOM ate
‘Ram ate a banana.’
b. Raam-ne khaayaa kelaa.
c. Kelaa raam-ne khaayaa.
d. Kelaa khaayaa raam-ne.
e. Khaayaa raam-ne kelaa.
f. Khaayaa kelaa raam-ne. (Mahajan 1990: 19)

So far, we have been discussing only surface grammatical relations; that is, the grammatical relations that hold at surface structure after all movements or other grammatical relation-changing processes have occurred. However, one may also speak of deep or initial grammatical relations.
In both the active sentence in (10) and the passive sentence in (11), one may say that that banana has the initial grammatical relation of object.

(10) Ram ate that banana.

(11) That banana was eaten by Ram.

Thus there is a closer association between initial grammatical relations and thematic roles than there is between surface grammatical relations and thematic roles.

Almost all linguists use the terms grammatical relation, subject, object, and the like in a descriptive sense. However, theories of grammar differ widely with respect to the question of the theoretical status of grammatical relations. The controversy centers around the question of how many formal devices of what kind the correct theory of grammar includes. Proponents of RELATIONAL GRAMMAR (Perlmutter and Postal 1977; Perlmutter 1983) maintain that grammatical relations are primitive notions in the theory of grammar and that universal generalizations about language are formulated in terms of those primitives. Others maintain that

and that there are conditions under which subjects can take other Cases, especially lexical (inherent, quirky) Cases, freeing up the nominative Case which can then be assigned to an object (e.g., Yip, Maling, and Jackendoff 1987).

Other works with “grammatical relations” in the title focus on the question of how to describe and analyze a range of constructions that appear to involve changes in grammatical relations, such as passive, causative, or applicative constructions, or on restrictions on various grammatical processes such as relativization that may be stated in terms of grammatical relations (e.g., Gary and Keenan 1977; Marantz 1984; Baker 1988).

See also HEAD-DRIVEN PHRASE STRUCTURE GRAMMAR; SYNTAX, ACQUISITION OF

—Ellen Woolford

References

Anderson, J. (1978). On the derivative status of grammatical relations. In W. Abraham, Ed., Valence, Semantic Case, and Grammatical Relations. Amsterdam: John Benjamins.
Baker, M. (1988).
Incorporation: A Theory of Grammatical Func- no grammatical rules make crucial reference to grammatical tion Changing. Chicago: University of Chicago Press. relations, as distinct from thematic roles, Cases, or syntactic Bhat, D. N. S. (1991). Grammatical Relations: The Evidence positions and that adding grammatical relations to the the- Against Their Necessity and Universality. London: Routledge. Greedy Local Search 357 Bresnan, J., and J. Kanerva. (1989). Locative inversion in Chich- Maxwell, E. M. (1981). Question strategics and hierarchies of ewa: a case study of factorization in grammar. Linguistic grammatical relations in Kinyarwanda. Proceedings of the Ber- Inquiry 20: 1–50. keley Linguistics Society 7: 166–177. Chomsky, N. (1981). Lectures on Government and Binding. Dor- Palmer, F. R. (1994). Grammatical Roles and Relations. Cam- drecht: Foris. bridge: Cambridge University Press. Gary, J. O., and E. Keenan. (1977). Grammatical relations in Kin- Perlmutter, D. (1978). Impersonal passives and the unaccusative yarwanda and universal grammar. In Paul F. Kotey and H. Der- hypothesis. Proceedings of the Berkeley Linguistics Society 4: Houssikian, Eds., Language and Linguistic Problems in Africa. 157–189. Columbia, SC: Hornbeam Press, pp. 315–329. Perlmutter, D., and C. G. Rosen, Eds. (1984). Studies in Relational Harris, A. C. (1976). Grammatical Relations in Modern Georgian. Grammar 2. Chicago: University of Chicago Press. Ph.D. diss., Harvard University. Plank, F., Ed. (1979). Ergativity: Toward a Theory of Grammatical Hoekstra, T. (1984). Transitivity: Grammatical Relations in Gov- Relations. London: Academic Press. ernment-Binding Theory. Dordrecht: Foris. Plank, F., Ed. (1984). Objects: Towards A Theory of Grammatical Mahajan, A. (1990). The A/A-Bar Distinction and Movement The- Relations. London: Academic Press. ory. Ph.D. diss., MIT. Distributed by MIT Working Papers in Postal, P., and B. D. Joseph. (1990). Studies in Relational Gram- Linguistics. mar 3. 
Chicago: University of Chicago Press. Marantz, A. (1984). On the Nature of Grammatical Relations. Scancarelli, J. (1987). Grammatical Relations and Verb Agreement Cambridge, MA: MIT Press. in Cherokee. Ph.D. diss., UCLA. Perlmutter, D., Ed. (1983). Studies in Relational Grammar I. Chi- Woolford, E. (1993). Symmetric and asymmetric passives. Natural cago: University of Chicago Press. Language and Linguistic Theory 11: 679–728. Perlmutter, D., and P. Postal. (1977). Toward a universal character- Woolford, E. (1997). Four-way case systems: nominative, ergative, ization of passive. Proceedings of the Berkeley Linguistics Soci- objective, and accusative. Natural Language and Linguistic ety 3: 394–417. Theory 15: 181–227. Williams, E. (1984). Grammatical relations. Linguistic Inquiry 15: 639–673. Grammatical Theory Yip, M., J. Maling, and R. Jackendoff. (1987). Case in tiers. Lan- guage 63: 217–250. See INTRODUCTION: LINGUISTICS AND LANGUAGE; GENERA- TIVE GRAMMAR Further Readings Grasping Abraham, W., Ed. (1978) Valence, Semantic Case, and Grammati- cal Relations. Amsterdam: John Benjamins. Anderson, J. M. (1977). On Case Grammar: Prolegomena to a See MANIPULATION AND GRASPING Theory of Grammatical Relations. London: Croom Helm. Bowers, J. (1981). The Theory of Grammatical Relations. Ithaca: Greedy Local Search Cornell University Press. Bresnan, J., Ed. (1982). The Mental Representation of Grammati- cal Relations. Cambridge, MA: MIT Press. Greedy local search search methods are widely used to Burgess, C., K. Dziwirek, and D. Gerdts. (1995). Grammatical solve challenging computational problems. One of the earli- Relations: Theoretical Approaches to Empirical Questions. est applications of local search was to find good solutions Stanford: CSLI Publications. for the traveling salesman problem (TSP). In this problem, Campe, P. (1994). 
Case, Semantic Roles, and Grammatical Rela- the goal is to find the shortest path for visiting a given set of tions: A Comprehensive Bibliography. Amsterdam: John Ben- cities. The TSP is prototypical of a large class of computa- jamins. Cole, P., and J. Sadock, Eds. (1977). Grammatical Relations (Syn- tional problems for which it is widely believed that no effi- tax and Semantics 8). New York: Academic Press. cient (i.e., polynomial time) ALGORITHM exists. Technically Croft, W. (1991). Syntactic Categories and Grammatical Rela- speaking, it is an NP-hard optimization problem (Cook tions: The Cognitive Organization of Information. Chicago: 1971; Garey and Johnson 1979; Papadimitriou and Steiglitz University of Chicago Press. 1982). A local search method for the TSP proceeds as fol- Dziwirek, P. F., and E. Mejías-Bikandi. (1990). Grammatical Rela- lows: start with an arbitrary path that visits all cities, such a tions: A Cross-Theoretical Perspective. Stanford: CSLI Publi- path will define an order in which the cities are to be visited. cations. Subsequently, one makes small (“local”) changes to the path Faarlund, J. T. (1987). On the history of grammatical relations. to try to find a shorter one. An example of such a local Proceedings of the Chicago Linguistic Society 23: 64–78. change is to swap the position of two cities on the tour. One Gary, J. O., and E. Keenan. (1977). On collapsing grammatical relations in universal grammar. Syntax and Semantics 8: 83– continues making such changes until no swap leads to a 120. shorter path. Lin (1965) and Lin and Kernighan (1973) Gerdts, D. (1993). Mapping Halkomelem grammatical relations. show that such a simple procedure, with only a slightly Linguistics 31: 591–621. more complex local change, leads to solutions that are sur- Hudson, R. A. (1988). Extraction and grammatical relations. Lin- prisingly close to the shortest possible path. gua 76: 177–208. The basic local search framework allows for several vari- Jake, J. 
(1983). Grammatical Relations in Imabaura Quechua. ations. For example, there is the choice of the initial solu- Ph.D. diss., University of Illinois. tion, the nature of the local changes considered, and the Keenan, E., and B. Comrie. (1977). Noun phrase accessibility and manner in which the actual improvement of the current universal grammar. Linguistic Inquiry 8: 63–99. 358 Greedy Local Search solution is selected. Lin and Kernighan found that multiple expression should evaluate to “true.” In our example, setting runs with different random initial paths lead to the best solu- a to “true,” b to “true,” and c to “false” satisfies the formula. tions. Somewhat surprisingly, starting with good initial Finding a satisfying assignment for arbitrary formulas is a paths did not necessarily lead to better final solutions. The computationally difficult task (Cook 1971). Note that the obvious algorithm would enumerate all 2N Boolean truth reason for this appears to be that the local search mecha- nism itself is powerful enough to improve on the initial assignments, where N is the number of Boolean variables. solutions—often quickly giving better solutions than those The SAT problem is of particular interest to computer scien- generated using other methods. Choosing the best set of tists because many other problems can be efficiently repre- local changes to be considered generally requires an empiri- sented as Boolean satisfiability problems. The best cal comparison of various kinds of local modifications that traditional methods for solving the SAT problem are based are feasible. Another issue is that of how to select the actual on a systematic backtrack-style search procedure, called the improvement to be made to the current solution. The two Davis, Putnam, and Loveland procedure (Davis and Putnam extremes are first-improvement (also called “hill- 1960; Davis, Logemann, and Loveland 1962). 
These proce- climbing”), in which any favorable change is accepted, and dures can currently solve hard, randomly generated steepest-descent, in which the best possible local improve- instances with up to four hundred variables (Mitchell, Sel- ment is selected at each step. Steepest-descent is sometimes man, and Levesque 1992; Crawford and Auton 1993; Kirk- referred to as greedy local search, but this term is also used patrick and Selman 1994). In 1992, Selman, Livesque, and to refer to local search in general. Mitchell showed that a greedy local search method, called A local search method does not necessarily reach a global GSAT, could solve instances with up to seven hundred vari- optimum because the algorithm terminates when it reaches a ables. Recent improvements on the local search strategy state where no further improvement can be found. Such enable us to solve instances with up to three thousand vari- states are referred to as local optima. In 1983, Kirkpatrick, ables (Selman, Kautz, and Cohen 1994). The GSAT proce- Gelatt, and Vecchi introduced a technique for escaping from dure starts with a randomly generated truth assignment. It such local optima. The idea is to allow the algorithm to make then considers changing the truth value of one of the Bool- occasional changes that do not improve the current solution, ean variables in order to satisfy more of the given logical that is, changes that lead to equally good or possibly inferior expression. It keeps making those changes until a satisfying solutions. Intuitively speaking, these nonimproving moves truth assignment is found or until the procedure reaches can be viewed as injecting noise into the local search pro- some preset maximum number of changes. When it reaches cess. Kirkpatrick, Gelatt, and Vecchi referred to their method this maximum, GSAT restarts with a new initial random as simulated annealing, because it was inspired by the assignment. 
(For closely related work in the area of schedul- annealing technique used to reach low energy states in ing, see Minton et al. 1992.) One inherent limitation of local glasses and metals. The amount of “noise” introduced is con- search procedures applied to decision problems is that they trolled with a parameter, called the temperature T. Higher cannot be used to determine whether a logical expression is values of T correspond to more noise. The search starts off at inconsistent, that is, no satisfying truth assignment exists. In a high temperature, which is slowly lowered during the practice, this means that one has to use model-based formu- search in order to reach increasingly better solutions. lations, where solutions correspond to models or satisfying Another effective way of escaping from local minima is assignments. the tabu search method (Glover 1989). During the search, An important difference in applying local search to the algorithm maintains a “tabu” list containing the last L decision problems, as opposed to optimization problems, changes, where L is a constant. The local search method is is that near-solutions are of no particular interest. For prevented from making a change that is currently on the decision problems, the goal is to find a solution that satis- tabu list. With the appropriate choice of L, this methods fies all constraints of the problem under consideration (see often forces the search to make upward (nonimproving) also CONSTRAINT SATISFACTION and HEURISTIC SEARCH). changes, again introducing noise into the search. In practice, this means that, for example, GSAT and related Genetic algorithms can also be viewed as performing a local search procedures spend most of their time satisfying form of local search (Holland 1975). In this case, the search the last few remaining constraints. Recent work has shown process proceeds in parallel. 
Solutions are selected based on that incorporating random walk-style methods in the their “fitness” (i.e., solution quality) from an evolving popu- search process greatly enhances the effectiveness of these lation of candidates. Noise is introduced in the search pro- procedures. cess via random mutations (see EVOLUTIONARY COMPU- Since Lin and Kernighan’s successful application of local search to the TSP, and the many subsequent enhance- TATION). A recent new area of application for local search meth- ments to the local search method, local search techniques ods is in solving NP-complete decision problems, such as have proved so powerful and general that such procedures the Boolean satisfiability (SAT) problem. An instance of have become the method of choice for solving hard compu- SAT is a logical expression over a set of Boolean variables. tational problems. An example expression is “(a or (not b)) and ((not a) or (not See also COMPUTATIONAL COMPLEXITY; GAME–PLAYING c)).” The formula has a, b, and c as Boolean variables. The SYSTEMS; INTELLIGENT AGENT ARCHITECTURE; PROBLEM satisfiability problem is to find an assignment to the Bool- SOLVING; RATIONAL DECISION MAKING ean variables such that the various parts of the logical —Bart Selman expression are simultaneously satisfied. That is, the overall Grice, H. Paul 359 cative intention is thus a self-referential, or reflexive, References intention. It does not involve a series of nested inten- Cook, S. A. (1971). The complexity of theorem–proving proce- tions—the speaker does not have an intention to convey dures. Proc. STOC–71, pp. 151–158. something and a further intention that the first be recog- Crawford, J. M., and L. D. Auton. (1993). Experimental results on nized, for then this further intention would require a still the cross–over point in satisfiability problems. Proc. AAAI–93, further intention that it be recognized, and so on ad infini- pp. 21–27. tum. 
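The swap-based local search described for the TSP, and the simulated annealing idea of occasionally accepting a worse tour, can be sketched in Python roughly as follows. This is an illustrative sketch only and not the Lin and Kernighan procedure: the function names, the use of a closed tour, the acceptance rule exp(-delta / T), and the geometric cooling schedule are assumptions chosen for the example, since the entry does not specify them.

```python
import math
import random


def tour_length(tour, coords):
    """Length of the closed tour that visits the cities in the given order."""
    return sum(
        math.dist(coords[tour[i]], coords[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def swapped(tour, i, j):
    """Copy of the tour with the cities at positions i and j exchanged."""
    new = list(tour)
    new[i], new[j] = new[j], new[i]
    return new


def steepest_descent(coords, rng):
    """Apply the best improving swap until no swap shortens the tour."""
    tour = list(range(len(coords)))
    rng.shuffle(tour)  # arbitrary initial tour
    best = tour_length(tour, coords)
    while True:
        candidates = [
            swapped(tour, i, j)
            for i in range(len(tour))
            for j in range(i + 1, len(tour))
        ]
        neighbour = min(candidates, key=lambda t: tour_length(t, coords))
        length = tour_length(neighbour, coords)
        if length >= best - 1e-12:
            return tour, best  # local optimum: no swap gives a shorter tour
        tour, best = neighbour, length


def simulated_annealing(coords, rng, t_start=1.0, t_end=1e-3, cooling=0.995):
    """Random swaps; worse tours are accepted with probability exp(-delta / T)."""
    tour = list(range(len(coords)))
    rng.shuffle(tour)
    current = tour_length(tour, coords)
    best_tour, best = list(tour), current
    t = t_start  # assumed schedule: start "hot", multiply by `cooling` each step
    while t > t_end:
        i, j = rng.sample(range(len(tour)), 2)
        candidate = swapped(tour, i, j)
        delta = tour_length(candidate, coords) - current
        # Improvements are always taken; the higher T is, the more often
        # a nonimproving move (the "noise") is accepted.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            tour, current = candidate, current + delta
            if current < best:
                best_tour, best = list(tour), current
        t *= cooling
    return best_tour, tour_length(best_tour, coords)


if __name__ == "__main__":
    rng = random.Random(0)
    cities = [(rng.random(), rng.random()) for _ in range(12)]
    print("steepest descent:", steepest_descent(cities, rng)[1])
    print("simulated annealing:", simulated_annealing(cities, rng)[1])
```

Calling `steepest_descent` several times with different random starting tours and keeping the shortest result corresponds to the multiple-run strategy mentioned in the entry. A first-improvement variant would accept the first shortening swap it finds instead of scanning all of them.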
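The GSAT loop described in the entry can be sketched in Python along the following lines. The entry specifies only the random start, the single-variable changes, the limit on the number of changes, and the restart. The clause encoding, the function names, the random tie-breaking, and the choice to make the best available flip even when it does not increase the number of satisfied clauses are assumptions made for this illustration.

```python
import random


def num_satisfied(clauses, assignment):
    """Count the clauses that contain at least one true literal.

    A clause is a list of (variable, wanted_value) pairs, so
    [("a", True), ("b", False)] encodes (a or (not b)).
    """
    return sum(
        any(assignment[var] == wanted for var, wanted in clause)
        for clause in clauses
    )


def gsat(clauses, variables, max_flips=100, max_tries=20, rng=None):
    """Return a satisfying assignment as a dict, or None if none was found."""
    rng = rng or random.Random()
    for _ in range(max_tries):
        # Each try starts from a fresh random truth assignment.
        assignment = {v: rng.random() < 0.5 for v in variables}
        for _ in range(max_flips):
            if num_satisfied(clauses, assignment) == len(clauses):
                return assignment
            # Score every single-variable flip by the clauses it would satisfy.
            scores = {}
            for v in variables:
                assignment[v] = not assignment[v]
                scores[v] = num_satisfied(clauses, assignment)
                assignment[v] = not assignment[v]
            best = max(scores.values())
            # Greedy step: flip one of the best-scoring variables (ties at random).
            choice = rng.choice([v for v in variables if scores[v] == best])
            assignment[choice] = not assignment[choice]
        if num_satisfied(clauses, assignment) == len(clauses):
            return assignment
    # Giving up is inconclusive: it does not show the formula is unsatisfiable.
    return None


if __name__ == "__main__":
    # The example from the entry: (a or (not b)) and ((not a) or (not c)).
    formula = [[("a", True), ("b", False)], [("a", False), ("c", False)]]
    print(gsat(formula, ["a", "b", "c"], rng=random.Random(0)))
```

The `None` return value reflects the limitation noted in the entry: a local search of this kind can report a satisfying assignment when it finds one, but failing to find one within the preset limits says nothing about whether the expression is inconsistent.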
References

Cook, S. A. (1971). The complexity of theorem-proving procedures. Proc. STOC-71, pp. 151–158.
Crawford, J. M., and L. D. Auton. (1993). Experimental results on the cross-over point in satisfiability problems. Proc. AAAI-93, pp. 21–27.
Davis, M., G. Logemann, and D. Loveland. (1962). A machine program for theorem-proving. Comm. Assoc. for Comput. Mach. 5: 394–397.
Davis, M., and H. Putnam. (1960). A computing procedure for quantification theory. J. of the ACM 7: 201–215.
Garey, M. R., and D. S. Johnson. (1979). Computers and Intractability, A Guide to the Theory of NP-Completeness. New York: W. H. Freeman.
Glover, F. (1989). Tabu search—part I. ORSA Journal on Computing 1(3): 190–206.
Holland, J. H. (1975). Adaptation in Natural and Artificial Systems. Ann Arbor: University of Michigan Press.
Kirkpatrick, S., C. D. Gelatt, and M. P. Vecchi. (1983). Optimization by simulated annealing. Science 220: 671–680.
Kirkpatrick, S., and B. Selman. (1994). Critical behavior in the satisfiability of random Boolean expressions. Science 264: 1297–1301.
Lin, S. (1965). Computer solutions of the traveling salesman problem. BSTJ 44 (10): 2245–2269.
Lin, S., and B. W. Kernighan. (1973). An effective heuristic for the traveling-salesman problem. Oper. Res. 21: 498–516.
Minton, S., M. Johnston, A. B. Philips, and P. Laird. (1992). Minimizing conflicts: a heuristic repair method for constraint satisfaction and scheduling problems. Artificial Intelligence 58: 161–205.
Mitchell, D., B. Selman, and H. J. Levesque. (1992). Hard and easy distributions of SAT problems. Proc. AAAI-92, pp. 459–465.
Papadimitriou, C. H., and K. Steiglitz. (1982). Combinatorial Optimization. Englewood Cliffs, NJ: Prentice-Hall.
Selman, B., H. A. Kautz, and B. Cohen. (1994). Noise strategies for improving local search. Proc. AAAI-94, pp. 337–343.
Selman, B., H. J. Levesque, and D. Mitchell. (1992). A new method for solving hard satisfiability problems. Proc. AAAI-92, pp. 440–446.

Grice, H. Paul

H. Paul Grice (1913–1988), the English philosopher, is best known for his contributions to the theory of meaning and communication. This work (collected in Grice 1989) has had lasting importance for philosophy and linguistics, with implications for cognitive science generally. His three most influential contributions concern the nature of communication, the distinction between speaker’s meaning and linguistic meaning, and the phenomenon of conversational IMPLICATURE.

Grice’s concept of speaker’s meaning was an ingenious refinement of the crude idea that communication is a matter of intentionally affecting another person’s psychological states. He discovered that there is a distinctive, rational means by which the effect is achieved: by way of getting one’s audience to recognize one’s intention to achieve it. The intention includes, as part of its content, that the audience recognize this very intention by taking into account the fact that they are intended to recognize it. A communicative intention is thus a self-referential, or reflexive, intention. It does not involve a series of nested intentions: the speaker does not have an intention to convey something and a further intention that the first be recognized, for then this further intention would require a still further intention that it be recognized, and so on ad infinitum. Confusing reflexive with iterated intentions, to which even Grice himself was prone, led to an extensive literature replete with counterexamples to ever more elaborate characterizations of the intentions required for genuine communication (see, e.g., Strawson 1964 and Schiffer 1972), and to the spurious objection that it involves an infinite regress (see Sperber and Wilson 1986, whose own RELEVANCE theory neglects the reflexivity of communicative intentions). Although the idea of reflexive intentions raises subtle issues (see the exchange between Recanati 1986 and Bach 1987), it clearly accounts for the essentially overt character of communicative intentions, namely, that their fulfillment consists of their recognition (by the intended audience). This idea forms the core of a Gricean approach to the theory of speech acts, including nonliteral and indirect speech acts (Bach and Harnish 1979). Different types of speech acts (statements, requests, apologies, etc.) may be distinguished by the type of propositional attitude (belief, desire, regret, etc.) being expressed by the speaker.

Grice’s distinction between speaker’s and linguistic MEANING reflects the fact that what a speaker means in uttering a sentence often diverges from what the sentence itself means. A speaker can mean something other than what the sentence means, as in “Nature abhors a vacuum,” or something more, as in “Is there a doctor in the house?” Grice invoked this distinction for two reasons. First, he thought linguistic meaning could be reduced to (standardized) speaker’s meaning. This reductive view has not gained wide acceptance, because of its extreme complexity (see Grice 1989: chaps. 6 and 14, and Schiffer 1972) and because it requires the controversial assumption that language is essentially a vehicle for communicating thoughts rather than a medium of thought itself. Even so, many philosophers would at least concede that mental content is a more fundamental notion than linguistic meaning, and perhaps even that SEMANTICS reduces to the psychology of PROPOSITIONAL ATTITUDES.

Grice’s other reason for invoking the distinction between speaker’s and linguistic meaning was to combat extravagant claims, made by so-called ordinary language philosophers, about various important philosophical terms, such as “believes” or “looks.” For example, it was sometimes suggested that believing implies not knowing, because to say, for example, “I believe that alcohol is dangerous” is to imply that one does not know this, or to say “The sky looks blue” is to imply that the sky might not actually be blue. However, as Grice pointed out, what carries such implications is not what one is saying but that one is saying it (as opposed to the stronger “I know that alcohol is dangerous” or “The sky is blue”). Grice also objected to certain ambiguity claims, for instance that “or” has an exclusive as well as inclusive sense, as in “I would like an apple or an orange,” by pointing out that it is the use of “or,” not the word itself, that carries the implication of exclusivity. Grice’s Modified Occam’s Razor (“Senses are not to be multiplied beyond necessity”) cut back on a growing conflation of (linguistic) meaning with use, and has since helped linguists appreciate the importance of separating, so far as possible, the domain of PRAGMATICS from semantics.

Conversational implicature is a case in point. What a speaker implicates is distinct from what the speaker says and from what his words imply. Saying of an expensive dinner, “It was edible,” implicates that it was mediocre at best. This simple example illustrates a general phenomenon: a speaker can say one thing and manage to mean something else or something more by exploiting the fact that he may be presumed to be cooperative, in particular, to be speaking truthfully, informatively, relevantly, and otherwise appropriately. The listener relies on this presumption to make a contextually driven inference from what the speaker says to what the speaker means. If taking the utterance at face value is incompatible with this presumption, one may suppose that the speaker intends one to figure out what the speaker does mean by searching for an explanation of why the speaker said what he said.

Although Grice’s distinction between what is said and what is implicated is not exhaustive (for what it omits, see Bach 1994), the theoretical strategy derived from it aims to reduce the burden on semantics and to explain a wide range of nonsemantic phenomena at an appropriate level of generality. This strategy has had lasting application to a wide range of problems in philosophy of language as well as other areas of philosophy, such as epistemology and ethics, and to various areas of research in linguistics and computer science, such as the LEXICON, ANAPHORA, DISCOURSE, and PLANNING. Economy and plausibility of theory require heeding Grice’s distinction between speaker’s and linguistic meaning, and the correlative distinction between speaker’s and linguistic reference. Rather than overly attribute features to specific linguistic items, one can proceed on the default assumption that uses of language can be explained in terms of a core of linguistic meaning together with general facts about rational communication.

See also FOLK PSYCHOLOGY; FREGE, GOTTLOB; LANGUAGE AND COMMUNICATION; SENSE AND REFERENCE

—Kent Bach

References

Bach, K. (1987). On communicative intentions: a reply to Recanati. Mind and Language 2: 141–154.
Bach, K. (1994). Conversational implicature. Mind and Language 9: 124–162.
Bach, K., and R. M. Harnish. (1979). Linguistic Communication and Speech Acts. Cambridge, MA: MIT Press.
Grice, P. (1989). Studies in the Way of Words. Cambridge, MA: Harvard University Press.
Recanati, F. (1986). On defining communicative intentions. Mind and Language 1: 213–242.
Schiffer, S. (1972). Meaning. Oxford: Oxford University Press.
Sperber, D., and D. Wilson. (1986). Relevance. Cambridge, MA: Harvard University Press.
Strawson, P. F. (1964). Intention and convention in speech acts. Philosophical Review 73: 439–460.

Further Readings

Carston, R. (1988). Implicature, explicature, and truth-theoretic semantics. In R. Kempson, Ed., Mental Representations: The Interface between Language and Reality. Cambridge: Cambridge University Press. Reprinted in Davis (1991), pp. 33–51.
Davis, S., Ed. (1991). Pragmatics: A Reader. Oxford: Oxford University Press.
Grandy, R., and R. Warner, Eds. (1986). Philosophical Grounds of Rationality: Intentions, Categories, Ends. Oxford: Oxford University Press.
Harnish, R. M. (1976). Logical form and implicature. In T. Bever, J. Katz, and T. Langendoen, Eds., An Integrated Theory of Linguistic Ability. New York: Crowell. Reprinted in Davis (1991), pp. 316–364.
Horn, L. (1984). Toward a new taxonomy for pragmatic inference: Q-based and R-based implicature. In D. Schiffrin, Ed., Meaning, Form, and Use in Context. Washington, DC: Georgetown University Press, pp. 11–42.
Levinson, S. (Forthcoming). Default Meanings: The Theory of Generalized Conversational Implicature.
Lewis, D. (1979). Scorekeeping in a language game. Journal of Philosophical Logic 8: 339–359.
Neale, S. (1992). Paul Grice and the philosophy of language. Linguistics and Philosophy 15: 509–559.
Recanati, F. (1989). The pragmatics of what is said. Mind and Language 4: 295–328. Reprinted in Davis (1991), pp. 97–120.

Gustation

See TASTE

Haptic Perception

The haptic sensory modality is based on cutaneous receptors lying beneath the skin surface and kinesthetic receptors found in muscles, tendons, and joints (Loomis and Lederman 1986). The haptic modality primarily provides information about objects and surfaces in contact with the perceiver, although heat and vibration from remote sources can be sensed (see also PAIN). Haptic perception provides a rich representation of the perceiver’s proximal surroundings and is critical in guiding manipulation of objects.

Beneath the surface of the skin lie a variety of structures that mediate cutaneous (or tactile) perception (see, e.g., Bolanowski et al. 1988; Cholewiak and Collins 1991). These include four specialized end organs: Meissner corpuscles, Merkel disks, Pacinian corpuscles, and Ruffini endings. There is substantial evidence that these organs play the role of mechanoreceptors, which transduce forces applied to the skin into neural signals. The mechanoreceptors can be functionally categorized by the size of their receptive fields (large or small) and their temporal properties (fast adapting, FA, or slowly adapting, SA). The resulting 2 × 2 classification comprises (1) FAI receptors, which are rapidly adapting, have small receptive fields, and are believed to correspond to the Meissner corpuscles; (2) FAII receptors,
which are rapidly adapting, have large receptive fields, and likely correspond to the Pacinian corpuscles (hence also called PCs); (3) SAI receptors, which are slowly adapting, have small receptive fields, and likely correspond to the Merkel disks; and (4) SAII receptors, which are slowly adapting, have large receptive fields, and likely correspond to the Ruffini endings. Among other cutaneous neural populations are thermal receptors that respond to cold or warmth.

By virtue of differences in their temporal and spatial responses, the various mechanoreceptors mediate different types of sensations. The Pacinian corpuscles have a maximum response for trains of impulses on the order of 250 Hz and hence serve to detect vibratory signals, like those that arise when very fine surfaces are stroked or when an object is initially contacted. The SAI receptors, by virtue of their sustained response and relatively fine spatial resolution, are implicated in the perception of patterns pressed into the skin, such as braille symbols (Phillips, Johansson, and Johnson 1990). The SAIs also appear to mediate the perception of roughness, when surfaces have raised elements separated by about 1 mm or more (Connor and Johnson 1992; see TEXTURE).

The responses of haptic receptors are affected by movements of the limbs, which produce concomitant changes in the nature of contact between the skin and touched surfaces. This dependence of perception on movement makes haptic perception active and purposive. Characteristic, stereotyped patterns of movement arise when information is sought about a particular object property. For example, when determining the roughness of a surface, people typically produce motion laterally between the skin and the surface, by stroking or rubbing. Such a specialized movement pattern is called an exploratory procedure (Lederman and Klatzky 1987).

An exploratory procedure is said to be associated with an object property if it is typically used when information about that property is called for. A number of exploratory procedures have been documented. In addition to the lateral motion procedure associated with surface texture, there is unsupported holding, used to sense weight; pressure, used to sense compliance; enclosure, used to sense global shape and volume; static contact, used to determine apparent temperature; and contour following, used to determine precise shape. The exploratory procedure associated with a property during free exploration also turns out to be optimal, in terms of speed and/or accuracy, or even necessary (in the case of contour following), for extracting information about that property; an exploratory procedure that is optimal for one property may also deliver relatively coarse information about others (Lederman and Klatzky 1987).

The exploratory procedures appear to optimize perception of an object property by facilitating a computational process that derives that property from sensory signals. For example, the exploratory procedure called static contact promotes perception of surface temperature, because it characteristically involves a large skin surface and therefore produces a summated signal from spatially distributed thermal receptors (Kenshalo 1984). Texture perception is enhanced by lateral motion of the skin across a surface, because the scanning motion increases the response of the SA units (Johnson and Lamb 1981). It has been proposed that weight can be judged by wielding an object (as occurs during unsupported holding), because the motion provides information about the object’s resistance to rotation, which is related to its mass and volume (Amazeen and Turvey 1996).

With free exploration, familiar common objects can usually be identified haptically (i.e., without vision) with virtually no error, within a period of 1–2 s (Klatzky, Lederman, and Metzger 1985; see also OBJECT RECOGNITION). The sequence of exploratory procedures during identification appears to be driven both by the goal of maximizing bottom-up information and by top-down hypothesis testing. Object exploration tends to begin with general-purpose procedures, which provide coarse information about multiple object properties, and proceed to specialized procedures, which test for idiosyncratic features of the hypothesized object (Lederman and Klatzky 1990).

Although haptic object identification usually has a time course of seconds, considerable information about objects can be acquired from briefer contact. Intensive properties of objects, those that can be coded unidimensionally (i.e., not with respect to layout in 2-D or 3-D space), can be extracted with minimal movement of the fingers and in parallel across multiple fingers (Lederman and Klatzky 1997). When an array of surface elements is simultaneously presented across multiple fingers, the time to determine whether an intensively coded target feature (e.g., a rough surface) is present can average on the order of 400 ms, including response selection and motor output. Properties extracted during such early touch can form the basis for object identification: a 200-ms period of contact, without finger movement, is sufficient for identification at levels above chance (Klatzky and Lederman 1995).

A critical role for haptic perception is to support manipulatory actions on objects (see also MOTOR CONTROL). When an object is lifted, signals from cutaneous afferents allow a grip force to be set to just above the threshold needed to prevent slip (Westling and Johansson 1987). During lifting, incipient slip is sensed by the FA receptors, leading to corrective adjustments in grip force (Johansson and Westling 1987). Adjustments also occur during initial contact in response to perceived object properties such as coefficient of friction (Johansson and Westling 1987). Age-related elevations in cutaneous sensory thresholds lead older adults to use grip force that is substantially greater than the level needed to prevent slip (Cole 1991).

See also ECOLOGICAL PSYCHOLOGY; MANIPULATION AND GRASPING; SMELL; TASTE

—Roberta Klatzky

References

Amazeen, E. L., and M. T. Turvey. (1996). Weight perception and the haptic size-weight illusion are functions of the inertia tensor. Journal of Experimental Psychology: Human Perception and Performance 22: 213–232.
Bolanowski, S. J., Jr., G. A. Gescheider, R. T. Verrillo, and C. M. Checkosky. (1988). Four channels mediate the mechanical aspects of touch. Journal of the Acoustical Society of America 84(5): 1680–1694.
Cholewiak, R., and A. Collins. (1991). Sensory and physiological bases of touch. In M. A. Heller and W. Schiff, Eds., The Psychology of Touch. Mahwah, NJ: Erlbaum, pp. 23–60.
Cole, K. J. (1991). Grasp force control in older adults. Journal of Motor Behavior 23: 251–258.
Connor, C. E., and K. O. Johnson. (1992). Neural coding of tactile texture: Comparison of spatial and temporal mechanisms for roughness perception. The Journal of Neuroscience 12: 3414–3426.
Johansson, R. S., and G. Westling. (1987). Signals in tactile afferents from the fingers eliciting adaptive motor responses during precision grip. Experimental Brain Research 66: 141–154.
Johnson, K. O., and G. D. Lamb. (1981). Neural mechanisms of spatial tactile discrimination: Neural patterns evoked by Braille-like dot patterns in the monkey. Journal of Physiology 310: 117–144.
Kenshalo, D. R. (1984). Cutaneous temperature sensitivity. In W. W. Dawson and J. M. Enoch, Eds., Foundations of Sensory Science. Berlin: Springer, pp. 419–464.
Klatzky, R., S. Lederman, and V. Metzger. (1985). Identifying objects by touch: an “expert system.” Perception and Psychophysics 37: 299–302.
Klatzky, R. L., and S. J. Lederman. (1995). Identifying objects from a haptic glance. Perception and Psychophysics 57(8): 1111–1123.
Lederman, S. J., and R. L. Klatzky. (1987). Hand movements: a window into haptic object recognition. Cognitive Psychology 19: 342–368.
Lederman, S. J., and R. L. Klatzky. (1990). Haptic object classification: knowledge driven exploration. Cognitive Psychology 22: 421–459.
Lederman, S. J., and R. L. Klatzky. (1997). Relative availability of surface and object properties during early haptic processing. Journal of Experimental Psychology: Human Perception and Performance 23: 1680–1707.
Loomis, J., and S. Lederman. (1986). Tactual perception. In K. Boff, L. Kaufman, and J. Thomas, Eds., Handbook of Human Perception and Performance. New York: Wiley, pp. 1–41.
Phillips, J. R., R. S. Johansson, and K. D. Johnson. (1990). Representation of braille characters in human nerve fibres. Experimental Brain Research 81: 589–592.
Westling, G., and R. S. Johansson. (1987). Responses in glabrous skin mechanoreceptors during precision grip in humans. Experimental Brain Research 66: 128–140.

Further Readings

Heller, M. A., and W. Schiff. (1991). The Psychology of Touch. Mahwah, NJ: Erlbaum.
Jeannerod, M., and J. Grafman, Eds. (1997). Handbook of Neuropsychology, vol. 11. (Section 16: Action and Cognition).

GRAMMAR to which Sag and Wasow (1998) offers an elementary introduction. Two assumptions underlie the theory of head-driven phrase structure grammars. The first is that languages are systems of types of linguistic objects like word, phrase, clause, person, index, form-type, content, rather than collections of sentences. The other is that grammars are best represented as process-neutral systems of declarative constraints (as opposed to constraints defined in terms of operations on objects as in transformational grammar). Representations are structurally uniform: all objects of a particular type have all and only the attributes defined for that type. What attributes are defined for an object type is restricted empirically, not by a priori conditions; they cover phonological, semantic, structural, contextual, formal and selectional (subcategorizational) properties.

A grammar (and for that matter, a theory of universal grammar) is thus seen as consisting of an inheritance hierarchy of such types (an “is-a” hierarchy similar to familiar semantic networks of the sort that have “creature” as a root and progressively more specific nodes on a branch leading to a particular canary “Tweety”). The types are interrelated in two ways. First, some types are defined in terms of other types. Second, the hierarchy allows for multiple inheritance, in that linguistic objects can belong to multiple categories at the same time, just as other conceptual objects do. The constraints in the linguistic hierarchy are all local, so that well-formedness is determined exclusively with reference to a given structure, and not by comparison to any other candidate structures. The LEXICON is a rich subhierarchy within the larger hierarchy constituting the grammar. Having declarative constraints on a hierarchy of interrelated types of linguistic objects is seen as enabling an account of language processing which is incremental and pervasively integrative. Thus, as long as information about grammatical number is consistent, it does not matter whether it comes from a verb or its subject, as shown by the fact that (1–3) are acceptable, whereas (4) is not.

1. The dogs slept in the barn.
2. The sheep which was mine stayed in the pen.
3. The sheep which stayed in the pen were mine.
4. *The sheep which was mine are in the pen.

Linguistic objects are modeled as feature structures. Feature structures are complete specifications of values for all
the attributes that are appropriate for the particular sort of Amsterdam: Elsevier. object that they model, and they are the entities constrained Katz, D. (1989). The World of Touch. L. E. Krueger, Ed., Mahwah, by the grammar. Feature structure descriptions describe NJ: Erlbaum. classes of feature structures, by means of familiar attribute- Nicholls, H. R., Ed. (1992). Advanced Tactile Sensing for Robot- ics. River Edge, NJ: World Scientific. and-value matrices (AVMs) that (partially) describe them. A Schiff, W., and E. Foulke. (1982.) Tactual Perception: A Source- partial description constrains all the members of whatever book. New York: Cambridge University Press. class of feature structures it describes, while a total descrip- Wing, A. M., P. Haggard, and J. R. Flanagan, Eds. (1996). Hand tion is a constraint that limits the class to a single member. and Brain: The Neurophysiology and Psychology of Hand For the most part, grammar specification deals with general- Movements. San Diego: Academic Press. izations over classes of objects like words and phrase-types, and therefore with (partial) feature structure descriptions. Head-Driven Phrase Structure Grammar Feature-based unification grammar formalisms like HPSG are thus conceptually lean and computationally tractable, Head-driven phrase structure grammar (HPSG) is a lexi- and are being used in increasing numbers of NATURAL LAN- calist, constraint-based family of theories of GENERATIVE GUAGE PROCESSING systems. Head-Driven Phrase Structure Grammar 363 A feature’s value is of one of four possible types: atom, of the subject element that its complement subcategorizes feature structure, set of feature structures, or list of fea- for, and assigns a semantic role to that index, whereas a rais- ture structures. (Set values are represented as sequences ing verb just subcategorizes for whatever its infinitive VP within curly brackets: SLASH { 1 2 }. 
The empty set is complement subcategorizes for, and assigns no semantic denoted: { }while { [ ] } denotes a singleton set. List val- role to the index of that element. ues are represented as sequences within angled brackets: The general outlines of the HPSG treatment of COMPS < NP, VP[inf ] >. The empty list is denoted: < >, unbounded extractions (WH-MOVEMENT) follow the three- and < [ ] > denotes a singleton list.) Values that are not part strategy developed in GPSG (Gazdar 1981; Gazdar et specified in a feature-structure description are still con- al. 1985). An extra constituent is licensed just in case it strained to be among the legitimate values for the features matches a missing constituent. Something must ensure that the constraints on the types to which it belongs that the missing constituent is missing. The correspon- require. dence between the gap and the extra constituent (the filler) Like other linguistic objects, categories that figure in the is recorded via constraints on local (i.e., depth one) con- SYNTAX have rich internal structure and constituency stituency relations over an indefinitely large array of descriptions. But HPSG is a “WYSIWYG” theory; empty structure. categories are avoided rather than exploited. In HPSG, the extra constituent is licensed in strong The general outlines of the HPSG approach to constitu- (topicalization-type) extractions by the schema or sort dec- ent order derive from the theory of linear precedence rules laration that defines head-filler clauses (topicalization struc- sketched in GPSG (Gazdar and Pullum 1981; Gazdar et al. tures), and for weak extraction phenomena such as tough- 1985), and discussed at some length in Pollard and Sag constructions, by subcategorization and sort specifications (1987). As in GPSG, so-called free word order (i.e., free that require a complement daughter to not be lexically real- phrase order) is a consequence of not constraining the order ized. 
Gaps (or traces) are licensed in phrases by constraints of constituents at all. (Genuinely free word order, where or rules that allow dependents to be unrealized when the (any) words of one phrase can precede (any) words of any lexical head that selects them inherits information that a other phrase requires a word-order function that allows con- matching element should be missing. As in GPSG, “a linked stituents of one phrase to be recursively interleaved with series of local mother-daughter feature correspondences” constituents of another; see Gazdar and Pullum 1981; Pol- (Gazdar et al. 1985: 138), embodied as constraints on lard and Sag 1987; Dowty 1996; Reape 1994). phrase-types, entail that the extra constituent and the miss- As grammar-writing research on a number of lan- ing constituent match. guages (especially notably, German and French) has The HPSG account of the binding of indexical elements made abundantly clear, word order constraints are not like her and themselves is stated in terms of the relative always compatible with the semantic and syntactic evi- obliqueness of the GRAMMATICAL RELATIONS of the indexi- dence for constituency, and the exact form of the resolu- cal and its antecedent relative to a predicate. Considering its tion to this dilemma constitutes a lively topic in current nonconfigurational approach, the HPSG binding theory none- research. theless resembles familiar configurational accounts: Constraints on phrase types project meanings, subcatego- • A locally commanded anaphor must be locally o-bound. rization requirements, and head properties from subconstitu- • A personal pronoun must be locally o-free. ents. The HEAD-feature principle, for example, represented • A non-pronoun must be o-free. in figure 1, constrains HEAD properties of a phrase (i.e., cat- However, it differs crucially from typical configurational egory information like person, number, case, inflection) to be accounts in that it has an inherently narrower scope. 
Princi- the same as that of its head daughter. ple A does not constrain all anaphors to be locally o-bound Constraints on phrase types also provide COMPOSITION- (coindexed to something before them on a predicate’s argu- ALITY in the semantics by specifying how the semantics of a ment-structure list); it constrains only those that are locally phrase type is a function of the semantics of its daughter o-commanded (i.e., the ones that are noninitial on the list). constituents. This makes strong, vulnerable, and apparently correct Equi and raising structures (like Kim tried to run and claims. First, pronouns that are initial elements on argu- Kim seemed to run, respectively) are both projections of ment-structure lists are unconstrained—free to be anaphors, heads that subcategorize for an unsaturated predicative com- coindexed to anything, and vacuously satisfying principle plement, and have the same sorts of constituent structure. A, or to be pronouns, substantively satisfying principle B. Equi verbs like try, however, systematically assign one more Thus, the theory predicts that phrases in these “exempt” semantic role than raising verbs like seem do. Pollard and conditions, which are coindexed to anything anywhere in a Sag (1994) represent this difference by saying that an equi higher clause, or even outside the sentence altogether, can verb subcategorizes for an NP with a referential index (i.e., be either anaphors or pronouns. This is correct; the reflexive one that is not an expletive), which is the same as the index pronouns that contradict the naive versions of principle A are generally replaceable with personal pronouns with the SYNSEM | LOCAL | CATEGORY | HEAD 1 same reference. HEAD-DTR <[SYNSEM | LOCAL | CATEGORY | HEAD 1 ]> Unification-based, declarative models of grammar like headed-phrase HPSG are attractive for natural language processing appli- cations (e.g., as interfaces to expert systems) precisely Figure 1. 
because they are nondirectional and suited to the construction of application-neutral systems serving NATURAL LANGUAGE GENERATION as well as parsing and interpretation.

See also ANAPHORA; BINDING THEORY; COMPUTATIONAL LEXICONS; COMPUTATIONAL LINGUISTICS; FORMAL GRAMMARS

-- Georgia M. Green

References
Dowty, D. (1996). Towards a minimalist theory of syntactic structure. In H. Bunt and A. van Horck, Eds., Discontinuous Constituency. Berlin: Mouton de Gruyter.
Gazdar, G. (1981). Unbounded dependencies and coordinate structure. Linguistic Inquiry 12: 155–184.
Gazdar, G., and G. K. Pullum. (1981). Subcategorization, constituent order, and the notion "head." In M. Moortgat, H. v. D. Hulst, and T. Hoekstra, Eds., The Scope of Lexical Rules. Dordrecht: Foris, pp. 107–123.
Gazdar, G., E. Klein, G. K. Pullum, and I. A. Sag. (1985). Generalized Phrase Structure Grammar. Cambridge, MA: Harvard University Press.
Pollard, C., and I. Sag. (1987). Information-based Syntax and Semantics, vol. 1. Stanford: CSLI.
Pollard, C., and I. Sag. (1994). Head-driven Phrase Structure Grammar. Chicago: University of Chicago Press.
Reape, M. (1994). Domain union and word order variation in German. In J. Nerbonne, K. Netter, and C. Pollard, Eds., German in Head-driven Phrase Structure Grammar. CSLI Lecture Notes No. 46. Stanford: CSLI.
Sag, I. A., and T. Wasow. (1998). Syntactic Theory: A Formal Introduction. Stanford: CSLI.

Further Readings
Carpenter, B. (1992). The logic of typed feature structures. Cambridge Tracts in Theoretical Computer Science 32. New York: Cambridge University Press.
Copestake, A., D. Flickinger, R. Malouf, S. Riehemann, and I. A. Sag. (1995). Translation using Minimal Recursion Semantics. Proceedings of the Sixth International Conference on Theoretical and Methodological Issues in Machine Translation (TMI-95) Leuven, Belgium.
Kay, M., J. M. Gawron, and P. Norvig. (1994). Verbmobil: A Translation System for Face-to-Face Dialog. CSLI Lecture Notes No. 33. Stanford: CSLI.
Lappin, S., and H. Gregory. (1997). A computational model of ellipsis resolution. Master's thesis, School of Oriental and African Studies, University of London. Available from website http://semantics.soas.ac.uk/ellip/.
Meurers, W. D., and G. Minnen. (1995). A computational treatment of HPSG lexical rules as covariation in lexical entries. Proceedings of the Fifth International Workshop on Natural Language Understanding and Logic Programming. Lisbon, Portugal.
Pollard, C. (1996). The nature of constraint-based grammar. Talk presented at Pacific Asia Conference on Language, Information, and Computation. Seoul, Korea: Kyung Hee University.
Pollard, C., and D. Moshier. (1990). Unifying partial descriptions of sets. Information, Language and Cognition. Vancouver Studies in Cognitive Science, vol. 1. Vancouver: University of British Columbia Press, pp. 285–322.
Shieber, Stuart. (1986). An Introduction to Unification-based Approaches to Grammar. CSLI Lecture Notes Series. Stanford: Center for the Study of Language and Information. (Distributed by University of Chicago Press.)

Head Movement

Within the syntactic framework that grew out of Chomsky (1965), elements that appear in unexpected positions are often said to have undergone movement. One case of this is wh-movement, where a maximal projection (see X-BAR THEORY) moves to Spec, CP. Heads of maximal projections may also be displaced, as seen in the following triple. "The children will not have done their homework." "The children have not done their homework." "Have the children done their homework?" The verb have appears in three different positions with respect to negation not and the subject the children. A head movement account assumes that have originates in V (head of VP), moves to T(ense) (head of TP), and then to C (head of CP).

(1) [CP C [TP the children T [VP V [VP do their homework]]]].

    [CP [C have_i] [TP [Spec the children] [T' [T t_i] [VP [V t_i] [VP done their homework]]]]]

By positing a process by which heads may be moved, languages that appear to have quite different surface realizations may be seen as having similar abstract underlying representations that are then disrupted by language-specific rules of head movement. For instance, if one assumes that VPs containing the V and the object are universal (see LINGUISTIC UNIVERSALS), one can account for VSO languages (see TYPOLOGY) by positing obligatory movement of the V to a head higher in the syntactic tree in these languages.

(2) [XP [X V_i] [TP Subj [T' [T t_i] [VP [V t_i] Object]]]]

The word order in verb second languages such as German is characterized by obligatory movement of a topic to Spec, CP and head movement of the verb to C (see PARAMETER-SETTING APPROACHES TO ACQUISITION, CREOLIZATION, AND DIACHRONY).

Another use of head movement has been to explain the tight correlation between morpheme orders and phrase structure (the mirror principle of Baker 1985) through local iterative head movement. Although the movement in English discussed earlier transposes words by moving a word into an empty head position, head movement may also move a stem into a head which contains an affix (or force movement of a word which contains this affix in MINIMALISM). For example, in Japanese, where the morpheme order is V-Tense-C as in tabe-ta-to, "eat-pst-Comp(that)," the verb has undergone the same movement that we saw for English from V to T to C, picking up the head-related MORPHOLOGY.

This use of head movement to create morphologically complex words can be further extended to account for processes such as noun incorporation, where the noun head of the object NP incorporates into the verb through head movement to form complex verbs like the Mohawk form in (3) (see POLYSYNTHETIC LANGUAGES as well as Baker 1988 and 1996).

(3) wa-hake-'sereht-uny-λ-'
    fact-agr(3sS)+agr(1sO)-car-make-ben-punc
    'He made a car for me'

Head movement as a mechanism to build complex words interacts in obvious ways with questions concerning morphology and the LEXICON (see, e.g., diSciullo and Williams 1988 for arguments against this account of incorporation).

As well as creating morphologically complex words, head movement has been used to represent words that are morphologically simple but semantically complex. Hale and Keyser (1993) have suggested that denominal and deadjectival verbs such as shelve and thin are formed through head movement. "The children shelved the books" would be derived from a structure similar to "The children put the books on the shelf." The verb and the preposition would be null, however, allowing shelve, as the head of the prepositional object NP, to move iteratively through the empty P to the empty V. (The structure below contains an extra VP; see Hale and Keyser 1993 for details.)

(4) the children [VP shelved [the books] [PP P [NP N]]].

    [VP [V shelved_i] [VP [Spec the books_j] [V' [V t_i] [PP [P t_i] [NP [N t_j]]]]]]

Like other movements, head movement is not unconstrained but must obey a locality condition. Descriptively this locality condition requires movement to the most local possible landing site (the Head Movement Constraint of Travis 1984). Baker (1988) and Rizzi (1990) have subsequently reformulated this locality condition, collapsing it with the locality condition on rules that move maximal projections. The existence of this locality condition on head movement and the similarity of this condition to the condition for movement of maximal projections strengthen the claim that head movement is one instance of a more general movement rule. Further, the fact that this locality condition shows up in noun incorporation and denominal verb formation as well as the English verb movement facts strengthens the claim that they are all part of the same phenomenon. For instance, the following strings are ungrammatical for the same reason:

(5) a. *Have the children will __ done their homework
    b. *wa-hake-'sereht-uny-λ-' wa'-ke-nohare-'
       fact-agr(3sS)+agr(1sO)-car-make-ben-punc fact-agr(1sS)-wash-punc
       'He made me wash the car' (lit: he me-car-made wash)
    c. *The children shelved the books on __.

In all three cases an intervening head position has been skipped (the T will, the V ohare, "wash," and the P on), violating the Head Movement Constraint.

If heads must always move to head positions, then head movement may be used as a probe to determine phrase structure. For instance, Pollock (1989) has argued that there must be (at least) two head positions between C and V due to the fact that, in French, there are two possible landing sites for the verb: one between the subject and negation (as in the English example above) and one between negation and an adverb. This may be extended as in Cinque (forthcoming) to posit head positions between different classes of adverbs in order to account for the possible placement of the participle rimesso (marked with an X) in the following Italian sentence.

(6) Da allora, non hanno X di solito X mica X più X sempre X completamente rimesso tutto bene in ordine
    "Since then, they haven't usually not any longer always put everything well in order"

It has also been proposed that head movement cannot proceed from a lexical category (N, V, A, P) through a functional category (T, C, D(et)) back to a lexical category, to explain why functional (grammatical) morphemes are not found in, say, causative structures (*make-fut-work; see Li 1990). A typology of head movement has also been proposed (Koopman 1983) that includes one type of head movement with the characteristics of NP-Movement (like passive and raising), and another with the characteristics of WH-MOVEMENT.

Head movement is different from maximal projection movement in that it can be seen to create both morphologically complex as well as (the meaning of) morphologically simple words, arguably putting it into direct competition with lexical and semantic rules. Yet because it shows parallel restrictions and typology to rules that permute maximal projections, it can be said to be part of the computational component of SYNTAX.

See also BINDING THEORY; GENERATIVE GRAMMAR; SYNTAX, ACQUISITION OF

-- Lisa Travis

References
Baker, M. (1985). The Mirror Principle and morphosyntactic explanation. Linguistic Inquiry 16: 373–415.
Baker, M. (1988). Incorporation: A Theory of Grammatical Function Changing. Chicago: University of Chicago Press.
Baker, M. (1996). The Polysynthesis Parameter. Oxford: Oxford University Press.
Chomsky, N. (1965). Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
Cinque, G. (Forthcoming). Adverbs and Functional Heads: A Cross-Linguistic Perspective. Oxford: Oxford University Press.
diSciullo, A.-M., and E. Williams. (1988). On the Definition of Word. Cambridge, MA: MIT Press.
Hale, K., and S. J. Keyser. (1993). On argument structure and the lexical expression of syntactic relations. In K. Hale and S. J. Keyser, Eds., The View from Building 20. Cambridge, MA: MIT Press, pp. 53–110.
Koopman, H. (1983). The Syntax of Verbs. Dordrecht: Foris.
Li, Y. (1990). X°-binding and verb incorporation. Linguistic Inquiry 21: 399–426.
Pollock, J.-Y. (1989). Verb movement, UG, and the structure of IP. Linguistic Inquiry 20: 365–424.
Rizzi, L. (1990). Relativized Minimality. Cambridge, MA: MIT Press.
Travis, L. (1984). Parameters and Effects of Word Order Variation. Ph.D. diss., Massachusetts Institute of Technology.

Further Readings
Belletti, A. (1990). Generalized Verb Movement: Aspects of Verb Syntax. Torino: Rosenberg and Sellier.
Borsley, R. D., M.-L. Rivero, and J. Stephens. (1996). Long head movement in Breton. In R. D. Borsley and I. Roberts, Eds., The Syntax of Celtic Languages. Cambridge: Cambridge University Press, pp. 53–74.
Lema, J., and M.-L. Rivero. (1989). Long head movement: ECP vs. HMC. In J. Carter, R.-M. Dechaine, B. Philip, and T. Sherer, Eds., NELS. Carnegie Mellon University: GLSA, pp. 333–347.
Lema, J., and M.-L. Rivero. (1991). Types of verbal movement in Old Spanish: modals, futures, and perfects. Probus 3: 237–278.
Lema, J., and M.-L. Rivero. (1989). Inverted conjugations and V-second effects in Romance. In C. Laeufer and T. Morgan, Eds., Theoretical Analyses in Contemporary Romance Linguistics: Selected Papers from the Nineteenth Symposium on Romance Linguistics. London: John Benjamins, pp. 311–328.
Rivero, M.-L. (1991). Long head movement and negation: Serbo-Croatian vs Slovak and Czech. The Linguistic Review 8: 319–351.
Rivero, M.-L. (1994). Clause structure and V-movement in the languages of the Balkans. Natural Language and Linguistic Theory 12: 63–120.
Rizzi, L., and I. Roberts. (1989). Complex inversion in French. Probus 1: 1–30.
Roberts, I. (1994). Two types of head movement in Romance. In D. Lightfoot and N. Hornstein, Eds., Verb Movement. Cambridge: Cambridge University Press, pp. 207–242.
Roberts, I. (1991). Excorporation and Minimality. Linguistic Inquiry 22: 209–218.

Hearing

See AUDITION; AUDITORY PHYSIOLOGY; SPEECH PERCEPTION

Hebb, Donald O.

Donald Olding Hebb (1904–1985) was, during his lifetime, an extraordinarily influential figure in the discipline of psychology. His principled opposition to radical BEHAVIORISM and emphasis on understanding what goes on between stimulus and response (perception, LEARNING, thinking) helped clear the way for the cognitive revolution. His view of psychology as a biological science and his neuropsychological cell-assembly proposal rejuvenated interest in physiological psychology. Since his death, Hebb's seminal ideas have exerted an ever-growing influence on those interested in mind (cognitive science), brain (neuroscience), and how brains implement mind (cognitive neuroscience).

On graduating from Dalhousie University in 1925, Hebb aspired to write novels, but chose instead the more practical field of education and quickly became a school principal. The writings of JAMES, FREUD, and Watson stimulated his interest in psychology, and as a part-time graduate student at McGill University, Hebb was exposed to Pavlov's program. Unimpressed, Hebb was "softened up for [his] encounter with Köhler's GESTALT PSYCHOLOGY and LASHLEY's critique of reflexology." Hebb went to work with Lashley, and in 1936 completed his doctorate at Harvard on the effects of early visual deprivation on size and brightness perception in the rat. He accepted Wilder PENFIELD's offer of a fellowship at the Montreal Neurological Institute, where he explored the impact of brain injury and surgery, particularly lesions of the frontal lobes, on human intelligence and behavior. From his observations that removal of large amounts of tissue might have little impact on MEMORY and INTELLIGENCE, Hebb inferred a widely distributed neural substrate. At Queen's University, Hebb developed human and animal intelligence tests (including the "Hebb-Williams" maze) and concluded that experience played a much greater role in determining intelligence than was typically assumed (Hebb 1942).

In 1942 Hebb rejoined Lashley, who had become director of the Yerkes Laboratory of Primate Biology. There Hebb explored fear, anger, and other emotional processes in the chimpanzee (cf. EMOTION AND THE ANIMAL BRAIN). Stimulated by the intellectual climate at Yerkes, Hebb began writing a book synthesizing different lines of research into a "general theory of behavior that attempts to bridge the gap between neurophysiology and psychology" (Hebb 1949: vii). Hebb returned to McGill as professor of psychology and in 1948 was appointed chair. His book The Organization of Behavior: A Neuropsychological Theory wielded a kind of magic in the years after its appearance (Hebb 1949). It attracted many brilliant scientists into psychology, made McGill University a North American mecca for scientists interested in the brain mechanisms of behavior, led to many important discoveries, and steered contemporary psychology onto a more fruitful path.

For Hebb "the problem of understanding behavior is the problem of understanding the total action of the nervous system, and vice versa" (1949: xiv), and his advocacy of an interdisciplinary effort to solve this "neuropsychological" problem was his most general theme. When Hebb's book was published physiological psychology was in decline, and there was a growing movement in psychology to reject physiological concepts (Skinner 1938). The Organization of Behavior marked a turning point away from this trend. Metaphors, using nonbiological devices with well-understood properties, figure prominently in the history of attempts to explain behavior and thought. The mental chemistry of the British Associationists, hydraulics of psychotherapy, magnetic fields of Gestalt psychology, and the computer metaphor of information processing psychology were all fruitful to a point, but then limited and misleading. Hebb's appealingly simple alternative was to explain human and animal behavior and thought in terms of the actual device that produces them: the brain. In The Organization of Behavior, Hebb presented just such a neuropsychological theory. There were three pivotal postulates: (1) Connections between neurons increase in efficacy in proportion to the degree of correlation between pre- and postsynaptic activity. In neuroscience this corresponds to the "Hebb synapse," the first instances of which were later discovered in LONG-TERM POTENTIATION and kindling, whereas in cognitive science this postulate provides the most basic learning algorithm for adjusting connection weights in artificial NEURAL NETWORK models. (2) Groups of neurons that tend to fire together form a cell-assembly whose activity can persist after the triggering event and serves to represent it. (3) Thinking is the sequential activation of a set of cell-assemblies.

Hebb knew that his theory was speculative, vague, and incomplete. Missing from the model, for example, was neural inhibition (Milner 1957), a concept Hebb later incorporated (1959). But Hebb believed that a class of theory was needed, of which his was merely one specific form, subject to modification or rejection in the face of new evidence. Hebb's ideas were certainly fruitful in generating new evidence, as whole literatures on the role of early experience in PERCEPTUAL DEVELOPMENT (Hunt 1979), sensory deprivation (Zubek 1969), self stimulation (Olds and Milner 1954), the stopped retinal image (Pritchard, Heron, and Hebb 1960), synaptic modifiability (Goddard 1980), and learning without awareness (McKelvie 1987) were provoked or fostered by them.

When philosophy and physiology converged in the nineteenth century, psychology emerged with the promise of a science of mental life (Boring 1950). By providing a neural implementation of the Associationists' mental chemistry Hebb fulfilled this promise and laid the foundation for neoconnectionism, which seeks to explain cognitive processes in terms of connections between assemblies of real or artificial neurons.

See also COGNITIVE MODELING, CONNECTIONIST; CONDITIONING; CONDITIONING AND THE BRAIN

-- Raymond M. Klein

References
Boring, E. G. (1950). A History of Experimental Psychology. 2nd ed. New York: Appleton-Century-Crofts.
Goddard, G. V. (1980). Component properties of the memory machine: Hebb revisited. In P. W. Jusczyk and R. M. Klein, Eds., The Nature of Thought: Essays in Honor of D. O. Hebb. Hillsdale, NJ: Erlbaum, pp. 231–247.
Hebb, D. O. (1942). The effects of early and late brain injury upon test scores, and the nature of normal adult intelligence. Proceedings of the American Philosophical Society 85: 275–292.
Hebb, D. O. (1949). The Organization of Behavior: A Neuropsychological Theory. New York: Wiley.
Hebb, D. O. (1959). A neuropsychological theory. In S. Koch, Ed., Psychology: A Study of a Science, vol. 1. New York: McGraw-Hill.
Hunt, J. M. (1979). Psychological development: early experience. Annual Review of Psychology 30: 103–143.
McKelvie, S. (1987). Learning and awareness in the Hebb digits task. Journal of General Psychology 114: 75–88.
Milner, P. M. (1957). The cell assembly: Mark II. Psychological Review 64: 242–252.
Olds, J., and P. M. Milner. (1954). Positive reinforcement produced by electrical stimulation of the septal area and other regions of the rat brain. Journal of Comparative and Physiological Psychology 47: 419–427.
Pritchard, R. M., W. Heron, and D. O. Hebb. (1960). Visual perception approached by the method of stabilized images. Canadian Journal of Psychology 14: 67–77.
Skinner, B. F. (1938). The Behavior of Organisms: An Experimental Analysis. New York: Appleton-Century.
Zubek, P. (1969). Sensory Deprivation: 15 Years of Research. New York: Meredith.

Further Readings
Glickman, S. (1996). Donald Olding Hebb: Returning the nervous system to psychology. In G. Kimble, C. Boneau, and M. Wertheimer, Eds., Portraits of Pioneers in Psychology, vol. 2. Hillsdale, NJ: Erlbaum.
Hebb, D. O. (1980). D. O. Hebb. In G. Lindzey, Ed., A History of Psychology in Autobiography, vol. 8. San Francisco: W. H. Freeman.
Hebb, D. O. (1953). Heredity and environment in mammalian behavior. British Journal of Animal Behavior 1: 43–47.
Hebb, D. O. (1958). A Textbook of Psychology. Philadelphia: Saunders.
Hebb, D. O. (1955). Drives and the CNS (conceptual nervous system). Psychological Review 62: 243–254.
Hebb, D. O. (1980). Essay on Mind. Hillsdale, NJ: Erlbaum Associates.
Klein, R. M. (1980). D. O. Hebb: An appreciation. In P. W. Jusczyk and R. M. Klein, Eds., The Nature of Thought: Essays in Honour of D. O. Hebb. Hillsdale, NJ: Erlbaum, pp. 1–18.
Milner, P. M. (1986). The mind and Donald O. Hebb. Scientific American 268(1): 124–129.

Hebbian Learning

See COMPUTATIONAL NEUROSCIENCE; HEBB, DONALD O.; NEURON

Helmholtz, Hermann Ludwig Ferdinand von

Hermann Ludwig Ferdinand von Helmholtz (1821–1894) was born on August 31, 1821, in Potsdam. His father, Ferdinand Helmholtz, was a respected teacher of philology and philosophy at the gymnasium. His mother was the daughter of a Hanoverian artillery officer with the surname Penne, descended from the Quaker William Penn, founder of Pennsylvania.

After serving as an army surgeon, Helmholtz held a succession of academic positions: lecturer at the Berlin Anatomy Museum, professor of physiology at Königsberg, professor of physiology at Bonn, professor of physiology at Heidelberg, professor of physics at the Military Institute for Medicine and Surgery in Berlin, and first president of the Imperial Physico-Technical Institute in Berlin. He married and had two children, his son Richard becoming a physical chemist. During his extremely distinguished life he was ennobled by the emperor: hence the "von" in his name.

Helmholtz was no less than a hero of nineteenth-century science, making major contributions to physics and the foundations of geometry, and founding the modern science of visual and auditory perception. He formulated the principle of conservation of energy in 1847, and made significant contributions to the philosophy of non-Euclidean geometry. This fueled his rejection of the prevailing Kantian philosophy in favor of a thoroughly empirical approach to the natural and biological sciences. He was the last scholar to combine both in depth.

Helmholtz was the first to see a living human RETINA. The wonderful memory of doing so remained with him for the rest of his life. His discovery was made with the

time Müller could not believe that nerve conduction rate is slower than sound!

Helmholtz followed the English philosopher John Locke (1632–1704) in holding that sensations are symbols for external objects, no more like external objects than words used to describe them. Thus the physical world is separated from experience, and perception is only indirectly related to external events or objects. This, and Müller's law of specific energies, are basic to his theory that visual perceptions are unconscious inferences. This was a generation before Freud's unconscious mind, which also evoked much criticism as it challenged the right to blame, or indeed praise, actions that are unconscious. Yet studying unconscious processes has proved vital for investigating brain and mind, perhaps ultimately to understanding consciousness. For Helmholtz phenomena of ILLUSIONS are important evidence for understanding perceptions as inferences, depending on assumptions that may be wrong. His basic principle was: "We always think we see such objects before us as would have to be present in order to bring about the same retinal images under normal conditions of observation." So afterimages and even crude pictures are seen as objects.

Apart from the mathematical and experimental sciences as well as philosophy, he was talented in languages and in music, playing the piano. He conveyed science and something of the arts to the public with notable popular lectures that remain interesting to read.
Remarkably active through- extraordinarily useful instrument, the ophthalmoscope, out his life, he suffered occasional migraines, which inter- which he invented in 1851. He explained why the pupil is rupted his work, and hay fever, which spoiled his holidays. black—the observing eye and head gets in the way of light He traveled widely, often to the British Isles, and was a par- reaching the observed retina—so he introduced light onto ticular friend of the physicist Lord Kelvin, meeting in Glas- the retina with a part-reflecting 45° mirror, using thin gow, Scotland. He attributed his success to the unusual microscope slides for part-reflecting mirrors, and also a range of his knowledge, which indeed was exceptional. concave lens. Although Helmholtz immediately saw its Helmholtz’s death on September 8, 1894, a few days general medical significance, doctors were slow to adopt it. after his seventy-third birthday, resulted from an accidental It became his most famous invention, which set him up as a fall while on a ship bound for America, which, sadly, he scientist commanding support for any future work he chose never visited. Neither of his biographies mentions his to undertake. account of perception as unconscious inference, which after A guiding principle for Helmholtz’s physiological psy- a long delay is now seen as centrally important in current chology was his teacher Johannes Muller’s law of specific cognitive psychology. 
There should be a fuller and more energies (perhaps better called law of specific qualities): “In readable life of this major scientist and philosopher, who whatever way a terminal organ of sense may be stimulated, gave psychology a scientific basis that is still not fully the result in CONSCIOUSNESS is always of the same kind.” appreciated, and championed thoroughgoing empiricism for Various SENSATIONS are given not by different nerve signals understanding physics and biology, and even the misleading but according to which part of the “sensory” brain is stimu- yet highly suggestive phenomena of illusions. lated. The eyes, ears, and the other organs of sense convert See also AUDITION; FREUD, SIGMUND; LIGHTNESS PER- patterns of various kinds of physical energies into the same CEPTION; RATIONALISM VS. EMPIRICISM; WUNDT, WILHELM neural coding, now known to be trains of minute electrical —Richard L. Gregory impulses called action potentials, varying in frequency according to strength of stimulation. It was Helmholtz who first measured the rate of conduction of nerve, and recorded References reaction times to unexpected events. His teacher Johannes Helmholtz, H. von. (1866). Treatise on Physiological Optics. vol. Muller thought the speed must be too great to measure, 3. 3rd ed. Trans. by J. P. C. Southall. New York: Opt. Soc. probably greater than the speed of light; but Helmholtz Amer., 1924. Dover reprint, 1962. showed him to be wrong with a very simple technique. For Helmholtz, H. von. (1881). Popular Lectures. London: Longmans noninvasive measures on humans, he touched the shoulder Green. Dover reprint, 1962. or the wrist and noted the difference in reaction times. Koenigsberger, Leo. (1906). Hermann von Helmholtz. Oxford: Knowing the difference in length of nerve between shoulder Oxford University Press. Dover reprint, 1965. and wrist, it was easy to calculate the conduction rate, and M’Kendrick, John. (1899). Hermann von Helmholtz. 
London: also to find the brain’s processing delay time. For a long Fisher Unwin. Hemispheric Specialization 369 ferentially distributed (Mendelsohn 1988). In addition, lan- Hemispheric Specialization guage lateralization is not dependent on the vocal-auditory modality. Disturbances of SIGN LANGUAGE in deaf subjects The modern era of neuroscientific investigation into the are also consistently associated with left hemisphere dam- asymmetry of the cerebral hemispheres began in the 1860s age, and signing deficits are typically analogous to the lan- when localization of function within the cerebral cortex was guage deficits one observes in hearing subjects with the same thrust into the forefront of scientific thought by Paul BROCA. lesion location (Bellugi, Poizner, and Klima 1989). Broca etched out his place in history by announcing that In early decorticate patients with one hemisphere miss- language resided in the frontal lobes and that the left hemi- ing, language development proceeds relatively normally in sphere played the predominant role. Although neither of either hemisphere (Carlson et al. 1968). Language develop- these ideas originated with Broca, the recognition that the ment in the right hemisphere, while retaining its phonemic brain may be functionally asymmetric opened up new ave- and semantic abilities, has deficient syntactic competence nues of cognitive and neurobiological investigation that that is revealed when meaning is conveyed by syntactic have persisted for well over a century. This summary paper diversity, such as repeating stylistically permuted sentences will briefly describe a number of lateralized cognitive func- and determining sentence implication (Dennis and Whitaker tions, including language, FACE RECOGNITION, fine MOTOR 1976). 
These results suggest that although the right hemi- CONTROL, visuospatial skills, and EMOTIONS, and will sphere is capable of supporting language, language usage examine whether structural asymmetries in the organization does not reach a fully normal state. of cerebral cortex are related to these functional specializa- In adults who have had language develop in the dominant tions. The interested reader is referred to several thorough hemisphere, but later became available for the testing of lan- reviews on the topic of lateralization (see further readings). guage in their right hemisphere due to commissurotomy or hemispherectomy, the right hemisphere appears capable of understanding a limited amount of vocabulary, but is usually Language Lateralization unable to produce speech. In recent years speech production by the right hemisphere of commissurotomy patients has Language is perhaps the most notable and strongly lateral- also been reported, albeit in an extremely limited context ized function in the human brain. Much of our knowledge of (Baynes, Tramo, and Gazzaniga 1992). the organization of language in the brain is based on the cor- relation of behavioral deficits with the location of lesions in the neocortex of patient populations. Several language areas Motor Control and the Left Hemisphere are found to be located within the left hemisphere and the Nine out of ten individuals demonstrate a clear preference behavioral outcome of injury to these particular cerebral for using the right hand. Broca inferred that the hemisphere locations is generally predictable (e.g., Broca’s APHASIA, dominant for language would also control the dominant Wernicke’s aphasia, conduction aphasia). In other cases hand; however, it soon became clear that this was not uni- uniquely specific linguistic deficits can result. For example, versally true. 
Most studies suggest that over 95 percent of one case has been reported in which the subject showed an right-handers are left-hemisphere dominant for language; unusual disability at naming fruits and vegetables despite however, only 15 percent of left-handers show the expected normal performance on a variety of other lexical/semantic right-hemisphere dominance. Of left-handers, a full 70 per- tasks following injury to the frontal lobe and BASAL GAN- cent are left-hemisphere dominant, while the remaining 15 GLIA (Hart, Berndt, and Caramazza 1985). Recent reports percent have bilateral language abilities. describe two more patients who are able to produce a nor- Disorders of skilled movement are referred to as apraxia. mal complement of verbs, but are extremely deficient in These disorders are characterized by a loss of the ability to noun production, while a third case shows exactly the carry out familiar purposeful movements in the absence of reverse deficit. Despite the variety of deficits and lesion sensory or motor impairment. The preponderance of apraxia locations, all are associated with the left hemisphere (Dama- following left hemisphere damage has led many researchers sio and Tranel 1993). to suggest that this hemisphere may be specialized for com- Modern research techniques including regional cerebral plex motor programming. Although lesion studies argue for blood flow, POSITRON EMISSION TOMOGRAPHY (PET), func- the left hemisphere's dominance of complex motor control, tional MAGNETIC RESONANCE IMAGING (fMRI), and intraop- the lateralization of this function is not nearly as strong as erative cortical stimulation, have continued to localize that seen for language. In addition, studies of commissurot- cortical regions that are activated during language tasks and omy patients suggest that the right hemisphere is capable of further support the left hemisphere's special role in language independently directing motor function in response to visual functions. 
nonverbal stimuli without the help of the left hemisphere Although it is true that individuals can be right hemi- (Geschwind 1976). sphere, or bilaterally, dominant for language, 90 percent of the adult population (both left- and right-handed) have lan- guage functions that are predominantly located within the Right Hemisphere Specializations left hemisphere. Even in seemingly anomalous situations the The right hemisphere also plays a predominant role in sev- left hemisphere maintains its “specialized” role in language eral specialized tasks. Right hemisphere lesion patients have functions. Studies of bilingual subjects indicate that both lan- greater difficulties localizing points, judging figure from guages are located in the same hemisphere, but may be dif- 370 Hemispheric Specialization ground, and performing tasks that require stereoscopic the weight and volume of the two cerebral hemispheres depth discriminations than do patients suffering damage to were published following the discovery of the left hemi- the left hemisphere. Additionally, commissurotomy patients sphere's role in language, it was not long before the differ- show a right hemisphere advantage for a number of visuo- ences between the length of the left and right sylvian perceptual tasks (Gazzaniga 1995). Many investigators have (lateral) fissures were described. Related to this difference also reported a right hemisphere advantage for visuopercep- in sylvian fissure length are the casual reports by von tual tasks in normal subjects, but these results are controver- Economo and Horn in 1930 and later Pfeifer (1936) that sial. On the whole visuoperceptual abilities do not appear to the planum temporale, the dorsal surface of the temporal be strongly lateralized as both hemispheres are capable of lobe, is typically larger in the left hemisphere than the performing these types of low level perceptual tasks. Sev- right. 
This very specific size difference between the two eral suggestions have been made to account for the asym- hemispheres became a focus of research in the late 1960s metries that are present. One suggestion is that there is no after it was described that the left planum temporale (the right hemisphere advantage for visuoperception, but a left dorsal surface of the temporal lobe) is significantly larger hemisphere disadvantage due to that hemisphere's preoccu- than the right in 65 percent of the population (GESCHWIND pation with language functions (Corballis 1991; Gazzaniga and Levitsky 1968). Based on these studies, it was com- 1985). Other authors have reported a difference in the abil- monly accepted that a difference in the size of cortical ity of each hemisphere to process global versus local pat- regions could account for the left hemisphere's specializa- terns or in terms of a hemispheric specialization for tion for language. different spatial frequencies. The right hemisphere is typi- A recent reanalysis of this question using computer- cally much better at representing the whole object while the generated three-dimensional reconstruction techniques has left hemisphere shows a slight advantage for recognizing revealed a different story. The right lateral fissure rises dra- the parts of an object (Hellige 1995). matically at its caudal extent which results in an apparent One specific task that does show convincing evidence for foreshortening of the planum in the right hemisphere when a right hemisphere advantage is face perception. Prosopag- it is studied using the previously applied methods (i.e., nosia, the inability to recognize familiar faces, occurs more photographic tracings and slice reconstruction). Three- often following damage to the right hemisphere than the left dimensional measurements that accurately map the highly (although most cases result from bilateral damage). 
In addi- convoluted cortical surface reveal no size difference tion, commissurotomy patients have a right hemisphere between the left and right planum temporale (Loftus et al. advantage in their ability to recognize upright faces (Gazza- 1993). Thus these anatomical differences may not reflect niga and Smylie 1983; Puce et al. 1996). size differences between the hemispheres, but rather differ- In support of a facial processing asymmetry, a number of ences in gross cortical folding. cognitive studies have indicated that normal subjects attend Many modern authors have also continued to report the more to the left side of a face than the right and that the difference in the length of the sylvian fissure that borders information carried by the left side of the face is more likely the lateral aspect of the planum on the dorsal surface of the to influence a subject's response. Finally, numerous imaging temporal lobes (Rubens et al. 1976). Subsequently these studies have demonstrated right hemispheric activation findings have been corroborated in certain primate species, using a variety of facial stimuli. human fossils, infants, and, interestingly enough, in the The right hemisphere may also be superior at tasks male cat (Tan 1992). requiring spatial attention (Mangun et al. 1994). Hemine- glect patients typically do not attend to one side of space Lateralized Cortical Circuitry and do not recognize the presence of individuals in the other Although many studies have examined gross size differ- hemifield. Additionally, they ignore one side of their body ences between the two hemispheres, relatively few have and copy drawings in a manner that entirely ignores half of directly examined whether connectional or organizational the picture. This attentional deficit is more often observed specializations underlie lateralized functions. Not surpris- following right hemisphere damage. 
ingly, both neurochemical and structural differences have Studies of normal subjects, psychiatric patients, and been found between the hemispheres. lesion patients indicate that the right hemisphere is domi- Columnar organization also varies between the left and nant in the recognition and expression of emotion and is right posterior temporal areas. The left hemisphere has been preferentially activated during the experience of emotion. reported to be organized into clear columnar units, while col- Lesions of the right hemisphere are also often associated umns in the right hemisphere appear to be much less distinct with affective disorders. Many of the lesion results remain (Ong and Garey 1990; cf. COLUMNS AND MODULES). This controversial, but experimental studies do demonstrate a left difference may be related to previous reports that the left visual field/right hemisphere superiority for the recognition temporal lobe has greater columnar widths and intercolum- of emotions. nar distances. Sex differences in the density of neurons within cortical lamina have also been documented in poste- Structural Asymmetry rior temporal regions (Witelson, Glezer, and Kigar 1995), If the hemispheres are not symmetrical in their functioning and these results are beginning to support cognitive data sug- then the physical structure of the brain may also be asym- gesting that language functions in women are less lateralized metrical. Although many contradictory reports regarding than those in men (Strauss, Wada, and Goldwater 1992). Hemispheric Specialization 371 Differences in the fine dendritic structure of pyramidal Bellugi, U., H. Poizner, and E. S. Klima. (1989). Language, modality and the brain. Trends in Neuroscience 12: 380–388. cells in each hemisphere have also been reported within the Carlson, J., C. Netley, E. B. Hendrick, and J. S. Prichard. (1968). 
A frontal lobes (Scheibel 1984), and it has been suggested that reexamination of intellectual disabilities in hemispherecto- the total dendritic length of left hemisphere pyramidal cells mized patients. Transactions of the American Neurological is greater than that of right hemisphere pyramidal cells and Association 93: 198–201. that this asymmetry may decrease with age (Jacobs and Corballis, M. C. (1991). The Lopsided Ape. New York: Oxford Scheibel 1993). University Press. Cell size asymmetries have also been documented in Damasio, A. R., and D. Tranel. (1993). Nouns and verbs are these same areas. The cell size differences appear to be retrieved with differently distributed neural systems. Proceed- restricted to the largest of the large pyramidal cells within ings of the National Academy of Sciences, USA 90: 4957–4960. layer III of Broca's area and are not apparent in adjacent Davidson, R. J., Ed. (1995). Cerebral Asymmetry. Cambridge, MA: MIT Press. cortical regions (Hayes and Lewis 1995). This same size Dennis, M., and H. A. Whitaker. (1976). Language acquisition fol- difference also exists in posterior language regions, but is lowing hemidecortication: linguistic superiority of the left over spread throughout auditory areas, including the primary the right hemisphere. Brain and Language 3: 404–433. auditory cortex (Hutsler and Gazzaniga 1995). What is the Gazzaniga, M. S., and C. Smylie. (1983). Facial recognition and functional meaning of larger cell sizes? The answer is brain asymmetries: clues to underlying mechanisms. Annals of unclear, but differences in cell body size may indicate dif- Neurology 13: 536–540. ferences in the length of a cell's axon or degree of bifur- Gazzaniga, M. S. (1985). The Social Brain: Discovering the Net- cation. Thus, pyramidal cell size may be related to works of the Mind. New York: Basic Books. connectivity differences between the two hemispheres. Gazzaniga, M. S. (1995). 
Principles of human brain organization Recent studies of temporal lobe connectivity using newly- derived from split-brain studies. Neuron 14: 217–228. Geschwind, N. (1976). The apraxias. American Scientist 63: 188– developed tract-tracing methods may support this notion. 195. These studies demonstrate patchy connectivity within the Geschwind, N., and W. Levitsky. (1968). Human brain: left-right posterior segment of Brodmann's area 22 (Wernicke's area) asymmetries in temporal speech region. Science 162: 186–187. of both the left and right hemisphere. Additionally, the size Hart, J., R. S. Berndt, and A. Caramazza. (1985). Category- of individual patches is quite symmetric, but the distance specific naming deficit following cerebral infarction. Nature between individual patches of the left hemisphere is consis- 316: 439–440. tently greater than that found in the right (Schmidt et al. Hayes, T. L., and D. A. Lewis. (1995). Anatomical specilization of 1997). These connectional differences may play a role in the the anterior motor speech area: Hemispheric differences in anatomical underpinnings of temporal processing differ- magnopyramidal neurons. Brain and Language 49: 289–308. ences between the two hemispheres that could be critical in Hellige, J. B. (1993). Hemispheric Asymmetry. Cambridge, MA: Harvard University Press. asymmetric cognitive functions such as language analysis. Hellige, J. B. (1995). Hemispheric asymmetry for components of Although one might expect that symmetrical structure visual information processing. In R. J. Davidson and K. should be the norm in the human brain, symmetrical organi- Hugdahl, Eds., Brain Asymmetry. Cambridge, MA: MIT Press, zation of the body may largely be due to the requirements of pp. 99–121. locomotion (Corballis 1991). In addition to the symmetrical Hutsler, J. J., and M. S. Gazzaniga. (1995). 
Hemispheric differ- placement of the limbs, sense organs may be placed symmet- ences in layer III pyramidal cell sizes—a critical evaluation of rically so that an organism can attend and respond equally to asymmetries within auditory and language cortices. Society for both sides of the world. Brain organization for these functions Neuroscience Abstracts 21: 180.1. might mirror the body organization, but the hemispheric dis- Jacobs, B., and A. B. Scheibel. (1993). A quantitative dendritic tribution of many cognitive functions may not be constrained analysis of Wernicke's area in humans. I. Lifespan changes. Journal of Comparative Neurology 327: 83–96. in this way. Although there could be some advantage to hav- Loftus, W. C., M. J. Tramo, C. E. Thomas, R. L. Green, R. A. ing dual representations of functions not involved with loco- Nordgren, and M. S. Gazzaniga. (1993). Three-dimensional motion (for instance, in the case of damage to one side of the analysis of hemispheric asymmetry in the human superior tem- brain), these benefits are likely outweighed by the disadvan- poral region. Cerebral Cortex 3: 348–355. tages of delayed transmission across long fibers of the corpus Mangun, G. R., R. Plager, W. Loftus, S. A. Hillyard, S. J. Luck, V. callosum. When viewed in this context, it makes sense that Clark, T. Handy, and M. S. Gazzaniga. (1994). Monitoring the certain functions would become largely the domain of one visual world: hemispheric asymmetries and subcortical pro- cerebral hemisphere and that damage to the normal brain, cesses in attention. Journal of Cognitive Neuroscience 6: 265– either through unilateral lesions or commissurotomy, would 273. reveal a remarkable array of behavioral results. Mendelsohn, S. (1988). Language lateralization in bilinguals: facts and fantasy. Journal of Neurolinguistics 3: 261–292. See also BILINGUALISM AND THE BRAIN; MODULARITY Nass, R. D., and M. S. Gazzaniga. (1985). 
Cerebral lateralization AND LANGUAGE; SIGN LANGUAGE AND THE BRAIN and specialization in human central nervous system. In F. Plum, Ed., Handbook of Physiology. Bethesda, MD: The American —Michael S. Gazzaniga and Jeffrey J. Hutsler Physiological Society, pp. 701–761. Ong, Y., and L. J. Garey. (1990). Neuronal architecture of the human References temporal cortex. Anatomy and Embryology 181: 351–364. Pfeifer, R. A. (1936). Pathologie der Horstrahlung und der Corti- Baynes, K., M. J. Tramo, and M. S. Gazzaniga. (1992). Reading calen Horsphare. In O. Bumke, Ed., Foerster, O, vol. 6. Berlin: with a limited lexicon in the right hemisphere of a callosotomy Springer. patient. Neuropsychologia 30: 187–200. 372 Heuristic Search visits the states in increasing order of their distance from the Puce, A., T. Allison, M. Asgari, J. C. Gore, and G. McCarthy. (1996). Differential sensitivity of human visual cortex to faces, start. A drawback of both these algorithms is that they letterstrings and textures: a functional magnetic resonance require enough memory to hold all the states considered so imaging study. Journal of Neuroscience 16: 5205–5215. far, which is prohibitive in very large problems (see the fol- Rubens, A. B., M. W. Mahowald, and T. Hutton. (1976). Asymme- lowing). try of the lateral (Sylvian) fissures in man. Neurology 26: 620– “Depth-first search” more closely approximates how one 624. would search if actually driving in the road network, rather Scheibel, A. B. (1984). A dendritic correlate of human speech. In than planning with a map. From the current location, depth- N. Geschwind and A. M. Galaburda, Eds., Cerebral Domi- first search extends the current path by following one of the nance: The Biological Foundations. Cambridge, MA: Harvard roads until a dead end is reached. It then backtracks to the University Press, pp. 43–52. last decision point that has not been completely explored, Schmidt, K. E., W. Schlote, H. Bgratzke, T. Rauen, W. Singer, and R. A. W. 
Galuske. (1997). Patterns of long range intrinsic connectivity in auditory and language areas of the human temporal cortex. Society for Neuroscience Abstracts 23: 415.13.
Strauss, E., J. Wada, and B. Goldwater. (1992). Sex differences in interhemispheric reorganization of speech. Neuropsychologia 30: 353–359.
Tan, Ü. (1992). Similarities between sylvian fissure asymmetries in cat brain and planum temporale asymmetries in human brain. International Journal of Neuroscience 66: 163–175.
von Economo, C., and L. Horn. (1930). Uber windungsrelief, masse, und rindenarchtektonik der supratemporalfalche. Z Gest Neurol. Psychiat. 130: 678–755.
Witelson, S. F., I. I. Glezer, and D. L. Kigar. (1995). Women have greater density of neurons in posterior temporal cortex. Journal of Neuroscience 15: 3418–3428.

Heuristic Search

Heuristic search is the study of computer algorithms designed for PROBLEM SOLVING, based on trial-and-error exploration of possible solutions. Problem-solving tasks include “pathfinding problems,” game playing, and CONSTRAINT SATISFACTION.

The task of navigating in a network of roads from an initial location to a desired goal location, with the aid of a roadmap, is an example of a pathfinding problem. The “states” of the problem are decision points, or intersections of two or more roads. The “operators” are segments of road between two adjacent intersections. The navigation problem can be viewed as finding a sequence of operators (road segments) that go from the initial state (location) to the goal state (location). In a game such as CHESS, the states are the legal board configurations, and the operators are the legal moves.

A search algorithm may be systematic or nonsystematic. A systematic algorithm is guaranteed to find a solution if one exists, and may in fact guarantee a lowest-cost solution. Nonsystematic algorithms, such as GREEDY LOCAL SEARCH, EVOLUTIONARY COMPUTATION, and other stochastic approaches, are not guaranteed to find a solution. We focus on systematic algorithms here.

The simplest systematic algorithms, called “brute-force search” algorithms, do not use any knowledge about the problem other than the states, operators, and initial and goal states. For example, “breadth-first search” starts with the initial state, then considers all states one operator away from the initial state, then all states two operators away, and so on until the goal is reached. Uniform-cost search, or Dijkstra’s algorithm (Dijkstra 1959), considers the different costs of operators, or lengths of road segments in our example, and [...]

[...] and chooses a new path from there. The advantage of depth-first search is that it only requires enough memory to hold the current path from the initial state.

“Bi-directional search” (Pohl 1971) searches forward from the initial state and backward from the goal state simultaneously, until the two searches meet. At that point, a complete path from the initial state to the goal state has been found.

A drawback of all brute-force searches is the amount of time they take to execute. For example, a breadth-first search of a road network will explore a roughly circular region whose center is the initial location, and whose radius is the distance to the goal. It has no sense of where the goal is until it stumbles upon it.

Heuristic search, however, is directed toward the goal. A heuristic search, such as the “A* algorithm” (Hart, Nilsson, and Raphael 1968), makes use of a “heuristic evaluation function,” which estimates the distance from any location to the goal. For example, the straight-line distance from a given location to the goal is often a good estimate of the actual road distance. For every location visited by A*, it estimates the total distance of a path to the goal that passes through that location. This is the sum of the distance from the start to the location, plus the straight-line distance from the location to the goal. A* starts with the initial location, and generates all the locations immediately adjacent to it. It then evaluates these locations in the above way. At each step, it generates and evaluates the neighbors of the unexplored location with the lowest total estimate. It stops when it chooses a goal location.

A* is guaranteed to find a solution if one exists. Furthermore, if the heuristic function never overestimates actual cost, A* is guaranteed to find a shortest solution. For example, because the shortest path between two points is a straight line, A* using the straight-line distance heuristic function is guaranteed to find a shortest solution to the road navigation problem.

The bane of all search algorithms is called “combinatorial explosion.” In road navigation, the total number of intersections is quite manageable for a computer. Consider, however, the traveling salesman problem (TSP). Given a set of N cities, the TSP is to find a shortest tour that visits all the cities and returns to the starting city. Given N cities, there are $(N - 1)!$ different orders in which the cities could be visited. Clever algorithms can reduce the number of possibilities to $2^N$, but even if N is as small as 50, $2^N$ is approximately $10^{15}$. Even if a computer could examine a million possibilities per second, examining $10^{15}$ possibilities would take 31.7 years.

An algorithm that is well suited to problems such as the TSP is called depth-first branch-and-bound. A TSP solution is a sequence of cities. Depth-first branch-and-bound systematically searches the possible tours depth-first, adding one city at a time to a partial tour, and backtracks when a tour is completed. In addition, it keeps track of the length of the shortest complete tour found so far, in a variable α. Whenever a partial tour is found in which the sum of the trip segments already included exceeds α, we need not consider any extensions of that partial tour, inasmuch as the total cost can only be greater. In addition, a heuristic evaluation function can be used to estimate the cost of completing a partial tour. If the heuristic function never overestimates the lowest completion cost, whenever the cost of the segments so far, plus the heuristic estimate of the completion cost, exceeds α, we can backtrack. All known algorithms that are guaranteed to find an optimal solution to such “combinatorial problems” are variants of branch-and-bound.

See also ALGORITHM; COMPUTATIONAL COMPLEXITY

—Richard Korf

References

Dijkstra, E. W. (1959). A note on two problems in connexion with graphs. Numerische Mathematik 1: 269–271.
Hart, P. E., N. J. Nilsson, and B. Raphael. (1968). A formal basis for the heuristic determination of minimum cost paths. IEEE Transactions on Systems Science and Cybernetics 4(2): 100–107.
Pohl, I. (1971). Bi-directional search. In B. Meltzer and D. Michie, Eds., Machine Intelligence 6. New York: American Elsevier, pp. 127–140.

Further Readings

Bolc, L., and J. Cytowski. (1992). Search Methods for Artificial Intelligence. London: Academic Press.
Kanal, L., and V. Kumar, Eds. (1988). Search in Artificial Intelligence. New York: Springer.
Korf, R. E. (1985). Depth-first iterative-deepening: an optimal admissible tree search. Artificial Intelligence 27(1): 97–109.
Korf, R. E. (1998). Artificial intelligence search algorithms. To appear in M. J. Atallah, Ed., CRC Handbook of Algorithms and Theory of Computation. Boca Raton, FL: CRC Press.
Newell, A., and H. A. Simon. (1972). Human Problem Solving. Englewood Cliffs, NJ: Prentice-Hall.
Pearl, J. (1984). Heuristics. Reading, MA: Addison-Wesley.

Heuristics

See GREEDY LOCAL SEARCH; HEURISTIC SEARCH; JUDGMENT HEURISTICS

Hidden Markov Models

A hidden Markov model (HMM) is a widely used probabilistic model for data that are observed in a sequential fashion (e.g., over time). A HMM makes two primary assumptions. The first assumption is that the observed data arise from a mixture of K probability distributions. The second assumption is that there is a discrete-time Markov chain with K states, which is generating the observed data by visiting the K distributions in Markov fashion. The “hidden” aspect of the model arises from the fact that the state-sequence is not directly observed. Instead, one must infer the state-sequence from a sequence of observed data using the probability model. Although the model is quite simple, it has been found to be very useful in a variety of sequential modeling problems, most notably in SPEECH RECOGNITION (Rabiner 1989) and more recently in other disciplines such as computational biology (Krogh et al. 1994). A key practical feature of the model is the fact that inference of the hidden state sequence given observed data can be performed in time linear in the length of the sequence. Furthermore, this lays the foundation for efficient estimation algorithms that can determine the parameters of the HMM from training data.

A HMM imposes a simple set of dependence relations between a sequence of discrete-valued state variables S and observed variables Y. The state sequence is usually assumed to be first-order Markov, governed by a K × K matrix of transition probabilities of the form $p(S_{t+1} = i \mid S_t = j)$, that is, the conditional probability that the system will transit to state i at time t + 1 given that the system is in state j at time t (Markov 1913). The K probability distributions associated with each state, $p(Y_t \mid S_t = i)$, describe how the observed data Y are distributed given that the system is in state i. The transition probability matrix and the parameters of the K probability distributions are usually assumed not to vary over time. The independence relations in this model can be summarized by the simple graph in figure 1. Each state depends only on its predecessor, and each observable depends only on the current state. A large number of variations of this basic model exist, for example, constraints on the transition matrix to contain “null” transitions, generalizations of the first-order Markov dependence assumption, generalizations to allow the observations $Y_t$ to depend on past observations $Y_{t-2}, Y_{t-3}, \ldots$, and different flexible parametrizations of the K probability distributions for the observable Ys (Poritz 1988; Rabiner 1989).

The application of HMMs to practical problems involves the solution of two related but distinct problems. The first is the inference problem, where one assumes that the parameters of the model are known and one is given an observed data sequence $\{Y_1, \ldots, Y_T\}$. How can one calculate $p(Y_1, \ldots, Y_T \mid \text{model})$, as in speech recognition (for example) when finding which word model, from a set of word models, explains the observed data best? Or how can one calculate the most likely sequence of hidden states to have generated $\{Y_1, \ldots, Y_T\}$, as in applications such as decoding error-correcting codes, where the goal is to determine which specific sequence of hidden states is most likely to have generated the data? Both questions can be answered in an exact and computationally efficient manner by taking advantage of the independence structure of the HMM. Finding $p(Y_1, \ldots, Y_T \mid \text{model})$ can be solved by the forward-backward procedure, which, as the name implies, amounts to a forward pass through the possible hidden state sequences followed by a backward pass (Rabiner 1989).
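
As an illustration of the two inference questions just posed, the following is a minimal Python sketch, not a reference implementation. It assumes a discrete set of observation symbols, and the function names, the toy two-state parameters, and the indexing convention `trans[i][j]` (probability of moving from state i to state j) are all choices made here for illustration rather than anything taken from the entry. The first function computes the likelihood of an observation sequence with the forward pass alone, which suffices for that quantity; the second finds the most likely hidden state sequence with the standard dynamic-programming recursion usually called the Viterbi algorithm. Each performs on the order of $TK^2$ operations. Probabilities are multiplied directly, so a practical version for long sequences would need scaling or log-space arithmetic.

```python
def forward_likelihood(init, trans, emit, obs):
    """Return p(obs | model) using the forward pass, in O(T * K^2) time.

    init[i]     : probability that the chain starts in state i
    trans[i][j] : probability of moving from state i to state j (assumed convention)
    emit[i][y]  : probability of observing symbol y while in state i
    """
    K = len(init)
    # alpha[i] = p(observations so far, current state = i)
    alpha = [init[i] * emit[i][obs[0]] for i in range(K)]
    for y in obs[1:]:
        alpha = [
            emit[j][y] * sum(alpha[i] * trans[i][j] for i in range(K))
            for j in range(K)
        ]
    return sum(alpha)


def viterbi(init, trans, emit, obs):
    """Return (most likely hidden state sequence, its joint probability)."""
    K = len(init)
    # delta[i] = probability of the best state path that ends in state i
    delta = [init[i] * emit[i][obs[0]] for i in range(K)]
    backpointers = []
    for y in obs[1:]:
        step, new_delta = [], []
        for j in range(K):
            best_i = max(range(K), key=lambda i: delta[i] * trans[i][j])
            step.append(best_i)
            new_delta.append(delta[best_i] * trans[best_i][j] * emit[j][y])
        backpointers.append(step)
        delta = new_delta
    # Backward pass: follow the stored pointers from the best final state.
    last = max(range(K), key=lambda i: delta[i])
    path = [last]
    for step in reversed(backpointers):
        path.append(step[path[-1]])
    path.reverse()
    return path, delta[last]


# Hypothetical toy model: K = 2 hidden states, 3 observation symbols.
init = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]
emit = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
obs = [0, 1, 2]

likelihood = forward_likelihood(init, trans, emit, obs)
best_path, best_prob = viterbi(init, trans, emit, obs)
print(likelihood, best_path, best_prob)
```

For this toy model the most likely path is `[0, 0, 1]`. Summing over states in the forward pass and maximizing over states in the second function are the only structural differences between the two recursions, which is why both share the same cost.
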
The forward-backward procedure takes on the order of $TK^2$ operations (linear in the length of the observed sequence). Similarly, the most likely sequence of hidden states for a given observation sequence can be found by the well-known Viterbi algorithm, which is a general application of DYNAMIC PROGRAMMING to this problem (Forney 1973). It also involves a forward and backward pass through the data and also takes order of $TK^2$ operations.

Figure 1. A graphical representation of the dependencies in a Hidden Markov Model.

The second general problem to be solved in practice is that of estimation, that is, finding values for the HMM parameters: the K × K transition matrix, the parameters of the K observable probability distributions, and an initial probability distribution over the states. Implicit is the assumption that K is known (usually assumed to be the case in practice). Given K, the most widely used technique for estimation of HMM parameters is the Baum-Welch algorithm (Baum et al. 1970). The general idea behind the algorithm is as follows. Assume that the parameters are fixed at some tentative estimate. Use the forward-backward algorithm described earlier to infer the probabilities of all the possible transitions $p(S_{t+1} = j \mid S_t = i)$ between hidden states and all probabilities of observed data $p(Y_t \mid S_t = i)$, keeping the tentative parameters fixed. Now that one has estimates of the transitions and observation probabilities, it is possible to reestimate a new set of parameters in closed form. Remarkably, it can be shown that $p(Y_1, \ldots, Y_T \mid \theta_{\text{new}}) \geq p(Y_1, \ldots, Y_T \mid \theta_{\text{old}})$, that is, the likelihood of the new parameters is at least as great as that of the old, where $\theta$ denotes the particular HMM parameter set. Thus, iterative application of this two-step procedure of forward-backward calculations and parameter estimation yields an algorithm that climbs to a maximum of the likelihood function in parameter space. It turns out that this procedure is a special case of a general technique (known as the expectation-maximization or EM procedure; McLachlan and Krishnan 1997) for generating maximum-likelihood parameter estimates in the presence of hidden variables. Variations of the EM algorithm are widely used in machine learning and statistics for solving UNSUPERVISED LEARNING problems. The HMM parameter space can have many local maxima, making estimation nontrivial in practice. The Baum-Welch and EM procedures can also be generalized to maximum a posteriori and Bayesian estimation, where prior information and posterior uncertainty are explicitly modeled (e.g., Gauvain and Lee 1994).

It can also be useful to look at HMMs from other viewpoints. For example, figure 1 can be interpreted as a simple Bayesian network and, thus, a HMM can be treated in complete generality as a particular member of the Bayesian network family (Smyth, Heckerman, and Jordan 1997). An advantage of this viewpoint is that it allows one to leverage the flexible inference and estimation techniques of BAYESIAN NETWORKS when investigating more flexible and general models in the HMM class, such as more complex sequential dependencies and multiple hidden chains.

See also BAYESIAN LEARNING; FOUNDATIONS OF PROBABILITY; PROBABILISTIC REASONING

—Padhraic Smyth

References

Baum, L. E., T. Petrie, G. Soules, and N. Weiss. (1970). A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains. Ann. Math. Stat. 41: 164–171.
Forney, G. D. (1973). The Viterbi algorithm. Proceedings of the IEEE 61: 268–278.
Gauvain, J., and C. Lee. (1994). Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains. IEEE Trans. Sig. Audio Proc. 2: 291–298.
Krogh, A., M. Brown, I. S. Mian, K. Sjolander, and D. Haussler. (1994). Hidden Markov models in computational biology: applications to protein modeling. J. Mol. Bio. 235: 1501–1531.
Markov, A. A. (1913). An example of statistical investigation in the text of “Eugene Onyegin” illustrating coupling of “tests” in chains. Proc. Acad. Sci. St. Petersburg VI Ser. 7: 153–162.
McLachlan, G. J., and T. Krishnan. (1997). The EM Algorithm and Extensions. New York: Wiley.
Poritz, A. M. (1988). Hidden Markov models: a guided tour. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing 1: 7–13.
Rabiner, L. R. (1989). A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE 77: 257–285.
Smyth, P., D. Heckerman, and M. I. Jordan. (1997). Probabilistic independence networks for hidden Markov probability models. Neural Computation 9: 227–269.

Further Readings

Elliott, R. J., L. Aggoun, and J. B. Moore. (1995). Hidden Markov Models: Estimation and Control. New York: Springer.
Huang, X. D., Y. Ariki, and M. A. Jack. (1990). Hidden Markov Models for Speech Recognition. Edinburgh: Edinburgh University Press.
MacDonald, I. L., and W. Zucchini. (1997). Hidden Markov and Other Models for Discrete-Valued Time Series. New York: Chapman and Hall.

High-Level Vision

Aspects of vision that reflect influences from memory, context, or intention are considered “high-level vision,” a term originating in a hierarchical approach to vision. In currently popular interactive hierarchical models, however, it is almost impossible to distinguish where one level of processing ends and another begins. This is because partial outputs from lower-level processes initiate higher-level processes, and the outputs of higher-level processes feed back to influence processing at the lower levels (McClelland and Rumelhart 1986).
Thus, the distinctions between processes residing at high, intermediate, and low levels are difficult to draw. Indeed, substantial empirical evidence indicates that some high-level processes influence behaviors that are traditionally considered low-level or MID-LEVEL VISION. With this caveat in mind, the following topics will be considered under the heading “high-level vision”: object and face recognition, scene perception and context effects, effects of intention and object knowledge on perception, and the mental structures used to integrate across successive glances at an object or a scene.

One major focus of theory and research in high-level vision is an attempt to understand how humans manage to recognize and categorize familiar objects quickly and reliably. An adequate theory of OBJECT RECOGNITION must account for (1) the accuracy of object recognition over changes in object size, location, and orientation (preferably, this account would not posit a different memory record for each view of every object ever seen); (2) the means by which the spatial relationships between the parts or features of an object are represented (given that objects and spaces seem to be coded in different VISUAL PROCESSING STREAMS, with object processing occurring in ventral pathways and space processing occurring in dorsal pathways); and (3) the attributes of both basic-level and subordinate-level recognition (e.g., recognition of a finch as both a bird and as a specific kind of bird). Current competing object recognition theories differ in their approach to each of these factors (see Biederman 1987; Tarr 1995). According to Biederman (1987), objects are parsed into parts at concave portions of their bounding contours, and the parts are represented in memory by a set of abstract components (generalized cylinders); the claim is that these components can be extracted from an image independent of changes in orientation (up to an accidental view rendering certain component features invisible). On Biederman’s view, (1) object recognition should be robust to orientation changes as long as the same components can be extracted from the image; and (2) very few views of each object need be represented in memory. Tarr (1995) adopts a different theoretical approach, proposing that specific views of objects are represented by salient features, and that object recognition is orientation-dependent. On Tarr’s approach, multiple views of each object are stored in memory, and objects seen in new views must undergo some time-consuming process before they are recognized. The empirical evidence suggesting that object recognition is orientation-dependent is accumulating, favoring the multiple-views approach. However, evidence indicates that the concave portions of bounding contours are more important for recognition than other contour segments, supporting the idea that part structure is critically important for object recognition, consistent with an approach like Biederman’s.

A related, but independent, research focus is FACE RECOGNITION. Behavioral evidence obtained from both normal and brain damaged populations suggests that different mechanisms are used to represent faces and objects, and in particular, that holistic, configural processing seems to be more critical for face than for object recognition (e.g., Farah, Tanaka, and Drain 1995; Moscovitch, Winocur, and Behrmann 1997).

A second major problem in high-level vision is the question of how scenes are perceived and, in particular, how the semantic and spatial context provided by a scene influences the identification of the individual objects within the scene. Any effects of scene context require the interaction of spatially local and spatially global processing mechanisms; the means by which this is accomplished have yet to be identified. Research indicates that scene-consistent objects are identified faster and more accurately when placed in a contextually appropriate spatial location rather than one that is contextually inappropriate (Biederman, Mezzanotte, and Rabinowitz 1982). In addition, recent evidence (Diwadkar and McNamara 1997) suggests that scene memory is viewpoint dependent, just as object memory is orientation-dependent. Such dependencies and similarities in the processing of scenes and objects raise questions about the extent to which the mechanisms for processing scenes and objects overlap, despite the apparent specialization of the two different visual processing streams. Nevertheless, much research continues to argue for fundamental differences in the representation of spaces and objects. An example is evidence that when no semantic context is present, memory for spatial configuration is excellent under conditions in which memory for object identity is impaired (Simons 1996). It is worth pointing out that whereas context effects are prevalent in visual perception, their influence may not extend to motor responses generated on the basis of visual input (Milner and Goodale 1995). Experiments measuring motor responses raise the possibility that the different visual processing streams associated with ventral and dorsal anatomical pathways are specialized for vision and action, respectively, rather than for the visual perception of objects and spaces, as originally hypothesized.

A third question central to investigations of high-level vision concerns the mechanisms by which successive glances at an object or a scene are integrated. Phenomenologically, perception of objects and scenes seems to be holistic and fully elaborated rather than piecemeal, abstract, and schematic. Contrary to the phenomenological impressions, evidence indicates that perception is not “everywhere dense” (Hochberg 1968); instead, visual percepts are largely determined by the stimulation obtained at the locus of fixation or attention, even when inconsistent information lies nearby (Hochberg and Peterson 1987; Peterson and Gibson 1991; Rensink, O’Regan, and Clark 1997). It has been shown that the structures used to integrate the information obtained in successive glances are abstract and schematic in nature (Irwin 1996); hence, they can tolerate the integration of inconsistent information. Similarly, visual memories, assessed via mental IMAGERY research, are known to be schematic compared to visual percepts (Kosslyn 1990; Peterson 1993). One of the abiding questions in high-level vision is, given such circumstances, how can one account for the phenomenological impressions that percepts are detailed and fully elaborated? A recent appealing proposal is that the apparent richness of visual percepts is an illusion, made possible because eye movements (see EYE MOVEMENTS AND VISUAL ATTENTION) can be made rapidly to real world locations containing the perceptual details required to answer perceptual inquiries (O’Regan 1992). On this view, the world serves as an external memory, filling in and supplementing abstract percepts on demand.

Other research in high-level vision investigates various forms of TOP-DOWN PROCESSING IN VISION. Included in this domain are experiments concerning the effects of observers’ intentions on perception (where intentions are manipulated via instructions; Hochberg and Peterson 1987) and investigations of how object knowledge affects the perception of moving or stationary displays. For example, detection thresholds are lower for known objects than for their scrambled counterparts (Purcell and Stewart 1991). In addition, object recognition cues contribute to DEPTH PERCEPTION, along with the classic depth cues and the configural cues of GESTALT PERCEPTION (Peterson 1994). For moving displays, influences from object memories affect the direction in which ambiguous displays appear to move (McBeath, Morikawa, and Kaiser 1992). Moreover, although apparent motion typically seems to take the shortest path between two locations, Shiffrar and Freyd (1993) found that, under certain timing conditions, object-appropriate pathways are preferred over the shortest pathways. Much early research investigating the contributions to perception from knowledge, motivation, and intention was discredited by later research showing that the original results were due to response bias (Pastore 1949). Hence, it is important to ascertain whether effects of knowledge and intentions lie in perception per se rather than in memory or response bias. One way to do this is to measure perceptual processes on-line; another way is to measure perception indirectly by asking observers to report about variables that are perceptually coupled to the variable to which intention or knowledge refers (Hochberg and Peterson 1987). Many of these recent experiments have succeeded in localizing the effects of intention and knowledge in perception per se by using one or more of these methods; hence, they represent an advance over previous attempts to study top-down effects on perception.

It is important to point out that not all forms of knowledge or memory can influence perception and not all aspects of perception can be influenced by knowledge and memory. Consider the moon illusion, for example. When the moon is viewed near the horizon, it appears much larger than it does when it is viewed in the zenith; yet the moon itself does not change size, nor does it cover areas of different size on the viewer’s retina in the two viewing conditions. The difference in apparent size is an illusion, most likely caused by the presence of many depth cues in the horizon condition and by the absence of depth cues in the zenith condition. However, knowledge that the apparent size difference is an illusion does not eliminate or even reduce the illusion; the same is true for many illusions. The boundaries of the effects of knowledge and intentions on perception have yet to be firmly established. One possibility is that perception can be altered only by knowledge residing in the structures normally accessed in the course of perceptual organization (Peterson et al. 1996).

In summary, research in high-level vision focuses on questions regarding how context, memory, knowledge, and intention can influence visual perception. In the course of investigations into the interaction between perception and these higher-order processes, we will undoubtedly learn more about both. The result will be a deeper understanding of high-level vision and its component processes.

See also PICTORIAL ART AND VISION; SHAPE PERCEPTION; SPATIAL PERCEPTION; STRUCTURE FROM VISUAL INFORMATION SOURCES; VISUAL OBJECT RECOGNITION, AI

—Mary A. Peterson

References

Biederman, I. (1987). Recognition by components: a theory of human image understanding. Psychological Review 94: 115–147.
Biederman, I., R. J. Mezzanotte, and J. C. Rabinowitz. (1982). Scene perception: detecting and judging objects undergoing relational violations. Cognitive Psychology 14: 143–177.
Diwadkar, V. A., and T. P. McNamara. (1997). Viewpoint dependence in scene recognition. Psychological Science 8: 302–307.
Farah, M. J., J. W. Tanaka, and H. M. Drain. (1995). What causes the face inversion effect? Journal of Experimental Psychology: Human Perception and Performance 21: 628–634.
Hochberg, J. (1968). In the mind’s eye. In R. N. Haber, Ed., Contemporary Theory and Research in Visual Perception. New York: Holt, Rinehart, and Winston, pp. 309–331.
Hochberg, J., and M. A. Peterson. (1987). Piecemeal organization and cognitive components in object perception: perceptually coupled responses to moving objects. Journal of Experimental Psychology: General 116: 370–380.
Irwin, D. E. (1996). Integrating information across saccadic eye movements. Current Directions in Psychological Science 5: 94–100.
Kosslyn, S. M. (1990). Mental imagery. In D. N. Osherson, S. M. Kosslyn, and J. M. Hollerbach, Eds., Visual Cognition and Action: An Invitation to Cognitive Science, vol. 2. Cambridge, MA: MIT Press.
McBeath, M. C., K. Morikawa, and M. Kaiser. (1992). Perceptual bias for forward-facing motion. Psychological Science 3: 362–367.
McClelland, J. L., and D. E. Rumelhart. (1986). Parallel Distributed Processing: Explorations in the Microstructure of Cognition, vol. 2. Cambridge, MA: MIT Press.
Milner, A. D., and M. Goodale. (1995). The Visual Brain in Action. Oxford: Oxford University Press.
Moscovitch, M., G. Winocur, and M. Behrmann. (1997). What is special about face recognition? Nineteen experiments on a person with visual object agnosia and dyslexia but normal face recognition. Journal of Cognitive Neuroscience 9: 555–604.
O’Regan, J. K. (1992). Solving the “real” mysteries of visual perception: the world as an outside memory. Canadian Journal of Psychology 46: 461–488.
Pastore, N. (1949). Need as a determinant of perception. The Journal of Psychology 28: 457–475.
Peterson, M. A. (1993). The ambiguity of mental images: Insights regarding the structure of shape memory and its function in creativity. In B. Roskos-Ewoldsen, M. J. Intons-Peterson, and R. Anderson, Eds., Imagery, Creativity, and Discovery: A Cognitive Perspective. Amsterdam: North Holland, pp. 151–185.
Peterson, M. A. (1994). Shape recognition can and does occur before figure-ground organization. Current Directions in Psychological Science 3: 105–111.
Peterson, M. A., and B. S. Gibson. (1991). Directing spatial attention within an object: Altering the functional equivalence of shape descriptions. Journal of Experimental Psychology: Human Perception and Performance 17: 170–182.
Peterson, M. A., L. Nadel, P. Bloom, and M. F. Garrett. (1996). Space and Language. In P. Bloom, M. A. Peterson, L. Nadel, and M. F. Garrett, Eds., Language and Space. Cambridge, MA: MIT Press, pp. 553–577.
Purcell, D. G., and A. L. Stewart. (1991). The object-detection effect: configuration enhances perception. Perception and Psychophysics 50: 215–224.
Rensink, R. A., J. K. O’Regan, and J. J. Clark. (1997). To see or not to see: the need for attention to perceive changes. Psychological Science 8: 368–373.
Shiffrar, M., and J. J. Freyd. (1993). Timing and apparent motion path choice with human body photographs. Psychological Science 4: 379–384.
Simons, D. (1996). In sight, out of mind: when object representations fail. Psychological Science 5: 301–305.
Tarr, M. J. (1995). Rotating objects to recognize them: a case study on the role of viewpoint dependency in the recognition of three-dimensional objects. Psychonomic Bulletin and Review 2: 55–82.

Hippocampus

Figure 1.

The hippocampus is a brain structure located deep within the temporal lobe, surrounded by the lateral ventricle, and connected to subcortical nuclei via the fornix and to the neocortex via the parahippocampal region. Considerations
of the information-processing functions of the hippocampus highlight its position as the final convergence site for outputs from many areas of the CEREBRAL CORTEX, and its divergent outputs that return to influence or organize cortical memory representations (figure 1).

The neocortex provides information to the hippocampus only from the highest sensory processing areas, plus multimodal and LIMBIC SYSTEM cortical areas and the olfactory cortex. These inputs follow a coarse rostral-to-caudal topography arriving in the parahippocampal region, composed of the perirhinal, parahippocampal, and entorhinal cortices (Burwell, Witter, and Amaral 1995). The latter areas project onto the hippocampus itself at each of its main subdivisions, the dentate gyrus, the CA3 and CA1 components of Ammon’s horn, and the subiculum (figure 1). The main flow of information through the hippocampus involves serial connections from the dentate gyrus to CA3, CA3 to CA1, and then CA1 to the subiculum (Amaral and Witter 1989). The intrinsic hippocampal pathway partially preserves the topographical gradients of neocortical input, but there is also considerable divergence and associational connections particularly at the CA3 stage. Outputs of the subiculum, and to a lesser extent CA1, are directed back to the parahippocampal region, which in turn projects back onto the neocortical and olfactory areas that were the source of cortical inputs. These aspects of hippocampal organization maximize the potential for association of information from many cortical streams, and the potential for such associations to influence cortical processing broadly. Furthermore, the capacity for associative plasticity in the form of LONG-TERM POTENTIATION at dentate and CA1 synapses is well established, and has been related to normal rhythmic (theta) bursting activity in the hippocampus and to hippocampal memory function.

In 1957 Scoville and Milner described a patient known as H. M. who suffered profound amnesia following bilateral removal of substantial portions of both the hippocampus and parahippocampal region. H. M. demonstrated an almost complete failure to learn new material, whereas his remote autobiographical memories and short-term memory were completely intact, leading to the view that the hippocampal region plays a specific role in the consolidation of short-term memories into a permanent store. In addition, the amnesic impairment is also selective to declarative or explicit memory (cf. IMPLICIT VS. EXPLICIT MEMORY), the capacity for conscious and direct expression of both episodic and semantic memory (Corkin 1984; Squire et al. 1993; see also EPISODIC VS. SEMANTIC MEMORY). Conversely, amnesiacs demonstrate normal MOTOR LEARNING and CONDITIONING, and normal sensory adaptations and “priming” of perceptual stimuli; such forms of implicit memory occur despite their inability to recall or recognize the learning materials or the events surrounding the learning experience (see MEMORY, HUMAN NEUROPSYCHOLOGY). The development of a nonhuman primate model has demonstrated a parallel dissociation between severely impaired recognition memory and preserved acquisition of motor skills and perceptual discriminations following damage to the same hippocampal areas removed in H. M. (see MEMORY, ANIMAL STUDIES).

A central open question is precisely what role the hippocampus plays in declarative memory processing. Studies of the consequences of hippocampal damage in animals have generated several proposals about hippocampal function, each suggesting a specific form of hippocampal-dependent and hippocampal-independent memory. Perhaps the most prominent of these is the hypothesis that the hippocampus constitutes a COGNITIVE MAP, a representation of allocentric space (O’Keefe and Nadel 1978). This notion captures the multimodal nature of hippocampal inputs and accounts for deficits in place learning observed following hippocampal damage. However, this view does not account for the global amnesia observed in humans or for impairments observed on some nonspatial learning tasks in animals with hippocampal damage. A reconciliation of its declarative and spatial functions may be possible by considering a fundamental role for the hippocampus in representing relations among items in a memory network and in “flexibility” of memory expression by which all items can be accessed through any point in the network (Dusek and Eichenbaum 1997). Recent findings using both monkeys and rats have shown that animals with hippocampal damage are severely impaired when challenged to express learned relations between items in a flexible way, and the lack of such flexibility is characteristic of human amnesia (see Eichenbaum 1997; Squire 1992).

Complementary evidence about the memory processing accomplished by the hippocampus has been derived from studies of the firing patterns of cortical and hippocampal neurons in behaving animals. Recordings at successive cortical stages leading to the hippocampus reflect increasing sensory convergence, from the encoding of specific perceptual features or movement parameters in early cortical areas to that of increasingly complicated and multimodal objects and behaviors at higher cortical stages. Consistent with the view that the hippocampus is the ultimate stage of hierarchical processing, the functional correlates of hippocampal cells are “supramodal” in that they appear to encode the abstract stimulus configurations that are independent of any particular sensory input. Most prominent among the functional types of hippocampal principal neurons are cells that fire selectively when a rat is in a particular location in its environment as defined by the spatial relations among multiple and multimodal stimuli (O’Keefe 1976). The firing of such “place cells” is characteristically not dependent upon any particular stimulus element and is not affected even if all the stimuli are removed, so long as the animal behaves as if it is in the same environment. However, hippocampal neuronal activity is not limited to the encoding of spatial cues, but has also been related to meaningful movement trajectories and actions in rats as well as conditioned motor responses in restrained rabbits and monkeys. In addition, across a variety of learning tasks hippocampal neurons are activated by relevant olfactory, visual, tactile, or auditory cues, and these encodings prominently reflect nonallocentric spatial, temporal, and other relations among the cues that guide performance (Wood, Dudchenko, and Eichenbaum 1999; Deadwyler and Hampson 1997). These findings extend the range of hippocampal coding to reflect its global involvement in memory and serve to reinforce the conclusion that the hippocampus supports relational representations.

[...] physiological features have been simulated in artificial associative networks employed to accomplish distributed recodings of inputs and to perform basic computations that are reflected in hippocampal neural activity (see Gluck 1996). Some models have focused on the central features of cognitive maps, such as the ability to solve problems from partial information, take shortcuts, and navigate via novel routes. Other models have focused on sequence prediction that employs the recall of temporal patterns to accomplish spatial and nonspatial pattern completion and disambiguation, and more generally show how such network memory representations can provide the flexibility of memory expression conferred by the hippocampus.

See also MEMORY; MEMORY STORAGE, MODULATION OF; WORKING MEMORY, NEURAL BASIS OF

—Howard Eichenbaum

References

Amaral, D. G., and M. P. Witter. (1989). The three-dimensional organization of the hippocampal formation: a review of anatomical data. Neuroscience 31: 571–591.
Burwell, R. D., M. P. Witter, and D. G. Amaral. (1995). Perirhinal and postrhinal cortices in the rat: a review of the neuroanatomical literature and comparison with findings from the monkey brain. Hippocampus 5: 390–408.
Deadwyler, S. A., and R. E. Hampson. (1997). The significance of neural ensemble codes during behavior and cognition. Annual Review of Neuroscience 20: 217–244.
Dusek, J. A., and H. Eichenbaum. (1997). The hippocampus and memory for orderly stimulus relations. Proceedings of the National Academy of Sciences USA 94: 7109–7114.
Eichenbaum, H. (1997). Declarative memory: insights from cognitive neurobiology. Annual Review of Psychology 48: 547–572.
Gluck, M. A., Ed. (1996). Computational models of hippocampal function in memory. Special Issue of Hippocampus, vol. 6, no. 6.
O’Keefe, J. A. (1976). Place units in the hippocampus of the freely moving rat. Experimental Neurology 51: 78–109.
O’Keefe, J., and L. Nadel. (1978). The Hippocampus as a Cognitive Map. New York: Oxford University Press.
Scoville, W. B., and B. Milner. (1957). Loss of recent memory after bilateral hippocampal lesions. Journal of Neurology, Neurosurgery and Psychiatry 20: 11–12.
Squire, L. R. (1992). Memory and the hippocampus: a synthesis of findings with rats, monkeys, and humans. Psychological Review 99: 195–231.
Squire, L. R., B. Knowlton, and G. Musen. (1993). The structure and organization of memory. Annual Review of Psychology 44: 453–495.
Wood, E. R., P. A. Dudchenko, and H. Eichenbaum. (1999). The global record of memory in hippocampal neuronal activity. Nature (in press).
Efforts to understand how hippocampal circuitry medi- ates memory processing have focused on special aspects of Further Readings hippocampal architecture: a high convergence of sensory Cohen, N. J., and H. Eichenbaum. (1993). Memory, Amnesia, and information onto the hippocampus, sparse connectivity the Hippocampal System. Cambridge, MA: MIT Press. within the broad serial divergence and associative connec- tions across the cell population, recurrent connections that characterize dentate gyrus and CA3 pyramidal cells, the Historical Linguistics small fraction of excited afferent fibers required to drive CA1 cells, and rapid adjustments of synaptic weights at See CREOLES; LANGUAGE VARIATION AND CHANGE; TYPOLOGY each stage via long term potentiation. These anatomical and Human-Computer Interaction 379 current commercial systems. Their interfaces bear startling HMMs resemblances to those of the early Xerox Alto (Lampson 1988). What has changed since the days of the Alto, in addi- See HIDDEN MARKOV MODELS tion to the continual doubling of computing power every eighteen months and all that this doubling makes possible, Hopfield Networks is that we now have more principled understandings of how to create effective interfaces. This is primarily a result of the development and acceptance of user-centered approaches See RECURRENT NETWORKS (Norman and Draper 1986; Nielsen 1993; Nickerson and Landauer 1997) to system design. HPSG Research in human-computer interaction (Helenader, Landauer, and Prabhu 1997), as in most of the cognitive sci- ences, draws on many disciplines in that it involves both See HEAD-DRIVEN PHRASE STRUCTURE GRAMMAR people and computer technologies. The goal of creating effective and enjoyable systems for people to use makes Human-Computer Interaction HCI a design activity (Winograd 1996). 
As designed arti- facts, interface development involves what Schon (1983) Human-computer interaction (HCI) studies how people terms a reflective conversation with materials. To be effec- design, implement, and use computer interfaces. With tive, though, interfaces must be well suited to and situated in computer-based systems playing increasingly significant the environments of users. Designers must ensure that they roles in our lives and in the basic infrastructure of science, remain centered on human concerns. Thus, although HCI business, and society, HCI is an area of singular importance. can be viewed simply as an important area of software A key to understanding human-computer interaction is to design, inasmuch as interfaces account for more than 50 appreciate that interactive interfaces mediate redistribution percent of code and significant portions of design effort, it is of cognitive tasks between people and machines. much more than that. Interfaces are the locus for new inter- Designed to aid cognition and simplify tasks, interfaces active representations and make possible new classes of function as COGNITIVE ARTIFACTS. Two features distinguish computationally based work materials. interfaces from other cognitive artifacts: they provide the There are many spheres of research activity in HCI. most plastic representational medium we have ever known, Three areas are of special interest. The first draws on what and they enable novel forms of communication. Interfaces we know about human perception and cognition, coupling it are plastic in the sense that they can readily mimic represen- with task analysis methods, to develop an engineering disci- tational characteristics of other media. This plasticity in pline for HCI. 
For examples, see the early work of Card, combination with the dynamic character of computation Moran, and Newell (1983) on the psychology of human- makes possible new interactive representations and forms of computer interaction, analysis techniques based on models communication that are impossible in other media. of human performance (John and Kieras 1997), the evolving The historical roots of human-computer interaction can subdiscipline of usability engineering (Nielsen 1993), work be traced to a human information-processing approach on human error (Woods 1988; Reason 1990), and the devel- to cognitive psychology. Human information processing opment of toolkits for interface design (Myers, Hollan, and (Card, Moran, and Newell 1983; Lindsay and Norman Cruz 1996). 1977) explicitly took the digital computer as the primary A second research activity explores interfaces that metaphorical resource for thinking about cognition. HCI expand representational possibilities beyond the menus and as a field grew out of early human information-processing icons of the desktop metaphor. The new field of information research and still reflects that lineage. Just as cognitive visualization (Hollan, Bederson, and Helfman 1997) pro- psychology focused on identifying the characteristics of vides many examples. The Information Visualizer (Card, individual cognition, human-computer interaction has, until Robertson, and Mackinlay 1991), a cognitive coprocessor very recently, focused almost exclusively on single indi- architecture and collection of 3-D visualization techniques, viduals interacting with applications derived from decom- supports navigation and browsing of large information positions of work activities into individual tasks. This spaces. 
Numerous techniques are being developed to help theoretical approach has dominated human-computer inter- visualize large complex systems (Eick and Joseph 1993; action for over twenty years, leading to a computing infra- Church and Helfman 1993), gather histories of interactions structure built around the personal computer and based with digital objects (Hill and Hollan 1994; Eick and Joseph on the desktop interface metaphor. 1993), and provide interactive multiscale views of informa- The desktop metaphor and associated graphical user tion spaces (Perlin and Fox 1993; Bederson et al. 1996). interface evolved from Sutherland’s Sketchpad (Sutherland A third active research area is computer-supported coop- 1963), a seminal system that introduced the interactive erative work (CSCW). (See Olson and Olson [1997] for a graphical interface in the early 1960s. The desktop meta- recent survey.) The roots of CSCW can be traced to Engel- phor and underlying technologies on which it is based were bart’s NLS system (Engelbart and English 1994). Among cast in modern form in a number of university and industrial other things, it provided the first demonstration of computer- research centers, most notably Xerox Parc. We now see the mediated interactions between people at remote sites. CSCW legacy of that work in the ubiquitous graphical interface of takes seriously what Hutchins (1995) has termed distributed 380 Human Nature cognition to highlight the fact that most thinking tasks Hill, W. C., and J. D. Hollan. (1994). History-enriched digital objects: prototypes and policy issues. The Information Society involve multiple individuals and shared artifacts. 10: 139–145. Overall, as Grudin (1993) has pointed out, we can view Hollan, J. D., B. B. Bederson, and J. Helfman. (1997). Information the development of HCI as a movement from early concerns visualization. In M. G. Helenader, T. K. Landauer, and P. 
with low-level computer issues, to a focus on people’s indi- Prabhu, Eds., The Handbook of Human Computer Interaction. vidual tasks and how better to support them, to current con- Amsterdam: Elsevier Science, pp. 33–48. cerns with supporting collaboration and sharing of Hutchins, E. (1995). Cognition in the Wild. Cambridge, MA: MIT information within organizations. The phenomenal growth Press. of the World Wide Web and the associated changes in the John, B. E. and D. E. Kieras. (1997). Using GOMS for user inter- way we work and play are important demonstrations of the face design and evaluation: which technique? ACM Transac- impact of interface changes on sharing information. Not to tions on Computer-Human Interaction. Lampson, B. W. (1988). Personal distributed computing: the Alto minimize the importance of the underlying technologies and Ethernet software. In A. Goldberg, Ed., A History of Per- required for the Web (networks, file transfer protocols, etc.), sonal Workstations. New York: ACM Press, pp. 293–335. it is instructive to realize that they have all been available Lindsay, P. H., and D. A. Norman. (1977). Human Information since the early days of the Arpanet, the precursor to the Processing: An Introduction to Psychology. 2nd ed. New York: modern Internet. Changes at the level of the interface, mak- Academic Press. ing access to information on systems almost anywhere only Myers, B. A., J. D. Hollan, and I. F. Cruz. (1996). Strategic direc- a matter of clicking on a link, opened the Web to users and tions in human-computer interaction. ACM Computing Surveys resulted in its massive impact not only on scientific activi- 28(4): 794–809. ties but also on commercial and social interaction. Nickerson, R. S., and T. K. Landauer. (1997). Human-computer Myriad important issues, ranging from complex issues of interaction: background and issues. In G. Helenader, T. K. Lan- dauer, and P. 
Prabhu, Eds., The Handbook of Human Computer privacy and ownership of information to the challenges of Interaction. Amsterdam: Elsevier Science, pp. 3–31. creating new representations and understanding how to Nielsen, J. (1993). Usability Engineering. New York: Academic effectively accomplish what one might term urban planning Press. for electronic communities, face the HCI discipline. As long Norman, D. A., and S. Draper. (1986). User Centered System as the evaluative metric continues to be whether interfaces Design. Hillsdale, NJ: Erlbaum. help us accomplish our tasks more effectively and enjoyably Olson, G. M., and J. S. Olson. (1997). Research on computer sup- and we continue to explore the potential of new interactive ported cooperative work. In M. G. Helenader, T. K. Landauer, representations to allow us to know the world better and and P. Prabhu, Eds., The Handbook of Human Computer Inter- improve our relationships with others, the future of HCI will action. Amsterdam: Elsevier Science, pp. 1433–1456. remain bright and exciting. Perlin, K., and D. Fox. (1993). D. Pad: an alternative approach to the computer interface. In J. T. Kajiya, Ed., Computer Graphics See also MULTIAGENT SYSTEMS; SITUATEDNESS/EMBED- (SIGGRAPH ‘93 Proceedings). Vol. 27, pp. 57–64. DEDNESS Reason, J. (1990). Human Error. New York: Cambridge University Press. —James D. Hollan Schon, D. A. (1983). The Reflective Practitioner: How Profession- als Think in Action. New York: Basic Books. Sutherland, I. E. (1963). Sketchpad: a man-machine graphical References communication system. Proceedings AFIPS Spring Joint Com- Bederson, B. B., J. D. Hollan, K. Perlin, J. Meyer, D. Bacon, and puter Conference 1 23: 329–346. G. Furnas. (1996). Pad++: a zoomable graphical sketchpad for Winograd, T., Ed. (1996). Bringing Design to Software. Reading, exploring alternate interface physic. Journal of Visual Lan- MA: Addison-Wesley. guages and Computing 7: 3–31. Woods, D. D. (1988). 
Coping with complexity: the psychology of Card, S., T. Moran, and A. Newell. (1983). The Psychology of human behaviour in complex systems. In L. P. Goodstein, H. B. Human-Computer Interaction. Hillsdale, NJ: Erlbaum. Anderson, and S. E. Olsen, Eds., Task, Errors and Mental Mod- Card, S., G. Robertson, and J. Mackinlay. (1991). The information els. London: Taylor and Francis, pp. 128–148. visualizer. Proceedings of ACM CHI ’91 Conference on Human Factors in Computing Systems, pp. 181–188. Human Nature Church, K. W., and J. I. Helfman. (1993). Dotplot: a program for exploring self-similarity in millions of lines of text and code. Journal of Computational and Graphical Statistics See CULTURAL RELATIVISM; CULTURAL VARIATION; HUMAN 2(2): 153–174. UNIVERSALS Eick, S. G., and L. S. Joseph. (1993). Seesoft: a tool for visualizing line-oriented software statistics. IEEE Transactions on Soft- Human Navigation ware Engineering 18(11): 957–968. Engelbart, D., and W. A. English. (1994). Research center for augmenting human intellect. ACM Siggraph Video Review, p. Although the term navigation originates from the Latin 106. word for ship, the term has come to be used in a very gen- Grudin, J. (1993). Interface: an evolving concept. Communications eral way. It refers to the practice and skill of animals as well of the ACM 236(4): 110–119. as humans in finding their way and moving from one place Helenader, M. G., T. K. Landauer, and P. Prabhu, Eds. (1997). The to another by any means. Generally moving from place to Handbook of Human Computer Interaction. Amsterdam: Elsevier Science. place requires knowledge of where one is starting, where Human Navigation 381 the goal is, what the possible paths are, and how one is pro- tures to which Micronesian sailors attend that others might gressing during the movement. 
miss, such as slight changes of water color indicating Knowledge of where one is starting in one sense is obvi- underwater landmarks, changes in the wave patterns indi- ous; one can see the immediate surrounds. However, this is cating disturbance by nearby islands, the sighting of birds of little help if the relation of that place to the rest of the flying toward or away from islands at different times of the spatial layout is not known. This becomes of practical inter- day, and so on. est in the so-called drop-off situation, where one is dropped Across the open sea there are few constraints on what off at an unknown position and has to determine the loca- paths to take, so maintaining a steady course becomes quite tion. Practically, this could happen in a plane crash when important. Micronesian navigators have developed a form of flying over unknown territory, or more generally if one is celestial navigation in which direction at night is determined lost for any reason. in relation to the position on the horizon at which particular In such cases having a map is useful but requires match- stars rise and set. In fact, stars rising at the same point on the ing the perception of the surrounding environment with a horizon all follow the same track across the sky and set at particular position on the map. That problem can be cogni- the same place in what is called a linear constellation tively quite difficult. An observer has a particular viewpoint (Hutchins 1995). During the day the direction of the sun at of the environment that in general is rather limited. The map different times and the direction of wave patterns can be typically covers a much larger area and, of course, has an used to maintain course. infinite number of possible viewpoints. In addition, in many Most intriguing about the Micronesian system is how the locales the individual features are ambiguous; limited views islanders keep track of where along the journey they are. 
of any one hill, valley, or stream often look like others. Very The Micronesian navigators conceptualize the trip in terms experienced map readers have strategies that help them of a stationary canoe with an out-of-sight reference island overcome these difficulties (Pick et al. 1995). For example, moving past it. Thus the bearing from the canoe to such an in trying to figure out where on a topographical map they island changes as the journey progresses. Because the refer- are, successful readers focus initially on the terrain rather ence islands are generally out of sight, it makes no differ- than the map. This makes sense inasmuch as knowing the ence if the island really exists and, in fact, if there is not an details of the terrain would constrain the possibilities more actual convenient reference island an imaginary one is used. than knowledge of the big picture provided by the map. Suc- Because motion is relative, it makes no logical difference cessful readers look for configurations of features. As noted, whether one conceptualizes travel as involving a stationary any individual feature is ambiguous; configurations provide world and a moving canoe or a stationary canoe and a powerful constraints as to where one might be. moving world. Hutchins and Hinton (1984) hypothesize a Identifying possible paths from one place to another can very interesting explanation for why this conceptualization be based on prior spatial layout knowledge (see COGNITIVE makes sense on the basis of the sensory information avail- able at sea to the Micronesians. MAPS), but it can also be accomplished on the basis of maps. The changing bearing of a reference island is a conve- Depending on the type of map, constraints on possible paths nient way to keep track of where one is on a journey, as is will also be indicated, for example, roads on urban and plotting one’s position on a map or chart. 
But how does highway maps; mountains, streams, and the like on topo- one know how fast the bearing is changing or how far to graphic maps; and reefs, islands, water depths, and so forth move one’s chart position? If the environment is too on nautical charts. Many maps provide a very general and impoverished to keep track of position with respect to powerful reference system. The geographic coordinate sys- features, it must be done by somehow keeping track of tem, as Hutchins (1995) points out, not only enables specifi- one’s velocity and integrating over time to obtain distance cation of the location of starting point and destination, but moved, a procedure known as dead reckoning (Gallistel also permits easy determination of paths by graphic or 1990). Crossing undifferentiated expanses of sea or land, numerical computation. dead reckoning must be relied on for registering progress In many cases such as piloting a ship along a coast line, between celestial fixes unless modern navigational instru- navigation often involves specifying position in relation to ments are used. landmarks rather than the coordinate systems of maps or Another relevant case of impoverished environmental charts. Keeping track of one’s progress during travel is information is the very mundane activity of walking with- particularly problematic when information about position out vision. This is, of course, the common situation of blind in relation to landmarks is impoverished. One of the most people. 
They, like the Micronesian sailors, are able to keep impressive and cognitively interesting examples of sea track of their progress by attending to information often navigation is that of Micronesian islanders who tradi- ignored by others, for example odors marking particular tionally traveled from island to island across wide-open locations, and changes of air currents and acoustic reso- stretches of the South Pacific without navigational instru- nance properties that signify open passageways and the like ments. Their skill has been carefully studied by anthropol- (Welsh and Blasch 1980). However, internal information ogists (e.g., Gladwin 1970; Lewis 1978; Hutchins 1995). also specifies movement, for example by proprioception The islanders’ knowledge of the paths from one island to and/or motor commands. There is some evidence that blind another is in the form of sailing directions as to courses to people do not use this source of information to update their steer and features that are observable along the way. By position as well as do blindfolded sighted persons (Rieser, our standards, information in relation to such features is Guth, and Hill 1987; but see Loomis et al. 1993). This clearly impoverished. However, there are observable fea- 382 Human Universals advantage of sighted people may be due to optical flow examples are such disparate phenomena as tools, myths and information when walking with vision serving to calibrate legends, sex roles, social groups, aggression, gestures, nonvisual locomotion (Rieser et al. 1995). Thus movement grammar, phonemes, EMOTIONS, and psychological defense in nonvisual locomotion presents on a small scale some of mechanisms. Broadly defined universals often contain more the same problems involved in much grander global navi- specific universals, as in the case of kinship statuses, which gation. are universally included among social statuses. 
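The dead-reckoning procedure described in the Human Navigation entry, keeping track of velocity and integrating over time to obtain distance moved, can be illustrated with a minimal sketch. The Python below is an illustration under stated assumptions rather than a model drawn from the literature cited: the function names `dead_reckon` and `bearing_to`, the flat two-dimensional plane, the constant speed and heading within each leg, and all sample values are hypothetical. The second function shows how the bearing to a fixed reference point shifts as the estimated position advances, loosely paralleling the reference-island bookkeeping.

```python
import math


def dead_reckon(start, legs):
    """Estimate a track by integrating velocity over time, leg by leg.

    start: (x, y) position in arbitrary distance units; x is east, y is north.
    legs:  iterable of (speed, heading_deg, duration) tuples, with heading
           measured clockwise from north (compass convention). Speed and
           heading are assumed constant within a leg (a simplification).
    Returns the list of estimated positions, beginning with `start`.
    """
    x, y = start
    track = [(x, y)]
    for speed, heading_deg, duration in legs:
        distance = speed * duration            # constant velocity integrated over time
        theta = math.radians(heading_deg)
        x += distance * math.sin(theta)        # eastward displacement
        y += distance * math.cos(theta)        # northward displacement
        track.append((x, y))
    return track


def bearing_to(position, target):
    """Compass bearing in degrees (0 = north, 90 = east) from position to target."""
    dx = target[0] - position[0]
    dy = target[1] - position[1]
    return math.degrees(math.atan2(dx, dy)) % 360.0


if __name__ == "__main__":
    # Hypothetical voyage: two legs due east at 5 units per hour, 2 hours each.
    track = dead_reckon((0.0, 0.0), [(5.0, 90.0, 2.0), (5.0, 90.0, 2.0)])
    reference_point = (10.0, 10.0)             # hypothetical off-course reference
    for position in track:
        print(position, round(bearing_to(position, reference_point), 1))
```

Because each new position is computed from the previous estimate, any error in speed or heading carries forward from leg to leg, which is one way to see why such estimates are checked against external fixes whenever they are available.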
See also ANIMAL NAVIGATION; ANIMAL NAVIGATION, NEURAL NETWORKS; BEHAVIOR-BASED ROBOTICS; COGNITIVE ARTIFACTS; MOBILE ROBOTS; SPATIAL PERCEPTION

—Herbert Pick

References

Gallistel, C. R. (1990). The Organization of Learning. Cambridge, MA: MIT Press.
Gladwin, T. (1970). East is a Big Bird. Cambridge, MA: Harvard University Press.
Hutchins, E. (1995). Cognition in the Wild. Cambridge, MA: MIT Press.
Hutchins, E., and G. E. Hinton. (1984). Why the islands move. Perception 13: 629–632.
Lewis, D. (1978). The Voyaging Stars: Secrets of Pacific Island Navigators. New York: W. W. Norton.
Loomis, J. M., R. L. Klatzky, R. G. Golledge, J. G. Cicinelli, J. W. Pellegrino, and P. A. Fry. (1993). Nonvisual navigation by blind and sighted: assessment of path integration ability. Journal of Experimental Psychology: General 122: 73–91.
Pick, H. L., M. R. Heinrichs, D. R. Montello, K. Smith, C. N. Sullivan, and W. B. Thompson. (1995). Topographic map reading. In J. Flach, P. A. Hancock, J. K. Caird, and K. Vincente, Eds., The Ecology of Human-Machine Systems, vol. 2. Hillsdale, NJ: Erlbaum.
Rieser, J. J., D. A. Guth, and E. W. Hill. (1986). Sensitivity to perspective structure while walking without vision. Perception 15: 173–188.
Rieser, J. J., H. L. Pick, Jr., D. H. Ashmead, and A. E. Garing. (1995). Calibration of human locomotion and models of perceptual-motor organization. Journal of Experimental Psychology: Human Perception and Performance 21: 480–497.
Welsh, R. L., and B. B. Blasch. (1980). Foundations of Orientation and Mobility. New York: American Foundation for the Blind.

Further Readings

Cornell, E. H., C. D. Heth, and D. M. Alberts. (1994). Place recognition and way finding by children and adults. Memory and Cognition 22: 633–643.
Eley, M. G. (1988). Determining the shapes of land surfaces from topographical maps. Ergonomics 31: 355–376.
Maguire, E. A., N. Burgess, J. G. Donnett, R. S. J. Frackowiak, C. D. Frith, and J. O’Keefe. (1998). Knowing where and getting there: a human navigational network. Science 280: 921–924.
Thorndyke, P. W., and B. Hayes-Roth. (1982). Differences in spatial knowledge acquired from maps and navigation. Cognitive Psychology 14: 560–589.

Human Universals

Human universals comprise those features of culture, society, language, behavior, and psyche for which there are no known exceptions to their existence in all ethnographically or historically recorded human societies. Among the many

In some cases, the content of a universal is highly specific, as in the smile, frown, or other facial expressions of basic emotions (Ekman, Sorenson, and Friesen 1969) and in the more complex “coyness display” (Eibl-Eibesfeldt 1989).

Some universals have a collective referent, being found in all societies, languages, or cultures, but having a contingent relation to individuals. Thus, dance is found in all societies or cultures, but not all individuals dance. Other universals, such as a grasp of elementary logical concepts (not, and, or, kind of, greater/lesser, etc.) or the use of gestures, characterize the psyche or behavior of all (normal) individuals. Some universals, such as the predominance of women in infant socialization (Levy 1989) or the ease with which youngsters acquire language, characterize all (normal) individuals of one sex or age range.

Human universals commanded attention from the founding of academic anthropology, but for much of this century a variety of factors promoted an emphasis on cultural particulars and a deemphasis of universals and the psychobiological features that might underlie them (Brown 1991; Degler 1991; see also CULTURAL RELATIVISM). Seminal mid-century essays on human universals (Murdock 1945; Kluckhohn 1953) were followed by the emergence of COGNITIVE ANTHROPOLOGY, a fruitful field for the discovery and persuasive demonstration of universals (D’Andrade 1995). Cognitive anthropology and the study of universals in general have drawn heavily on developments in linguistics (see LINGUISTIC UNIVERSALS).

Anthropologists and linguists generally assume that claims of universality should be validated by cross-cultural or cross-language research. However, a considerable amount of research in economics, political science, psychology, and sociology implicitly assumes universality without demonstrating it (but see COMPARATIVE PSYCHOLOGY). Thus, many research findings from these fields may or may not have universal validity.

Because of the practical difficulties that are involved, the existence of particular universals is normally demonstrated not by exhaustive examination of the historical and ethnographic records but rather by some form of sampling. In spite of these difficulties, existing lists of universals show substantial overlap (e.g., Brown 1991; Murdock 1945; Hockett 1973; Tiger and Fox 1971).

Among the variations on the basic concept of human universals are conditional (or implicational) universals, statistical universals, near universals, and universal pools. A conditional universal refers to a cross-culturally invariant rule or linkage whereby if condition x obtains, then y will obtain. The evolution of “basic color terms” provides a well-documented example: if a language possesses only three basic color terms, they will be black, white, and red (Berlin and Kay 1969; see also COLOR CATEGORIZATION). The real universality in such cases consists not in the manifest occurrence of specific phenomena but in a pattern of co-occurrences and its underlying causation (see the discussion of universal mechanisms beneath variable behavior in Tooby and Cosmides 1992).

Similarly, although the manifest cross-cultural frequency of occurrence of a statistical universal need only be greater than chance, it implies a universal explanation rather than a series of culturally specific explanations. For example, given all the possible terms that might be used to refer to the pupil of the eye, in a very disproportionate number of languages the term refers to a little person, presumably because people everywhere see their own reflections in the pupils of other people’s eyes (Brown and Witkowski 1981; see also FIGURATIVE LANGUAGE).

Keeping domestic dogs is only a near universal, as there were peoples who, until recently, lacked them. In many cases the explanations for near universals and (absolute) universals are essentially the same. The designation of near universality sometimes merely indicates uncertainty about the (absolute) universality of the item in question.

A universal pool is a fixed set of possibilities from which particular manifestations are everywhere drawn. For example, a classic study found that in a sample of diverse kinship terminologies only a small pool of semantic features (e.g., sex of speaker, sex of relative, lineal versus collateral relative, etc.) were drawn upon to distinguish one kin term from another (Kroeber 1909).

There are only a few general explanations for universals. Some cultural universals, for example, appear to be inventions that, due to their great antiquity and usefulness, have diffused to all societies. The use of fire and the more specific use of fire to cook food are examples. The dog achieved near universality for the same reasons. Other universals appear to be reflections in culture of noncultural features that are ubiquitous and important for one reason or another. Kinship terminologies (which are simultaneously social, cultural, and linguistic) are found among all peoples, and in all cases they reflect (at least in part) the relationships necessarily generated by the facts of biological reproduction. Systems of classification (of plants and animals as well as kin) were among the most important arenas for the development of cognitive studies in anthropology (Berlin 1992; D’Andrade 1995; see NATURAL KINDS).

Yet other universals spring more or less directly from human nature and, thus, are causally formative of societies, cultures, and languages, and even the course of history. The syndrome of cognitive and emotional traits comprising romantic love, for instance, is known everywhere, inspiring poetry as well as reproduction, while giving rise to families and much human conflict (Harris 1995; Jankowiak 1995).

In recent decades, an inclusive framework for explaining those universals embodied in or springing from human nature has emerged from a combination of fields. From ETHOLOGY and animal behavior have come the identification of species-typical behaviors and the developmental processes (combining innateness and learning) that produce them (see, e.g., Eibl-Eibesfeldt 1989; Seligman and Hager 1972; Tiger and Fox 1971); from SOCIOBIOLOGY have come ultimate explanations for such universals as kin altruism and the norm of reciprocity (Hamilton 1964; Trivers 1971); from Chomsky (1959) has come the notion of mental organs or modules that underpin complex, innate features of the human psyche (Hirschfeld and Gelman 1994; see also DOMAIN SPECIFICITY). Elucidating the evolved architecture of the (universal) human mind and its causal role in the construction of society and culture is the domain of EVOLUTIONARY PSYCHOLOGY (Barkow, Cosmides, and Tooby 1992).

Studies within this framework seek to specify the universals of mind that underlie manifest universals and to explain them in both ultimate (evolutionary) and proximate terms. Thus Symons (1979) explains several species-typical sex differences in human sexuality in terms of a theory derived from a wide comparison of animal species. For example, men compete more intensely or violently for mates, desire more variety of sexual partners, and attend more to the physical features (especially signs of youth) of their mates. Both sexes assume that sex is a service or favor that women provide to men. The theory (Trivers 1972) predicts these differences as consequences of the typically larger female investment in each offspring (e.g., the minimum investments are the female’s nine months of gestation and the male’s insemination). Studies of this sort offer a more comprehensive illumination of human universals than had hitherto been the case.

See also ADAPTATION AND ADAPTATIONISM; COGNITIVE ARCHITECTURE; CULTURAL VARIATION; SEXUAL ATTRACTION, EVOLUTIONARY PSYCHOLOGY OF

—Donald Brown

References

Barkow, J. H., L. Cosmides, and J. Tooby, Eds. (1992). The Adapted Mind: Evolutionary Psychology and the Generation of Culture. New York: Oxford University Press.
Berlin, B. (1992). Ethnobiological Classification: Principles of Categorization of Plants and Animals in Traditional Societies. Princeton: Princeton University Press.
Berlin, B., and P. Kay. (1969). Basic Color Terms: Their Universality and Evolution. Berkeley: University of California Press.
Brown, C. H., and S. R. Witkowski. (1981). Figurative language in a universalist perspective. American Ethnologist 9: 596–615.
Brown, D. E. (1991). Human Universals. New York: McGraw-Hill.
Chomsky, N. (1959). Review of B. F. Skinner’s Verbal Behavior. Language 35: 26–58.
D’Andrade, R. G. (1995). The Development of Cognitive Anthropology. Cambridge: Cambridge University Press.
Degler, C. (1991). In Search of Human Nature: The Decline and Revival of Darwinism in American Social Thought. New York: Oxford University Press.
Eibl-Eibesfeldt, I. (1989). Human Ethology. New York: Aldine de Gruyter.
Ekman, P., E. R. Sorenson, and W. V. Friesen. (1969). Pan-cultural elements in facial displays of emotion. Science 164: 86–88.
Hamilton, W. D. (1964). The genetical evolution of social behavior, parts 1 and 2. Journal of Theoretical Biology 7: 1–52.
Harris, H. (1995). Human Nature and the Nature of Romantic Love. Ph.D. diss., University of California at Santa Barbara.
Hirschfeld, L. A., and S. A. Gelman, Eds. (1994). Mapping the Mind: Domain Specificity in Cognition and Culture. Cambridge: Cambridge University Press.
Hockett, C. F. (1973). Man’s Place in Nature. New York: McGraw-Hill.
Jankowiak, W. (1995). Romantic Passion: A Universal Experience? New York: Columbia University Press.
Kluckhohn, C. (1953). Universal categories of culture. In Anthropology Today: An Encyclopedic Inventory. Chicago: University of Chicago Press, pp. 507–523.
Kroeber, A. L. (1909). Classificatory systems of relationship. Journal of the Royal Anthropological Institute 39: 77–84.
Levy, M. J., Jr. (1989). Maternal Influence: The Search For Social Universals. Berkeley: University of California Press.
Murdock, P. (1945). The common denominator of cultures. In R. Linton, Ed., The Science of Man in the World Crisis. New York: Columbia University Press, pp. 123–142.
Seligman, M., and J. Hager. (1972). Biological Boundaries of Learning. New York: Appleton-Century-Crofts.
Symons, D.

Hume held that the ability to form beliefs, in contrast to having sensations and emotions, was a matter of inference. The belief that bread nourishes is an inference from the constant conjunction of the ingestion of bread with nourishment. In what is now called the problem of INDUCTION, Hume argued that this inference is not a deductive inference, because the proof of the conclusion is not guaranteed by the truth of the premises, and any proof based on experience is itself an inductive inference, making it thus a circular proof. Hume’s conclusion is a skeptical one. There is no rational justification for CAUSAL REASONING.

Causal inference leading to belief is a matter of custom and habit. The constant conjunction of perceptions experi-
(1979). The Evolution of Human Sexuality. New York: enced lead cognitive agents to have certain lively ideas or Oxford University Press. beliefs. One cannot help but believe that fire is hot. Hume Tiger, L., and R. Fox. (1971). The Imperial Animal. New York: emphasized that both humans and other animals make such Holt, Rinehart and Winston. inductive or causal inferences to predict and explain the Tooby, J., and L. Cosmides. (1992). The evolutionary and psycho- world, in spite of the fact that such inferences cannot be logical foundations of culture. In J. H. Barkow, L. Cosmides, rationally justified. In the section of the Treatise entitled “Of and J. Tooby, Eds., The Adapted Mind: Evolutionary Psychol- the reason of animals,” Hume anticipated COGNITIVE ogy and the Generation of Culture. New York: Oxford Univer- ETHOLOGY by appealing to evidence about nonhuman ani- sity Press, pp. 3–136. Trivers, R. L. (1971). The evolution of reciprocal altruism. Quar- mals in support of the claim that inferences from past terly Review of Biology 46: 35–57. instances are made by members of many species. The pos- Trivers, R. L. (1972). Parental investment and sexual selection. In session of language by humans, Hume held, makes it possi- B. Campbell, Ed., Sexual Selection and the Descent of Man. ble for humans to make more precise inferences than other Chicago: Aldine, pp. 136–179. animals, but this is a matter of degree, not of kind. If beliefs are habitual responses to environmental regu- larities, how is it that beliefs that deny such regularities are Hume, David held? Hume’s critical examination of religious belief and the belief in the existence of miracles inspired him to offer a Impressed by Isaac Newton’s success at explaining the fuller account of the nature of belief formation and credu- apparently diverse and chaotic physical world with a few lity. 
Hume noted several belief-enlivening associative universal principles, David Hume (1711–1776), while still mechanisms in addition to constant conjunction. Belief is in his teens, proposed that the same might be done for the influenced by such factors as proximity, resemblance, and realm of the mind. Through observation and experimenta- repetition. A pilgrimage to the Red Sea, for example, will tion, Hume hoped to uncover the mind’s “secret springs and serve to make one more receptive to the claim that the sea principles.” Hume’s proposal for a science of the mind was parted. Hume’s treatment of the factors influencing belief published as A Treatise of Human Nature in 1740, and subti- anticipates the studies of Kahneman and TVERSKY (Tversky tled “An Attempt to introduce the experimental Method of and Kahneman 1974) on the selective availability of evi- Reasoning into moral subjects.” Though it is now one of the dence in PROBABILISTIC REASONING. most widely read works in Western philosophy, the recep- Hume rejected DESCARTES’s claim that the mind is a tion of the Treatise in Hume’s lifetime was disappointing. In mental substance on the grounds that there is no impression My Own Life Hume says that the Treatise “fell dead-born from which such an idea of mental substance could be from the press.” Considered an atheist by the clergy, which derived. Introspection provides access to the mind’s percep- controlled university appointments, Hume sought but never tions, but not to anything in which those perceptions inhere. received a professorship. “When I enter most intimately into what I call myself, I Hume is widely regarded as belonging with Locke and always stumble on some particular perception or other, of Berkeley to the philosophical movement called British Empir- heat or cold, light or shade, love or hatred, pain or pleasure. icism (see RATIONALISM VS. EMPIRICISM). 
The mind contains I never can catch myself at any time without a perception, two kinds of perceptions: impressions and ideas. Impressions and never can observe any thing but the perception” (Trea- are the original lively SENSATIONS and EMOTIONS. Ideas are tise, p. 252). The mind, Hume concludes, “is nothing but a fainter copies of impressions. Like Locke, Hume rejected the heap or collection of different perceptions.” NATIVISM of the rationalists. There are no ideas without prior The bundle theory of the mind has been criticized by impressions, so no ideas are innate. Impressions and ideas recent philosophers of mind. Hume attempted to account for may be simple or complex. The imagination freely concate- mental representation by the dynamic interaction of mental nates perceptions; the understanding and the passions organize items—impressions and ideas. It is not clear that Hume was perceptions by more regular rules of association. MEMORY able to characterize such interaction as mental without ideas, for example, preserve the order and position of the appealing to the fact that impressions and ideas are the per- impressions from which they derive. Hume’s theory of ideas is ceptions of a mind. According to Hume’s critics, Hume helps an empiricist account of concept formation (see CONCEPTS). himself to the concept of mind rather than account for it in Illusions 385 nonmentalistic terms. Dennett (1978) refers to this as Hume’s Illusions Problem. Haugeland (1984) argues that such mechanistic accounts of the mind, which predate the notion of automatic A hallucination is a false perception seen by one person, symbol manipulation, cannot avoid Hume’s problem. often drugged or mentally ill, in the absence of stimulation, The perceptions of the mind include emotions and whereas an illusion is a misinterpretation of a stimulus con- passions as well as beliefs, and Hume attempted to offer a sistently experienced by most normal people. 
Illusions are unified account of all mental operations. Beliefs are lively characteristic errors of the visual system that may give us or vivacious ideas that result from a certain kind of mental clues to underlying mechanisms. There are several types of preparation, a constant conjunction of pairs of impressions visual illusion: such as impressions of flame joined with impressions of heat. The lively idea of heat gets its vivacity from the habit Geometrical: Illusions of Angle and Size or custom formed by the experience of the constantly conjoined impression pair. Beliefs, then, are themselves Angles: In the Zollner and Hering (1861) illusions, the feelings. Both emotions and beliefs are strongly held long parallel lines appear to be tilted away from the orienta- perceptions, and the mechanisms that actuate one can tion of the background fins (figure 1). Blakemore (1973) influence the other. Fear of falling from a precipice may, attributes this to “tilt contrast,” caused by lateral inhibition for example, displace a belief that one is secure. Like between neural signals of orientation, which will expand the recent theorists, Hume held that both probability and the appearance of acute angles. This theory has problems with degree of the severity of anticipated pain or pleasure play a the Poggendorff (1860) illusion, which still persists when role in the resolution of conflicts of emotion and only the obtuse angles are present, but vanishes when only judgment. the acute angles are present (see Burmester 1986). In See also CAUSATION; KANT, IMMANUEL; NATIVISM, HIS- Fraser’s (1908) twisted-rope spiral illusion, the circles look TORY OF; RELIGIOUS IDEAS AND PRACTICES; SELF like spirals because of the twisted local ropes, abetted by the diamonds in the background. The visual system seems to —Saul Traiger take “local votes” of orientation and fails to integrate them correctly. References Size: Gregory points out that size constancy scaling Dennett, D. C. (1978). 
A cure for the common code? In Brain- allows the correct perception of the size of objects even storms. Montgomery, VT: Bradford Books. when their distance, and hence retinal size, changes. He Haugeland, J. (1984). Artificial Intelligence: The Very Idea. Cam- attributes the geometrical illusions of size to an inappropri- bridge, MA: MIT Press. ate application of size constancy. Thus the nearby man in Hume, D. (1973). A Treatise of Human Nature. 2nd ed. L. A. figure 1 looks tiny compared to the distant man, even Selby-Bigge and P. H. Nidditch, Eds. Oxford: Oxford Univer- though the retinal sizes are identical: depth cues from the sity Press. ground plane allow size constancy perceptually to expand Hume, D. (1975). Enquiries concerning Human Understanding and distant objects. More subtly, in the Muller Lyer illusion, the concerning the Principles of Morals. 3rd ed. L. A. Selby-Bigge arrow fins provide cues similar to those from receding and and P. H. Nidditch, Eds. Oxford: Oxford University Press. Hume, D. (1874). My own life. In T. H. Green and T. H. Grose, advancing corners, and in Roger Shepard’s table illusion Eds., The Philosophical Works of David Hume. London. (figure 2, top), the two table tops are geometrically identical Tversky, A., and D. Kahneman. (1974). Judgments under uncer- on the page but look like completely different 3-D shapes, tainty: heuristics and biases. Science 185: 1124–1131. because size and shape constancy subjectively expand the near-far dimension along the line of sight to compensate for Further Readings geometrical foreshortening. Baier, A. C. (1991). A Progress of Sentiments: Reflections on Illusions of Lightness and Color Hume’s Treatise. Cambridge, MA: Harvard University Press. Biro, J. (1985). Hume and cognitive science. History of Philosophy A gray cross looks darker on a white than on a black sur- Quarterly 2(3) July. round. On a red surround it looks greenish, probably owing Flanagan, O. (1992). Consciousness Reconsidered. 
Cambridge, to lateral inhibition from neurons that sense the white or MA: MIT Press. Kemp Smith, N. (1941). The Philosophy of David Hume. London: Macmillan. Smith, J.-C. (1990). Historical Foundations of Cognitive Science. Dordrecht: Kluwer Academic. Traiger, S. (1994). The secret operations of the mind. Minds and Machines 4(3): 303–316. Wright, J. P. (1983). The Sceptical Realism of David Hume. Min- neapolis: University of Minnesota Press. Identity Theory See MIND-BODY PROBLEM; PHYSICALISM Figure 1. Illusion of size 386 Illusions Figure 2. Top: Illusion of size; Bottom: Illusions of interpretation. colored surround. In White’s illusion, the grey segments that replace black bars are bordered by much white above and below, so they “ought” to look darker than the gray seg- ments that replace white bars. But in fact they look lighter. Figure 3. Top: Illusions of angle and size; Middle: Illusions of This might show that lateral inhibition operates more lightness and color; Bottom: Illusions of interpretation. strongly along the bars than across them, but more likely it is a top-down “belongingness” effect based on colinearity. can be seen on CD-ROM (Scientific American) and on Al Illusions of Interpretation: Ambiguous, Seckel’s Web page at http://www.lainet.com/illusions/. Impossible, and Puzzle Pictures See also COLOR VISION; LIGHTNESS PERCEPTION; PICTO- RIAL ART AND VISION; SPATIAL PERCEPTION; TOP-DOWN In ambiguous pictures, the brain switches between two pos- PROCESSING IN VISION sible figure-ground interpretations—in figure 3, faces and a vase—even though the stimulus does not change. The shape —Stuart Anstis of the region selected as figure is perceived and remem- bered; that of the ground is not. References Impossible figures embody conflicting 3-D cues. Pen- Blakemore, C. B. (1973). The baffled brain. In R. L. Gregory and rose’s “impossible triangle” (figure 3) is not the projection E. H. Gombrich, Eds., Illusion in Nature and Art. 
London: of any possible physical object. It holds together by means Duckworth. of incorrect connections between its corners, which are cor- Block, J. R., and H. Yuker. (1992) Can You Believe Your Eyes? rect locally but incorrect globally. Shepard’s elephant (fig- Brunner/Mazel. ure 2) confuses its legs with the spaces in between. Local Burmester, E. (1986). Beitrag zur experimentellen Bestimmung votes about depth are not properly integrated. Impossible geometricsh–optischen. Z. Psychol. 12: 355 objects cannot be consistently painted with colors. Ernst, B. (1985). The Magic Mirror of M. C. Escher. Stradbroke, In puzzle pictures there is one possible interpretation, but England: Tarquin. Fraser, J. (1908). A new illusion of directing. British Journal of reduced cues make it hard to find. (Shepard’s Sara Nader, Psychology 2, 307–320. figure 2, is unusual in having two different possibilities). Gregory, R. L. (1974). Concepts and Mechanisms of Perception. Once found, immediate perceptual learning makes the pic- London: Duckworth. ture easier to remember, and immediately recognizable the Robinson, J. (1971). The Psychology of Visual Illusions. London: next time it is seen. Hutchinson. The Belgian surrealist painter René Magritte (1898– Rock, I., and D. H. Hubel. (1997). Illusion (CD-ROM). Scientific 1967) and the Dutch engraver Maurits Escher (1898–1972) American Library. Simon and Schuster/Byron Preiss Multimedia. filled their works with splendid visual illusions (http:// Shepard, R. N. (1990). Mindsights. New York: Freeman. www.bright-ideas.com/magrittegallery/aF.html and http:// Wade, N. (1982) The Art and Science of Visual Illusions. London: lonestar.texas.net/~escher/gallery/). Collections of illusions Routledge and Kegan Paul. 
Imagery

The term imagery is inherently ambiguous—it can refer to iconography, visual effects in cinematography, mental events, or even to some types of prose. This article focuses on mental imagery. For example, when asked to recall the number of windows in one’s living room, most people report that they visualize the room, and mentally scan over this image, counting the windows. In this article we consider not the introspections themselves, but rather the nature of the underlying representations that give rise to them. Because most research on imagery has addressed visual mental imagery, we will focus on that modality.

Unlike most other cognitive activities (such as language and memory), we only know that visual mental image representations are present because we have the experience of seeing, but in the absence of the appropriate sensory input. And here lies a central problem in the study of imagery (the “introspective dilemma”): there is no way that another person can verify what we “see” with our inner eye, and hence the phenomenon smacks of unscientific, fanciful confabulation.

Nevertheless, imagery played a central role in theories of the mind for centuries. For example, the British Associationists conceptualized thought itself as sequences of images. And WILHELM WUNDT, the founder of scientific psychology, emphasized the analysis of images. However, the central role of imagery in theories of mental activity was undermined when Külpe, in 1904, pointed out that some thoughts are not accompanied by imagery (e.g., one is not aware of the processes that allow one to decide which of two objects is heavier).

The observation that images are not the hallmark of all thought processes was soon followed by the notion that images as ordinarily conceived (and, indeed, thoughts themselves) may not even exist! John Watson (1913)—the founder of BEHAVIORISM—argued that images actually correspond to subtle movements of the larynx. Watson emphasized the precept that scientific phenomena must be publicly observable, and imagery clearly is not. This line of argument led to diminished interest in imagery until the early 1960s.

Behaviorism ultimately had a salutary effect on the study of imagery; it made it rigorous. Whereas researchers in Wundt’s lab were trained in INTROSPECTION, modern researchers are trained in measuring subtle aspects of behavior. In such cases, the behavior is like the track left by a cosmic ray passing through a cloud chamber: It is a kind of signature, a hallmark, that allows us to draw inferences about the phenomena that produced it.

Imagery has effects on the accuracy of LEARNING and MEMORY. In the early 1960s the behaviorist approach to language led to the study of “verbal learning”: words were treated as stimuli that could be learned like any other. Paivio and his colleagues (e.g., Paivio 1971) showed that imagery is a major factor that affects the ease of learning words. Not only is a picture worth a thousand words, it is also easier to remember. The same is true for “mental pictures.” Words are better learned and remembered if they name an object that can be visualized, such as “boat” or “cat,” than they are if they name an abstract idea, such as “justice” or “kindness.” The way objects are visualized can, however, affect how easily the words they correspond to are remembered; Bower (1972) found that forming an image of the interaction between a pair of objects made it easier to remember the names of those objects than was the case by simply forming an image of the objects existing separately.

Imagery also affects the detection of perceptual stimuli. Studies of imagery benefited from new techniques developed to investigate perception. For example, SIGNAL DETECTION THEORY (e.g., Green and Swets 1966) has been used to show that forming a visual mental image interrupts vision more than it does hearing, but forming an auditory image interrupts hearing more than vision. Such results reveal that imagery draws on mechanisms used in like-modality perception. However, as Craver-Lemley and Reeves (1992) report, this interference occurs only if the image overlaps the target. It is also worth noting that images can be mistaken for percepts, both at the time and in memory (e.g., Johnson and Raye 1981; see Kosslyn 1994 for a review).

One reason that behaviorism failed to hold sway was that a viable alternative was introduced: cognitive psychology. Cognitive psychology likened the mind to software running on a computer, and imagery—along with all other mental processes—was conceptualized in terms of such “information processing.” This approach stressed that different sets of processes are used to accomplish different ends. For example, to plan how to arrange furniture in an empty room, one could generate an image of the furniture and then mentally transform it (rotate it, perhaps imagine stacking a shelf on a crate or desk); one must maintain the image as it is being inspected. Sometimes imagery involves only some of these processes; for instance, image transformation is irrelevant in remembering the way to get to the train station. Information processing is often studied by measuring response times.

Consider tasks that require image inspection. Do you know the shape of a Saint Bernard’s ears? Subjects will require more time to perform this task if they are first told to visualize the dog far off in the distance; before they can answer the question, they will “zoom in” on the part of their image containing the dog’s head. Similarly, subjects will require more time to answer this question if they are first told to visualize a Saint Bernard’s tail; they will now mentally “scan” across the image of the dog in order to “look” at its ears. Indeed, the time to respond increases linearly with the amount of distance scanned—even though the subject’s eyes are closed. Such results suggest that the image representations embody spatial extent (see Kosslyn 1994).

Now consider image generation. Images are generated by activating stored information. Images are created “piece by piece,” and thus an image with more parts takes longer to form. Indeed, the time to form an image often increases linearly with the number of parts that are assembled.

Finally, consider image transformation. When objects in images are transformed, they often mimic the movements of real objects. For example, rotating objects appear to shift through a trajectory; the farther one rotates an imaged object, the longer it takes to do so (see MENTAL ROTATION). Shepard (1984) suggests that this occurs because, due to natural selection, certain laws of physics have been internalized in the brain and act as constraints on the imagery process. Alternatively, it is possible that motor programs guide imagery and, consequently, objects are mentally manipulated in the same manner that real objects would be physically manipulated. In fact, motor parts of the brain have been shown to be activated during some image transformation tasks (e.g., Georgopoulos et al. 1989).

The computer metaphor encouraged a kind of “disembodied mind” approach, which ignored certain classes of data and constraints on theories. Recent years have seen a sharp increase in research on the neural bases of imagery. Such research has addressed each of the processes noted earlier, as well as the issue of how images are internally represented.

It has long been known that some parts of the brain are spatially organized (see Felleman and Van Essen 1991); the pattern of activity on the retina is projected onto these areas (albeit distorted in several ways). Studies reported by five laboratories have now shown that some of these areas, such as primary VISUAL CORTEX, are activated when people visualize. Moreover, the pattern of activation is systematically altered by spatial properties of the image, in a way similar to what occurs in perception. Such results suggest that imagery relies on “depictive” representations, which use space to represent space. This result bears directly on the “imagery debate” of the 1970s and 1980s, which focused on the question of whether a depictive representation is used during imagery. However, this result is not always obtained, and the crucial differences between the tasks that do and do not engender such representations have not yet been identified (Mellet et al. 1998).

Image generation is most often disrupted by damage to the posterior left hemisphere (Farah 1984). However, recent studies have shown that images can be generated in at least two ways, one of which uses stored descriptions to arrange parts and relies primarily on left-hemisphere processes, and the other of which uses stored metric spatial information to arrange parts and relies primarily on the right hemisphere (for a review of the literature on this topic, see Behrmann, Kosslyn, and Jeannerod 1996).

Patients with brain damage that impairs visual perception sometimes also have corresponding deficits when they inspect imaged objects. For example, some patients who have suffered damage to the posterior right parietal lobe display a phenomenon known as “unilateral VISUAL NEGLECT”; they ignore objects to the left side of space. These patients may display the same behavior when visualizing—they ignore objects on the left half of their images (e.g., Bisiach and Luzzatti 1978).

Image transformations are accomplished by a complex set of brain areas. However, studies with brain-damaged patients suggest that the right hemisphere plays a crucial role in the transformation process itself (e.g., Corballis and Sergent 1989; Ratcliff 1979).

In conclusion, the recent research on visual mental imagery reveals that imagery is not a unitary phenomenon, but rather is accomplished by a collection of distinct processes. These processes are implemented by neural systems, not discrete “centers.” These processes can be combined in different ways (with each combination corresponding to a distinct method or strategy) to allow one to accomplish any given imagery task.

See also WORD MEANING, ACQUISITION OF; DREAMING; HAPTIC PERCEPTION; PICTORIAL ART AND VISION

—Stephen M. Kosslyn and Carolyn S. Rabin

References

Behrmann, M., S. M. Kosslyn, and M. Jeannerod, Eds. (1996). The Neuropsychology of Mental Imagery. New York: Pergamon.
Bisiach, E., and C. Luzzatti. (1978). Unilateral neglect of representational space. Cortex 14: 129–133.
Boring, E. G. (1950). A History of Experimental Psychology. 2nd ed. New York: Appleton-Century-Crofts.
Bower, G. H. (1972). Mental imagery and associative learning. In L. Gregg, Ed., Cognition in Learning and Memory. New York: Wiley.
Corballis, M. C., and J. Sergent. (1989). Hemispheric specialization for mental rotation. Cortex 25: 15–25.
Craver-Lemley, C., and A. Reeves. (1992). How visual imagery interferes with vision. Psychological Review 99: 633–649.
Farah, M. J. (1984). The neurological basis of mental imagery: a componential analysis. Cognition 18: 245–272.
Felleman, D. J., and D. C. Van Essen. (1991). Distributed hierarchical processing in the primate cerebral cortex. Cerebral Cortex 1: 1–47.
Georgopoulos, A. P., J. T. Lurito, M. Petrides, A. B. Schwartz, and J. T. Massey. (1989). Mental rotation of the neuronal population vector. Science 243: 234–236.
Green, D. M., and J. A. Swets. (1966). Signal Detection Theory and Psychophysics. New York: Wiley.
Johnson, M. K., and C. L. Raye. (1981). Reality monitoring. Psychological Review 88: 67–85.
Kosslyn, S. M. (1994). Image and Brain: The Resolution of the Imagery Debate. Cambridge, MA: MIT Press.
Logie, R. H., and M. Denis. (1991). Mental Images in Human Cognition. Amsterdam: North Holland.
Mellet, E., L. Petit, B. Mazoyer, M. Denis, and N. Tzourio. (1998). Reopening the mental imagery debate: Lessons from functional neuroanatomy. NeuroImage 8: 129–139.
Paivio, A. (1971). Imagery and Verbal Processes. New York: Holt, Rinehart and Winston.
Ratcliff, G. (1979). Spatial thought, mental rotation and the right cerebral hemisphere. Neuropsychologia 17: 49–54.
Shepard, R. N. (1984). Ecological constraints on internal representation: resonant kinematics of perceiving, imagining, thinking, and dreaming. Psychological Review 91: 417–447.
Watson, J. B. (1913). Psychology as the behaviorist views it. Psychological Review 20: 158–177.

Further Readings

Anderson, J. R. (1978). Arguments concerning representations for mental imagery. Psychological Review 85: 249–277.
Arditi, A., J. D. Holtzman, and S. M. Kosslyn. (1988). Mental imagery and sensory experience in congenital blindness. Neuropsychologia 26: 1–12.
Behrmann, M., G. Winocur, and M. Moscovitch. (1992). Dissociation between mental imagery and object recognition in a brain-damaged patient. Nature 359: 636–637.
Block, N. (1981). Imagery. Cambridge, MA: MIT Press.
Brandimonte, M., and W. Gerbino. (1993). Mental image reversal and verbal recoding: when ducks become rabbits. Memory and Cognition 21: 23–33.
Chambers, D., and D. Reisberg. (1985). Can mental images be ambiguous? Journal of Experimental Psychology: Human Perception and Performance 11: 317–328.
Charlot, V., M. Tzourio, M. Zilbovicius, B. Mazoyer, and M. Denis. (1992). Different mental imagery abilities result in different regional cerebral blood flow activation patterns during cognitive tasks. Neuropsychologia 30: 565–580.
Corballis, M. C., and R. McLaren. (1982). Interactions between perceived and imagined rotation. Journal of Experimental Psychology: Human Perception and Performance 8: 215–224.
Denis, M., and M. Carfantan. (1985). People’s knowledge about images. Cognition 20: 49–60.
Eddy, J. K., and A. L. Glass. (1981). Reading and listening to high and low imagery sentences. Journal of Verbal Learning and Verbal Behavior 20: 333–345.
Farah, M. J. (1988). Is visual imagery really visual? Overlooked evidence from neuropsychology. Psychological Review 95: 301–317.
Farah, M. J., and G. Ratcliff, Eds. (1994). The Neural Bases of Mental Imagery. Hillsdale, NJ: Erlbaum.
Farah, M. J., M. J. Soso, and R. M. Dasheiff. (1992). Visual angle of the mind’s eye before and after unilateral occipital lobectomy. Journal of Experimental Psychology: Human Perception and Performance 18: 241–246.
Finke, R. A. (1989). Principles of Mental Imagery. Cambridge, MA: MIT Press.
Finke, R. A., and S. Pinker. (1982). Spontaneous imagery scanning in mental extrapolation. Journal of Experimental Psychology: Learning, Memory, and Cognition 8: 142–147.
Finke, R. A., and R. N. Shepard. (1986). Visual functions of mental imagery. In K. R. Boff, L. Kaufman, and J. P. Thomas, Eds., Handbook of Perception and Human Performance. New York: Wiley-Interscience, pp. 37–1 to 37–55.
Hampson, P. J., D. F. Marks, and J. T. E. Richardson. (1990). Imagery: Current Developments. London: Routledge.
Hinton, G. (1979). Some demonstrations of the effects of structural descriptions in mental imagery. Cognitive Science 3: 231–250.
Jolicoeur, P., and S. M. Kosslyn. (1985). Is time to scan visual images due to demand characteristics? Memory and Cognition 13: 320–332.
Kosslyn, S. M., and O. Koenig. (1992). Wet Mind: The New Cognitive Neuroscience. New York: Free Press.
Paivio, A. (1986). Mental Representations. New York: Oxford University Press.
Pylyshyn, Z. W. (1973). What the mind’s eye tells the mind’s brain: a critique of mental imagery. Psychological Bulletin 80: 1–24.
Sergent, J. (1990). The neuropsychology of visual image generation: data, method, and theory. Brain and Cognition 13: 98–129.
Shepard, R. N., and L. A. Cooper. (1982). Mental Images and Their Transformations. Cambridge, MA: MIT Press.
Tye, M. (1991). The Imagery Debate. Cambridge, MA: MIT Press.

Imitation

There has been an explosion of research in the development, evolution, and brain basis of imitation. Human beings are highly imitative. Recent discoveries reveal that newborn infants have an innate ability to imitate facial expressions. This has important implications for theories of FOLK PSYCHOLOGY, MEMORY, CULTURE, and LANGUAGE.

Classical theories of COGNITIVE DEVELOPMENT postulated that newborns did not understand the similarity between themselves and others. Newborns were said to be “solipsistic,” experiencing their own internal sensations and seeing the movements of others without linking the two (Piaget 1962). According to Jean PIAGET, the imitation of facial gestures was first possible at one year of age, a landmark development that was a prerequisite for representational abilities. In sharp contrast, modern empirical work has shown that infants as young as forty-two minutes old successfully imitate adult facial gestures (Meltzoff and Moore 1983). Imitation is innate in humans (figure 1).

Figure 1. Photographs of two- to three-week-old infants imitating facial acts demonstrated by an adult. From A. N. Meltzoff and M. K. Moore (1977). Science 198: 75–78.

Facial imitation presents a puzzle. Infants can see an adult’s face but cannot see their own faces. They can feel their own faces move but have no access to the feelings of movement in another person. How is facial imitation possible? One candidate is “active intermodal mapping.” The crux of the view is that an infant represents the adult facial expression and actively tries to make his or her own face match that target. Of course, infants do not see their own facial movements, but they can use proprioception to monitor their own unseen actions and to correct their behavior. According to this view, the perception and production of human acts are represented in a common “supramodal” framework and can be directly compared to one another. Meltzoff and Moore (1997) provide a detailed account of the metric used for establishing cross-modal equivalence of human acts and its possible brain basis.

The findings on imitation suggest a common coding for perception and production. Work with adults analyzing brain sites and cognitive mechanisms involved in the imitation, perception, and imagination of human acts suggests they tap common processes (Jeannerod and Decety 1995; Prinz 1990). Neurophysiological studies show that in some cases the same neurons become activated when monkeys perform an action as when they observe a similar action made by another (Rizzolatti et al. 1996). Such “mirror neurons” may provide a neurophysiological substrate for imitation.

Early imitation has implications for the philosophical problem of other minds. Imitation shows that young infants are sensitive to the movements of themselves and other people and can map self-other isomorphisms at the level of actions (see INTERSUBJECTIVITY). Through experience they may learn that when they act in particular ways, they them-

Knowlton, and Musen 1993), inasmuch as learning and recall of novel material occurs after one brief observation with no motor training. Research is being directed to determining the brain structures that mediate deferred imitation. Amnesic adults (see MEMORY, HUMAN NEUROPSYCHOLOGY) are incapable of the same deferred imitation tasks accomplished by infants, suggesting that it is mediated by the HIPPOCAMPUS and related anatomical structures (McDonough et al. 1995).
Compatible evidence comes from a study showing selves have certain concomitant internal states (propriocep- that children with AUTISM have a deficit in imitation, particu- tion, emotions, intentions, etc.). Having detected this larly deferred imitation, compared to mental-age matched regularity, infants have grounds for making the inference controls (Dawson et al. 1998). that when they see another person act in the same way that Comparative psychologists hotly debate the nature and they do, the person has internal states similar to their own. scope of imitation in nonhuman animals (Heyes and Galef Thus, one need not accept Fodor’s (1987) thesis that the 1996). Imitation among nonhuman animals is more limited adult THEORY OF MIND must be innate in humans (because it than human imitation (Tomasello, Kruger, and Ratner 1993; could not be learned via classical reinforcement procedures. Tomasello and Call 1997). Animals modify their behavior Imitation of body movements, vocalizations, and other goal- when observing others, but even higher primates are most directed behavior provides a vehicle for infants discovering often limited to duplicating the goal rather than the detailed that other people are “like me,” with internal states just like means used to achieve it. Moreover, social learning in ani- the SELF. Infant imitation may be a developmental precursor mals is typically motivated by obtaining food and other to developing a theory of mind (Meltzoff and Moore 1995, extrinsic rewards, whereas imitation is its own reward for the 1997; Gopnik and Meltzoff 1997; see also SIMULATION VS. human young. In humans, aspects of language development depend on imitation. Vocal imitation is a principal vehicle THEORY-THEORY). What motivates infants to imitate others? 
Imitation for infants’ learning of the phonetic inventory and prosodic serves many cognitive and social functions, but one possi- structure of their native language (Kuhl and Meltzoff 1996; bility is that very young infants use behavioral imitation to see PHONOLOGY, ACQUISITION OF and SPEECH PERCEPTION). sort out the identity of people. Young infants are concerned Human beings are the most imitative species on earth. with determining the identity of objects as they move in Imitation plays a crucial role in the development of culture space, disappear, and reappear (see INFANT COGNITION). and the distinctively human ability to pass on learned abili- Research shows that young infants use imitation of a per- ties from one generation to another. A current challenge in son’s acts to help them distinguish one individual from artificial intelligence is to create a robot that can learn another and reidentify people on subsequent encounters through imitation (Demiris and Hayes 1996). Creating more (Meltzoff and Moore 1998). Infants use the distinctive “humanlike” devices may hinge on embodying one of the actions of people as if they were functional properties that cornerstones of the human mind, the ability to imitate. can be elicited through interaction. Thus, infants identify a See also COMPARATIVE PSYCHOLOGY; CULTURAL EVO- person not only by featural characteristics (lips, eyes, hair), LUTION; LANGUAGE ACQUISITION; LEARNING; NATIVISM; but by how that individual acts and reacts. PRIMATE COGNITION As adults, we ascribe mental states to others. One tech- —Andrew N. Meltzoff nique for investigating the origins of theory of mind capital- izes on the proclivity for imitation (Meltzoff 1995a). Using References this technique, the adult tries but fails to perform certain tar- get acts. The results show that eighteen-month-olds imitate Dawson, G., A. N. Meltzoff, J. Osterling, and J. Rinaldi. (1998). 
what the adult “is trying to do,” not what that adult literally Neurophysiological correlates of early symptoms of autism. does do. This establishes that young children are not strict Child Development 69: 1276–1285. behaviorists, attuned only to the surface behavior of people. Demiris, J., and G. Hayes. (1996). Imitative learning mechanisms By eighteen months of age children have already adopted a in robots and humans. In V. Klingspor, Ed., Proceedings of the fundamental aspect of a folk psychology—actions of per- 5th European Workshop on Learning Robots. Bari, Italy. sons are understood within a framework involving goals and Fodor, J. A., (1987). Psychosemantics: The Problem of Meaning in intentions. the Philosophy of Mind. Cambridge, MA: MIT Press. Gopnik, A., and A. N. Meltzoff. (1997). Words, Thoughts, and Imitation illuminates the nature of preverbal memory Theories. Cambridge, MA: MIT Press. (Meltzoff 1995b). In these tests infants are shown a series of Heyes, C. M., and B. G. Galef. (1996). Social Learning in Ani- acts on novel objects but are not allowed to touch the objects. mals: The Roots of Culture. New York: Academic Press. A delay of hours or weeks is then imposed. Infants from six Jeannerod, M., and J. Decety. (1995). Mental motor imagery: a to fifteen months of age have been shown to perform window into the representational stages of action. Current deferred imitation after the delay, which establishes prever- Opinion in Neurobiology 5: 727–732. bal recall memory, not simply recognition of the objects. The Kuhl, P. K., and A. N. Meltzoff. (1996). Infant vocalizations in findings suggest that infants operate with what cognitive response to speech: vocal imitation and developmental neuroscientists call declarative memory as opposed to proce- change. Journal of the Acoustical Society of America 100: dural or habit memory (Sherry and Schacter 1987; Squire, 2425–2438. Implicature 391 McDonough, L., J. M. Mandler, R. D. McKee, and L. R. Squire. 
(1995). The deferred imitation task as a nonverbal measure of declarative memory. Proceedings of the National Academy of Sciences 92: 7580–7584.
Meltzoff, A. N. (1995a). Understanding the intentions of others: re-enactment of intended acts by 18-month-old children. Developmental Psychology 31: 838–850.
Meltzoff, A. N. (1995b). What infant memory tells us about infantile amnesia: Long-term recall and deferred imitation. Journal of Experimental Child Psychology 59: 497–515.
Meltzoff, A. N., and M. K. Moore. (1977). Imitation of facial and manual gestures by human neonates. Science 198: 75–78.
Meltzoff, A. N., and M. K. Moore. (1983). Newborn infants imitate adult facial gestures. Child Development 54: 702–709.
Meltzoff, A. N., and M. K. Moore. (1995). Infants’ understanding of people and things: from body imitation to folk psychology. In J. Bermudez, A. J. Marcel, and N. Eilan, Eds., The Body and the Self. Cambridge, MA: MIT Press, pp. 43–69.
Meltzoff, A. N., and M. K. Moore. (1997). Explaining facial imitation: a theoretical model. Early Development and Parenting 6: 179–192.
Meltzoff, A. N., and M. K. Moore. (1998). Object representation, identity, and the paradox of early permanence: steps toward a new framework. Infant Behavior and Development 21: 201–235.
Piaget, J. (1962). Play, Dreams and Imitation in Childhood. New York: Norton.
Prinz, W. (1990). A common coding approach to perception and action. In O. Neumann and W. Prinz, Eds., Relationships Between Perception and Action. Berlin: Springer, pp. 167–201.
Rizzolatti, G., L. Fadiga, V. Gallese, and L. Fogassi. (1996). Premotor cortex and the recognition of motor actions. Cognitive Brain Research 3: 131–141.
Sherry, D. F., and D. L. Schacter. (1987). The evolution of multiple memory systems. Psychological Review 94: 439–454.
Squire, L. R., B. Knowlton, and G. Musen. (1993). The structure and organization of memory. Annual Review of Psychology 44: 453–495.
Tomasello, M., and J. Call. (1997). Primate Cognition. New York: Oxford University Press.
Tomasello, M., A. C. Kruger, and H. H. Ratner. (1993). Cultural learning. Behavioral and Brain Sciences 16: 495–552.

Further Readings

Barr, R., A. Dowden, and H. Hayne. (1996). Developmental changes in deferred imitation by 6- to 24-month-old infants. Infant Behavior and Development 19: 159–170.
Bauer, P. J., and J. M. Mandler. (1992). Putting the horse before the cart: The use of temporal order in recall of events by one-year-old children. Developmental Psychology 28: 441–452.
Braten, S. (1998). Intersubjective Communication and Emotion in Early Ontogeny. New York: Cambridge University Press.
Campbell, J. (1994). Past, Space, and Self. Cambridge, MA: MIT Press.
Cole, J. (1998). About Face. Cambridge, MA: MIT Press.
Decety, J., J. Grezes, N. Costes, D. Perani, M. Jeannerod, E. Procyk, F. Grassi, and F. Fazio. (1997). Brain activity during observation of actions: influence of action content and subject’s strategy. Brain 120: 1763–1777.
Gallagher, S. (1996). The moral significance of primitive self-consciousness. Ethics 107: 129–140.
Gallagher, S., and A. N. Meltzoff. (1996). The earliest sense of self and others: Merleau-Ponty and recent developmental studies. Philosophical Psychology 9: 211–233.
Meltzoff, A. N. (1990). Towards a developmental cognitive science: the implications of cross-modal matching and imitation for the development of representation and memory in infancy. In A. Diamond, Ed., The Development and Neural Bases of Higher Cognitive Functions. New York: Annals of the New York Academy of Sciences, vol. 608, 1–31.
Meltzoff, A. N., and A. Gopnik. (1993). The role of imitation in understanding persons and developing a theory of mind. In S. Baron-Cohen, H. Tager-Flusberg, and D. J. Cohen, Eds., Understanding Other Minds: Perspectives from Autism. New York: Oxford University Press, pp. 335–366.
Nadel, J., and G. E. Butterworth. (1998). Imitation in Infancy: Progress and Prospects of Current Research. Cambridge: Cambridge University Press.
Rochat, P. (1995). The Self in Early Infancy: Theory and Research. Amsterdam: North-Holland/Elsevier Science.
Visalberghi, E., and D. M. Fragaszy. (1990). Do monkeys ape? In S. Parker and K. Gibson, Eds., Language and Intelligence in Monkeys and Apes: Comparative Developmental Perspectives. New York: Cambridge University Press, pp. 247–273.
Whiten, A., and R. Ham. (1992). On the nature and evolution of imitation in the animal kingdom: reappraisal of a century of research. In P. Slater, J. Rosenblatt, C. Beer, and M. Milinski, Eds., Advances in the Study of Behavior, vol. 21. New York: Academic Press, pp. 239–283.

Implicature

Implicature is a nonlogical inference constituting part of what is conveyed by S[peaker] in uttering U within context C, without being part of what is said in U. As stressed by H. PAUL GRICE (1989), what is conveyed is generally far richer than what is directly expressed; linguistic MEANING radically underdetermines utterance interpretation. Pragmatic principles must be invoked to bridge this gap, for example, (1) (Harnish 1976: 340):

1. Make the strongest relevant claim justifiable by your evidence.

Precursors of (1) were proposed by Augustus De Morgan and John Stuart Mill in the mid-nineteenth century, and by P. F. Strawson and Robert Fogelin in the mid-twentieth (see Horn 1990), but the central contribution is Grice’s, along with the recognition that such principles are not simply observed but rather systematically exploited to generate nonlogically valid inferences. From S’s assertion that Some of the apples are ripe, H[earer] will tend to infer that (for all S knows) not all the apples are ripe, because if S had known all were ripe, she would have said so, given (1).

The first explicit and general account of such inferences is given by Grice (1961: §3), who distinguishes the nonentailment relations operative in (2):

2. a. She is poor but honest.
   a´. There is some contrast between her poverty and her honesty.
   b. Jones has beautiful handwriting and his English is grammatical. [Context: recommendation letter for philosophy job candidate]
   b´. Jones is no good at philosophy.
   c. My wife is either in the kitchen or in the bathroom.
   c´. I don’t know for a fact that my wife is in the kitchen.

Grice observes that although the inference in (2a,a´) cannot be cancelled (#She is poor but honest, but there’s no contrast between the two), it is detachable, because the same truth-conditional content is expressible in a way that detaches (removes) the inference: She is poor and honest. It is also irrelevant to truth conditions: (2a) is true if “she” is poor and honest, false otherwise. Such detachable but noncancelable inferences that are neither constitutive of what is said nor calculable in any general way from what is said are conventional implicata, related to pragmatic presuppositions (see PRAGMATICS). Indeed, classic instances of conventional implicature involve the standard pragmatic presupposition inducers: focus particles like even and too, truth-conditionally transparent verbs like manage to and bother to, and syntactic constructions like clefts. Because conventional implicata are non-truth-conditional aspects of the conventional meaning of linguistic expressions, which side of the semantics/pragmatics border they inhabit depends on whether pragmatics is identified with the non-truth-conditional (as on Gazdar’s intentionally oversimplified equation: pragmatics = semantics – truth conditions) or with the nonconventional. (Karttunen and Peters 1979 provide a formal compositional treatment of conventional implicature.)

The inferences associated with (2b,c) are nonconventional, being calculable from the utterance of such sentences in a particular context. In each case, the inference of the corresponding primed proposition is cancelable (either explicitly by appending material inconsistent with it, e.g., “but I don’t mean to suggest that . . .”, or by altering the context of utterance) but nondetachable, given that truth-conditionally equivalent expressions license the same inference. An utterance of (2b) “does not standardly involve the implication . . . attributed to it; it requires a special context to attach the implication to its utterance” (Grice 1961: 130), whereas the default inference from (2c), that S did not know in which of the two rooms his wife was located, is induced in the absence of a marked context (e.g., that of a game of hide-and-seek). (2b) exemplifies particularized conversational implicature, (2c) the more linguistically significant category of generalized conversational implicature. In each case, it is not the proposition or sentence, but the speaker or utterance, that induces the relevant implicatum in the appropriate context.

Participants in a conversational exchange compute what was meant (by S’s uttering U in context C) from what was said by assuming the application of the cooperative principle (“Make your conversational contribution such as is required, at the stage at which it occurs”; Grice 1989: 26) and the four general and presumably universal maxims of conversation on which all rational interchange is grounded:

3. Maxims of Conversation (Grice [1967]1989: 26–27):

Quality: Try to make your contribution one that is true.
1. Do not say what you believe to be false.
2. Do not say that for which you lack evidence.

Quantity:
1. Make your contribution as informative as is required (for the current purposes of the exchange).
2. Do not make your contribution more informative than is required.

Relation: Be relevant.

Manner: Be perspicuous.
1. Avoid obscurity of expression.
2. Avoid ambiguity.
3. Be brief. (Avoid unnecessary prolixity.)
4. Be orderly.

Although implicata generated by the first three categories of maxims are computed from propositional content (what is said), Manner applies to the way what is said is said; thus the criterion of nondetachability applies only to implicata induced by the “content” maxims.

Since this schema was first proposed, it has been challenged, as well as defended and extended, on conceptual and empirical grounds (Keenan 1976; Brown and Levinson 1987), while neo- and post-Gricean pragmaticists have directed a variety of reductionist efforts at the inventory of maxims. The first revisionist was Grice himself, maintaining that not all maxims are created equal, with a privileged status accorded to Quality (though see Sperber and Wilson 1986 for a dissenting view): “False information is not an inferior kind of information; it just is not information . . . . The importance of at least the first maxim of Quality is such that it should not be included in a scheme of the kind I am constructing; other maxims come into operation only on the assumption that this maxim of Quality is satisfied” (Grice 1989: 371).

Of those “other maxims,” the most productive is Quantity-1, which is systematically exploited to yield upper-bounding generalized conversational implicatures associated with scalar operators (Horn 1972, 1989; Gazdar 1979; Hirschberg 1985). Grice seeks to defend a conservative bivalent semantics; the shortfall between what standard logical semantics yields and what an intuitively adequate account of utterance meaning requires is addressed by a pragmatic framework grounded on the assumption that S and H are observing the cooperative principle (CP) and the attendant maxims. Quantity-based scalar implicature in particular (my inviting you to infer from my use of some . . . that for all I know not all . . .) is driven by your knowing (and my knowing your knowing) that I expressed a weaker proposition, bypassing an equally unmarked utterance that would have expressed a stronger proposition, one unilaterally entailing it. What is said in the use of a weak scalar value is the lower bound (at least some, at least possible), with the upper bound (at most some, at most necessary) implicated as a cancelable inference (some and possibly all, possible if not necessary). Negating such predications denies the lower bound: to say that something is not possible is to say that it is impossible (less than possible). When the upper bound is denied (It’s not possible, it’s necessary), grammatical and phonological diagnostics reveal a metalinguistic or echoic use of negation, in which the negative particle is used to object to any aspect of a quoted utterance, including its conventional and conversational implicata, register, morphosyntactic form or pronunciation (Horn 1989; Carston 1996).

One focus of pragmatic research has been on the interaction of implicature with grammar and the LEXICON, in particular on the conventionalization or grammaticalization of conversational implicatures (see Bach and Harnish 1979 on standardized nonliterality). Another issue is whether Grice’s inventory of maxims is necessary and sufficient. One response is a proposal (Horn 1984, 1989; see also Levinson 1987) to collapse the non-Quality maxims into two basic principles regulating the economy of linguistic information. The Q Principle, a hearer-based guarantee of the sufficiency of informative content (“Say as much as you can, modulo Quality and R”), collects Quantity-1 and Manner-1,2. It is systematically exploited (as in the scalar cases) to generate upper-bounding implicata, based on H’s inference from S’s failure to use a more informative and/or briefer form that S was not in a position to do so. The R Principle, a correlate of the law of least effort dictating minimization of form (“Say no more than you must, modulo Q”), encompasses Relation, Quantity-2, and Manner-3,4. It is exploited to induce strengthening (lower-bounding) implicata typically motivated on social rather than purely linguistic grounds, as exemplified by indirect speech acts (e.g., euphemism) and so-called neg-raising, as in the tendency to pragmatically strengthen “I don’t think that φ” to “I think that not-φ.”

A more radically reductionist model is offered in relevance theory (RT; Sperber and Wilson 1986), in which one suitably elaborated principle of RELEVANCE suffices for the entire gamut of inferential work performed by the Gricean maxims. (The monistic approach of RT is closer to the dualistic Q/R model than it appears, both frameworks being predicated on a minimax or cost/benefit tradeoff that sees the goal of communication as maximizing contextual effects while minimizing processing effort.) RT stresses the radical underspecification of propositional content by linguistic meaning; pragmatically derived aspects of meaning include not only implicatures but “explicatures,” that is, components of enriched truth-conditional content.

Although the issues surrounding the division of labor between Gricean implicature and RT-explicature await resolution (see Horn 1992; Récanati 1993), relevance theory has proved a powerful construct for rethinking the role of pragmatic inferencing in utterance interpretation and other aspects of cognitive structure; see Blakemore (1992) for an overview.

See also DISCOURSE; LANGUAGE AND COMMUNICATION; LANGUAGE AND THOUGHT; METAREPRESENTATION; SEMANTICS

—Laurence Horn

References

Bach, K., and R. M. Harnish. (1979). Linguistic Communication and Speech Acts. Cambridge, MA: MIT Press.
Blakemore, D. (1992). Understanding Utterances. Oxford: Blackwell.
Brown, P., and S. C. Levinson. (1987). Politeness: Some Universals in Language Usage. Cambridge: Cambridge University Press.
Carston, R. (1996). Metalinguistic Negation and Echoic Use. Journal of Pragmatics 25: 309–330.
Gazdar, G. (1979). Pragmatics. New York: Academic Press.
Grice, H. P. (1961). The causal theory of perception. Proceedings of the Aristotelian Society, sup. vol. 35: 121–152.
Grice, H. P. (1989). Studies in the Way of Words. Cambridge: Harvard University Press.
Harnish, R. M. (1976). Logical form and implicature. In S. Davis, Ed., Pragmatics: A Reader. New York: Oxford University Press, pp. 316–364.
Hirschberg, J. (1985). A Theory of Scalar Implicature. Ph.D. diss., University of Pennsylvania.
Horn, L. (1972). On the Semantic Properties of Logical Operators in English. Ph.D. diss., UCLA. Distributed by Indiana University Linguistics Club, 1976.
Horn, L. (1984). Toward a new taxonomy for pragmatic inference: Q-based and R-based implicature. In D. Schiffrin, Ed., Meaning, Form, and Use in Context (GURT ’84). Washington: Georgetown University Press, pp. 11–42.
Horn, L. (1989). A Natural History of Negation. Chicago: University of Chicago Press.
Horn, L. (1990). Hamburgers and truth: Why Gricean inference is Gricean. Berkeley Linguistics Society 16: 454–471.
Horn, L. (1992). The said and the unsaid. Proceedings of the Second Conference on Semantics and Linguistic Theory: 163–192.
Karttunen, L., and S. Peters. (1979). Conventional implicature. In C.-K. Oh and D. Dinneen, Eds., Presupposition. Syntax and Semantics, vol. 11. New York: Academic Press, pp. 1–56.
Keenan, E. O. (1976). The universality of conversational postulates. Language in Society 5: 67–80.
Levinson, S. C. (1987). Minimization and conversational inference. In J. Verschueren and M. Bertucelli-Papi, Eds., The Pragmatic Perspective. Amsterdam: John Benjamins, pp. 61–130.
Récanati, F. (1993). Direct Reference. Oxford: Blackwell.
Sperber, D., and D. Wilson. (1986). Relevance. Cambridge, MA: Harvard University Press.

Further Readings

Atlas, J. D., and S. C. Levinson. (1981). It-clefts, informativeness, and logical form. In P. Cole, Ed., Radical Pragmatics. New York: Academic Press, pp. 1–61.
Bach, K. (1994). Conversational implicature. Mind and Language 9: 125–162.
Carston, R. (1988). Implicature, explicature, and truth-theoretic semantics. In S. Davis, Ed., Pragmatics: A Reader. New York: Oxford University Press, pp. 33–51.
Cole, P., Ed. (1978). Syntax and Semantics 9: Pragmatics. New York: Academic Press.
Davis, S., Ed. (1991). Pragmatics: A Reader. New York: Oxford University Press.
Green, G. (1989). Pragmatics and Natural Language Understanding. Hillsdale, NJ: Erlbaum.
Green, G. (1990). The universality of Gricean explanation. Berkeley Linguistics Society 16: 411–428.
Levinson, S. C. (1983). Pragmatics. Cambridge: Cambridge University Press.
Levinson, S. C. (1987). Pragmatics and the grammar of anaphora: a partial pragmatic reduction of binding and control phenomena. Journal of Linguistics 23: 379–434.
McCawley, J. D. (1978). Conversational implicature and the lexicon. In P. Cole, Ed., Syntax and Semantics 9: Pragmatics. New York: Academic Press, pp. 245–258.
Morgan, J. (1978). Two types of convention in indirect speech acts. In P. Cole, Ed., Syntax and Semantics 9: Pragmatics. New York: Academic Press, pp. 261–280.
Neale, S. (1992). Paul Grice and the philosophy of language. Linguistics and Philosophy 15: 509–559.
Récanati, F. (1989). The pragmatics of what is said. In S. Davis, Ed., Pragmatics: A Reader. New York: Oxford University Press, pp. 97–120.
Sadock, J. M. (1978). On testing for conversational implicature. In P. Cole, Ed., Syntax and Semantics 9: Pragmatics. New York: Academic Press, pp. 281–298.
Walker, R. (1975). Conversational implicatures. In S. Blackburn, Ed., Meaning, Reference, and Necessity. Cambridge: Cambridge University Press, pp. 133–181.
Wilson, D., and D. Sperber. (1986). Inference and implicature. In S. Davis, Ed., Pragmatics: A Reader. New York: Oxford University Press, pp. 377–392.

Implicit vs. Explicit Memory

Psychological studies of human MEMORY have traditionally been concerned with conscious recollection or explicit memory for specific facts and episodes. During recent years, there has been growing interest in a nonconscious form of memory, referred to as implicit memory (Graf and Schacter 1985; Schacter 1987), that does not require explicit recollection of specific episodes. Numerous experimental investigations have revealed dramatic differences between implicit and explicit memory, which have had a major impact on psychological theories of the processes and systems involved in human memory (cf. Roediger 1990; Schacter and Tulving 1994; Ratcliff and McKoon 1997).

The hallmark of implicit memory is a change in performance, attributable to information acquired during a specific prior episode, on a test that does not require conscious recollection of the episode. This change is often referred to as direct or repetition priming. One example of a test used to assess priming is known as stem completion, where people are asked to complete word stems (e.g., TAB) with the first word that comes to mind (e.g., TABLE); priming is inferred from an enhanced tendency to complete the stems with previously studied words relative to nonstudied words (for reviews, see Roediger and McDermott 1993; Schacter and Buckner 1998; Schacter, Chiu, and Ochsner 1993). Priming is not the only type of implicit memory. For instance, tasks in which people learn to perform motor or cognitive skills may involve implicit memory, because skill acquisition does not require explicit recollection of a specific previous episode (for review, see Salmon and Butters 1995).

Implicit memory can be separated or dissociated from explicit memory through experimental manipulations that affect implicit and explicit memory differently (for methodological considerations, see Jacoby 1991; Schacter, Bowers, and Booker 1989), and neurological conditions in which explicit memory is impaired while implicit memory is spared. For example, it has been well established that performance on explicit recall and recognition tests is higher following semantic than following nonsemantic study of an item (the well-known levels-of-processing effect). In contrast, however, the magnitude of priming on tasks that involve completing word stems or identifying briefly flashed words is often less affected, and sometimes unaffected, by semantic versus nonsemantic study (see reviews by Roediger and McDermott 1993; Schacter, Chiu, and Ochsner 1993).

Perhaps the most dramatic dissociation between implicit and explicit memory has been provided by studies of brain-damaged patients with organic amnesia. Amnesic patients are characterized by a severe impairment in explicit memory for recent events, despite relatively normal intelligence, perception, and language. This memory deficit is typically produced by lesions to either medial temporal or diencephalic brain regions. In contrast, a number of studies have demonstrated that amnesic patients show intact implicit memory on tests of priming and skill learning. These observations suggest that implicit memory is supported by different brain systems than is explicit memory (Squire 1992).

The evidence also shows that various forms of implicit memory can be dissociated from one another. A number of studies point toward a distinction between perceptual priming and conceptual priming. Perceptual priming is little affected by semantic versus nonsemantic study processing. It is also modality specific (i.e., priming is enhanced when study and test share the same sensory modality), and in some instances may be specific to the precise physical format of stimuli at study and test (cf. Church and Schacter 1994; Curran, Schacter, and Bessenoff 1996; Graf and Ryan 1990; Tenpenny 1995). Conceptual priming, in contrast, is not tied to a particular sensory modality and is increased by semantic study processing; it is observed most clearly on tests that require semantic analysis, such as producing category exemplars in response to a category cue (Hamman 1990). Other evidence indicates that priming and skill learning are dissociable forms of implicit memory. For instance, studies of patients suffering from different forms of dementia indicate that patients with Alzheimer’s dementia often show impaired priming on stem completion tasks, yet show normal learning of motor skills; in contrast, patients with Huntington’s disease (which affects the motor system) show normal stem completion priming together with impaired learning of motor skills (Salmon and Butters 1995).

Recent studies using newly developed brain imaging techniques, such as POSITRON EMISSION TOMOGRAPHY (PET) and functional MAGNETIC RESONANCE IMAGING (fMRI), have demonstrated that visual priming on such tests as stem completion is accompanied by decreased blood flow in regions of visual cortex (Squire et al. 1992). Various other neuroimaging studies have produced similar priming-related blood flow decreases (see Schacter and Buckner 1998). These observations are consistent with the idea that visual priming effects depend on a perceptual representation system that includes posterior cortical regions that are involved in perceptual analysis (Tulving and Schacter 1990); priming may produce more efficient processing of test cues, perhaps resulting in decreased neural activity. Other studies have shown that conceptual priming is associated with blood flow reductions in regions of left inferior frontal cortex that are known to be involved in semantic processing (for review, see Schacter and Buckner 1998). In contrast, neuroimaging studies of motor skill learning have shown that the development of skill across many sessions of practice is accompanied by increased activity in regions involved in motor processing, such as motor cortex (Karni et al. 1995). Neuroimaging studies are also beginning to illuminate the networks of structures involved in explicit remembering of recent episodes, including the regions within the prefrontal cortex and medial temporal lobes (e.g., Buckner et al. 1995; Schacter et al. 1996; Tulving et al. 1994; for review, see Cabeza and Nyberg 1997).

The exploration of implicit memory has opened up new vistas for memory research, providing a vivid reminder that many aspects of memory are expressed through means other than conscious, explicit recollection of past experiences. Nonetheless, a great deal remains to be learned about the cognitive and neural mechanisms of implicit memory, and it seems likely that further empirical study and theoretical analysis will continue to pay handsome dividends in the future.

See also AGING, MEMORY, AND THE BRAIN; EBBINGHAUS, HERMANN; EPISODIC VS. SEMANTIC MEMORY; MEMORY, HUMAN NEUROPSYCHOLOGY; MEMORY STORAGE, MODULATION OF; MOTOR LEARNING

Salmon, D. P., and N. Butters. (1995). Neurobiology of skill and habit learning. Current Opinion in Neurobiology 5: 184–190.
Schacter, D. L. (1987). Implicit memory: history and current status. Journal of Experimental Psychology: Learning, Memory, and Cognition 13: 501–518.
Schacter, D. L., N. M. Alpert, C. R. Savage, S. L. Rauch, and M. S. Albert. (1996). Conscious recollection and the human hippocampal formation: evidence from positron emission tomography. Proceedings of the National Academy of Sciences 93: 321–325.
Schacter, D. L., J. Bowers, and J. Booker. (1989). Awareness, intention and implicit memory: the retrieval intentionality criterion. In S. Lewandowsky, J. C. Dunn, and K. Kirsner, Eds., Implicit Memory: Theoretical Issues.
Hillsdale, NJ: Erlbaum, —Daniel L. Schacter pp. 47–69. Schacter, D. L., and R. L. Buckner. (1998). Priming and the brain. Neuron 20: 185–195. References Schacter, D. L., C. Y. P. Chiu, and K. N. Ochsner. (1993). Implicit Buckner, R. L., S. E. Petersen, J. G. Ojemann, F. M. Miezin, L. R. memory: a selective review. Annual Review of Neuroscience Squire, and M. E. Raichle. (1995). Functional anatomical stud- 16: 159–182. ies of explicit and implicit memory retrieval tasks. Journal of Schacter, D. L., and E. Tulving. (1994). What are the memory sys- Neuroscience 15: 12–29. tems of 1994? In D. L. Schacter and E. Tulving, Eds., Memory Cabeza, R., and L. Nyberg. (1997). Imaging cognition: an empiri- Systems. Cambridge, MA: MIT Press, pp. 1–38. cal review of PET studies with normal subjects. Journal of Squire, L. R. (1992). Memory and the hippocampus: a synthesis Cognitive Neuroscience 9: 1–26. from findings with rats, monkeys and humans. Psychological Church, B. A., and D. L. Schacter. (1994). Perceptual specificity of Review 99: 195–231. auditory priming: implicit memory for voice intonation and Squire, L. R., J. G. Ojemann, F. M. Miezin, S. E. Petersen, T. O. fundamental frequency. Journal of Experimental Psychology: Videen, and M. E. Raichle. (1992). Activation of the hippocam- Learning, Memory, and Cognition 20: 521–533. pus in normal humans: a functional anatomical study of mem- Curran, T., D. L. Schacter, and G. Bessenoff. (1996). Visual speci- ory. Proceedings of the National Academy of Sciences 89: ficity effects on word stem completion: beyond transfer appro- 1837–1841. priate processing? Canadian Journal of Experimental Tenpenny, P. L. (1995). Abstractionist versus episodic theories of Psychology 50: 22–33. repetition priming and word identification. Psychonomic Bulle- Gabrieli, J. D. E., D. A. Fleischman, M. M. Keane, S. L. Reminger, tin and Review 2: 339–363. and F. Morrell. (1995). Double dissociation between memory Tulving, E., S. Kapur, H. J. 
Markowitsch, F. I. M. Craik, R. Habib, systems underlying explicit and implicit memory in the human and S. Houle (1994). Neuroanatomical correlates of retrieval in brain. Psychological Science 6: 76–82. episodic memory: auditory sentence recognition. Proceedings Graf, P., and L. Ryan. (1990). Transfer-appropriate processing for of the National Academy of Sciences 91: 2012–2015. implicit and explicit memory. Journal of Experimental Psy- Tulving, E., and D. L. Schacter. (1990). Priming and human mem- chology: Learning, Memory, and Cognition 16: 978–992. ory systems. Science 247: 301–306. Graf, P., and D. L. Schacter. (1985). Implicit and explicit memory for new associations in normal subjects and amnesic patients. Incompleteness Journal of Experimental Psychology: Learning, Memory, and Cognition 11: 501–518. Hamman, S. B. (1990). Level-of-processing effects in conceptually See FORMAL SYSTEMS, PROPERTIES OF; GÖDEL’S THEOREMS driven implicit tasks. Journal of Experimental Psychology: Learning, Memory, and Cognition 16: 970–977. Indexicals and Demonstratives Jacoby, L. L. (1991). A process dissociation framework: separating automatic from intentional uses of memory. Journal of Memory and Language 30: 513–541. When you use “I,” you refer to yourself. When I use it, I Karni, A., G. Meyer, P. Jezzard, M. M. Adams, R. Turner, and L. refer to myself. We use the same linguistic expression with G. Ungerleider. (1995). Functional MRI evidence for adult the same conventional meaning. It is a matter of who uses motor cortex plasticity during motor skill learning. Nature 377: it that determines who is the referent. Moreover, when Jon, 155–158. pointing to Sue, says “she” or “you,” he refers to Sue, Nyberg, L., A. R. McIntosh, S. Houle, L -G. Nilsson, and E. Tulv- whereas Sue can neither use “she” nor “you” to refer to ing. (1996). Activation of medial temporal structures during episodic memory retrieval. Nature 380: 715–717. herself, unless she is addressing an image of herself. 
If we Ratcliff, R., and G. McKoon. (1997). A counter model for implicit change the context—the speaker, time, place, addressee, or priming in perceptual word identification. Psychological audience—in which these expressions occur, we may end Review 104: 319–343. up with a different referent. Among the expressions that Roediger, H. L. (1990). Implicit memory: retention without may switch reference with a change in context are per- remembering. American Psychologist 45: 1043–1056. sonal pronouns (my, you, she, his, we, . . .), demonstrative Roediger, H. L., and K. B. McDermott. (1993). Implicit memory in pronouns (this, that), compound demonstratives (this normal human subjects. In H. Spinnler and F. Boller, Eds., table, that woman near the window, . . .), adverbs (today, Handbook of Neuropsychology, vol. 8. Amsterdam: Elsevier, yesterday, now, here, . . .), adjectives (actual and present), pp. 63–131. 396 Indexicals and Demonstratives These general features of token reflexive expressions possessive adjectives (my pen, their house, . . .). The refer- depend on their particular linguistic meaning: “the utterer of ence of other words (e.g., local, Monday, . . .) seems also this token” is the linguistic meaning (the character, Kaplan to depend on the context in which they occur. These words 1977, or role, Perry 1977) of “I,” whereas “the day in which capture the interest of those working within the boundaries this token is uttered” is the linguistic meaning of “today,” of cognitive science for several reasons: they play crucial and so on. As such, their linguistic meaning can be viewed roles when dealing with such puzzling notions as the as a function taking as argument the context and giving as nature of the SELF, the nature of perception, the nature of value the referent (Kaplan 1977). time, and so forth. 
It is often the case, though, that the linguistic MEANING Reichenbach (1947) characterized this class of expres- sions token reflexive and argued that they can be defined in of expressions like “this,” “that,” “she,” and so on, together terms of the locution “this token,” where the latter (reflex- with context, is not enough to select a referent. These ively) self-refers to the very token used. So, “I” can be expressions are often accompanied by a pointing gesture or defined in terms of “the person who utters this token,” demonstration and the referent will be what the demonstra- “now” in terms of “the time at which this token is uttered,” tion demonstrates. Kaplan (1977) distinguishes between “this table” in terms of “the table pointed to by a gesture pure indexicals (“I,” “now,” “today,”. . .) and demonstra- accompanying this token,” and so on. tives (“this,” “she,” . . .). The former, unlike the latter, do not One of the major features of token reflexive expres- need a demonstration to secure the refererence. sions—also called indexical expressions—that differentiates Another way to understand the distinction between pure them from other referential expressions (e.g., proper names: indexicals and demonstratives is to argue that the latter, “Socrates,” “Paris”; mass terms: “gold,” “water”; terms for unlike pure indexicals, are perception based. When one says species: “tiger,” “rose”; and so on) is that they are usually “I” or “today,” one does not have to perceive oneself or the used to make reference in praesentia. That is, a use of a relevant day to be able to use and understand these expres- token reflexive expression exploits the presence of the refer- sions competently. To use and understand “this,” “she,” and ent. In a usual communicative interaction the referent is in the like competently, one ought to perceive the referent or the perceptual field of the speaker and some contextual demonstratum. 
For this very reason, when a pure indexical clues are used to make the referent salient. is involved, the context of reference fixing and the context When token reflexive expressions are not used to make of utterance cannot diverge: the reference of a pure indexi- reference in praesentia they exploit a previously fixed refer- cal, unlike the reference of a demonstrative, cannot be fixed ence. “That man” in “That man we saw last night is hand- by a past perception. some” does not refer to a present man. The use of token Moreover, a demonstrative, unlike a pure indexical, can reflexive expressions to make reference in absentia forces be a vacuous term. “Today,” “I,” and so on never miss the the distinction between the context of utterance and the con- referent. Even if I do not know whether today is Monday or text of reference fixing. In our example, to fix the reference Tuesday and I am amnesiac, if I say “Today I am tired,” I the speaker and the hearer appeal to a past context. The gap refer to the relevant day and myself. By contrast, if halluci- between the two contexts is bridged by memory. nating one says “She is funny,” or pointing to a man, “This The general moral seems to be that the paradigmatic use car is green,” “she” and “this car” are vacuous. of a token reflexive expression cannot be deferential. Besides, pure indexicals cannot be coupled with sortal Although one often relies on the so-called division of lin- predicates, whereas “this” and “that” are often accompanied guistic labor when using nontoken reflexive expressions, by sortal predicates to form compound demonstratives like one cannot appeal to the same phenomenon when using a “this book” and “that water.” Sortal predicates can be con- token reflexive expression: for example, one can compe- sidered to be universe narrowers which, coupled with other tently use “Spiro Agnew” or “roadrunner” even if one does contextual clues, help us fix the reference. 
If when pointing not know who Spiro Agnew is or is unable to tell a roadru- to a bottle one says “This liquid is green,” the sortal “liquid” nner from a rabbit. This parallels the fact that when using helps us to fix the liquid and not the bottle as referent. proper names, mass terms, and the like, context is in play Moreover, personal pronouns that work like demonstratives before the name is used: we first fix the context and then (e.g., “she,” “he,” “we,” . . .) have a built-in or hidden sortal. use the name, whereas with token reflexive expressions “She,” unlike “he,” refers to a female, whereas “we” refers context is at work the very moment we use them. As Perry to a plurality of people among whom is the speaker. (1997) suggests, we often use context to disambiguate a See also CONTEXT AND POINT OF VIEW; PRAGMATICS; mark or noise (e.g., “bank;” “Aristotle” used either as a tag REFERENCE, THEORIES OF; SENSE AND REFERENCE for the philosopher or for Onassis). These are presemantic —Eros Corazza uses of context. With token reflexive expressions, though, context is used semantically. It remains relevant after the References language, words, and meaning are all known; the meaning directs us to certain aspects of context. This distinction Biro, J. (1982). Intention, demonstration, and reference. Philoso- reflects on the fact that proper names, mass terms, and so phy and Phenomenological Research XLIII(1): 35–41. on, unlike token reflexive expressions, contribute to build- Castañeda, H. (1966). “He”: a study in the logic of self-conscious- ing context-free (eternal) sentences, that is, sentences that ness. Ratio 8(2): 130–157. are true or false independently of the context in which they Castañeda, H. (1967). Indicators and quasi-indicators. American are used. Philosophical Quarterly 4(2): 85–100. Individualism 397 Individualism Evans, G. (1981). Understanding demonstratives. In G. Evans (1985), Collected Papers. Oxford: Oxford University Press, pp. 291–321. Frege, G. 
(1918/1988). Thoughts. In N. Salmon and S. Soaemes, Individualism is a view about how psychological states are Eds., Propositions and Attitudes. Original work published taxonomized that has been claimed to constrain the cogni- 1918. Oxford: Oxford University Press, pp. 33–55. tive sciences, a claim that remains controversial. Individual- Kaplan, D. (1977/1989). Demonstratives. In J. Almog et al., Eds., ists view the distinction between the psychological states of Themes from Kaplan. Original work published 1977. Oxford: individuals and the physical and social environments of Oxford University Press, pp. 481–563. those individuals as providing a natural basis for demarcat- Kaplan, D. (1989). Afterthoughts. In J. Almog et al., Eds., ing properly scientific, psychological kinds. Psychology in Themes from Kaplan. Oxford: Oxford University Press, pp. particular and the cognitive sciences more generally are to 565–614. Lewis, D. (1979). Attitudes de dicto and de se. The Philosophical be concerned with natural kinds whose instances end at the Review 88: 513–543. Reprinted in D. Lewis (1983), Philosoph- boundary of the individual. Thus, although individualism is ical Papers: vol. 1. Oxford: Oxford University Press. sometimes glossed as the view that psychological states are Perry, J. (1977). Frege on demonstratives. The Philosophical “in the head,” it is a more specific and stronger view than Review 86(4): 476–497. Reprinted in J. Perry (1993), The Prob- suggested by such a characterization. Individualism is lem of the Essential Indexical and Other Essays. Oxford: sometimes (e.g., van Gulick 1989) called “internalism,” and Oxford University Press. its denial “externalism.” Its acceptance or rejection has Perry, J. (1979). The problem of the essential indexical. Nous implications for accounts of MENTAL REPRESENTATION, 13(1): 3–21. Reprinted in J. Perry (1993), The Problem of the MENTAL CAUSATION, and SELF-KNOWLEDGE. Essential Indexical and Other Essays. 
Oxford: Oxford Univer- The dominant research traditions in cognitive science sity Press. Perry, J. (1997). Indexicals and demonstratives. In R. Hale and C. have been at least implicitly individualistic. Relatively Wright, Eds., Companion to the Philosophy of Language. explicit statements of a commitment to an individualistic Oxford: Blackwell, pp. 586–612. view of aspects of cognitive science include Chomsky’s Reichenbach, H. (1947). Elements of Symbolic Logics. New York: (1986, 1995) deployment of the distinction between two con- Free Press, pp. 284–286. ceptions of language (the “I”-language and the “E”-language, Yourgrau, P., Ed. (1990). Demonstratives. Oxford: Oxford Univer- for “internal” and “external,” respectively), Jackendoff’s sity Press. (1991) related, general distinction between “psychological” and “philosophical” conceptions of the mind, and Cosmides Further Readings and Tooby’s (1994) emphasis on the constructive nature of our internal, evolutionary-specialized cognitive modules. Austin, D. (1990). What’s the Meaning of This? Ithaca: Cornell Individualism is controversial for at least three reasons. University Press. First, individualism appears incompatible with FOLK PSY- Bach, K. (1987). Thought and Reference. Oxford: Clarendon CHOLOGY and, more controversially, with a commitment to Press. INTENTIONALITY more generally. Much contemporary cog- Boër, S. E., and W. G. Lycan. (1980). Who, me? The Philosophical nitive science incorporates and builds on the basic concepts Review 89 (3): 427–466. of the folk (e.g., belief, memory, perception) and at least Burks, A. W. (1949). Icon, index, and symbol. Philosophy and appears to rely on the notion of mental content. Second, Phenomenological Research 10: 673–689. despite the considerable intuitive appeal of individualism, Castañeda, H-N. (1989). Thinking, Language, and Experience. arguments for it have been less than decisive. Third, the Minneapolis: University of Minnesota Press. Chisholm, R. (1981). 
The First Person. Minneapolis: University of relationship between individualism and cognitive science Minnesota Press. has been seen to be more complicated than initially thought, Corazza, E. (1995). Référence, Contexte et Attitudes. Montréal and in part because of varying views of how to understand Paris: Bellarmin/Vrin. explanatory practice in the cognitive sciences. Evans, G. (1982). The Varieties of Reference. Oxford: Oxford Uni- One formulation of individualism, expressed in Putnam versity Press. (1975), but also found in the work of Carnap and early Mellor, D. H. (1989). I and now. Proceedings of the Aristotelian twentieth century German thinkers, is methodological Society 89: 79–84. Reprinted in D. H. Mellor (1991), Matters of solipsism. Following Putnam, Fodor views methodologi- Metaphysics. Cambridge: Cambridge University Press. cal solipsism as the doctrine that psychology ought to con- Numberg, G. (1993). Indexicality and deixis. Linguistics and Phi- cern itself only with narrow psychological states, where losophy 16: 1–43. Perry, J. (1986). Thoughts without representation. Proceedings of these are states that do not presuppose “the existence of the Aristotelian Society 60: 137–152. Reprinted in J. Perry, any individual other than the subject to whom that state is (1993), The problem of the Essential Indexical and Other ascribed” (Fodor 1980: 244). An alternative formulation Essays. Oxford: Oxford University Press. of individualism offered by Stich (1978), the principle of Récanati, F. (1993). Direct Reference. London: Blackwell. autonomy, says that “the states and processes that ought to Vallée, R. (1996). Who are we? Canadian Journal of Philosophy. be of concern to the psychologist are those that supervene Vol. 26, no. 2: 211–230. on the current, internal, physical state of the organism” Wettstein, H. (1981). Demonstrative reference and definite (Stich 1983: 164–165; see SUPERVENIENCE). Common to descriptions. Philosophical Studies 40: 241–257. 
Reprinted in both expressions is the idea that an individual’s psycho- H. Wettstein (1991), Has Semantics Rested on a Mistake? and logical states should be bracketed off from the mere, Other Essays. Stanford: Stanford University Press. 398 Individualism beyond-the-head environments that individuals find them- ence, the most powerful empirical arguments appeal to the selves in. computational nature of cognition. At times, such argu- Fodor and Stich used their respective principles to argue ments have involved detailed examination of particular the- for substantive conclusions about the scope and methodol- ories or research programs, most notably Marr’s theory of ogy of psychology and the cognitive sciences. Fodor con- vision (Egan 1992; Segal 1989; Shapiro 1997), the discus- trasted a solipsistic psychology with what he called a sion of which forms somewhat of an industry within philo- naturalistic psychology, arguing that the latter (among sophical psychology. A different sort of challenge to the which he included JAMES JEROME GIBSON’s approach to per- inference from computationalism to individualism is posed ception, learning theory, and the naturalism of WILLIAM by Wilson’s (1994) wide computationalism, a view accord- JAMES) was unlikely to prove a reliable research strategy in ing to which some of our cognitive, computational systems psychology. Stich argued for a syntactic or computational literally extend into the world (and so cannot be individual- theory of mind that made no essential use of the notion of istic). intentionality or mental content. 
Although these arguments Those rejecting individualism on empirical or method- themselves have not won widespread acceptance, many phi- ological grounds have appealed to the situated or embedded losophers interested in the cognitive sciences are attracted to nature of cognition, seeking more generally to articulate the individualism because of a perceived connection to FUNC- crucial role that an organism’s environment plays in its cog- nitive processing. For example, McClamrock (1995: chap. TIONALISM in the philosophy of mind, the idea being that 6) points to the role of improvisation in planning, and Wil- the functional or causal roles of psychological states are son (1999) points to the ways in which informational load is individualistic. shifted from organism to environment in metarepresentation. A point initially made in different ways by both Putnam But despite the suggestiveness of such appeals, the argu- and Stich—that our folk psychology violates individual- ments here remain relatively undeveloped; further attention ism—was developed by Tyler Burge (1979) as part of a to the relationships between culture, external symbols, and wide-ranging attack on individualism. Putnam had argued cognition (Donald 1991; Hutchins 1995) is needed. that “‘meanings’ just ain’t in the head” by developing a Further issues: First, faced with the prima facie conflict causal theory of reference for natural kind terms, introduc- between individualism and representational theories of the ing TWIN EARTH thought experiments into the philosophy of mind, including folk psychology, individualists have often mind. Stich identified intuitive cases (including two of Put- invoked a distinction between wide content and NARROW nam’s) in which our folk psychological ascriptions con- flicted with the principle of autonomy. Burge introduced CONTENT and argued that cognitive science should use only individualism as a term for an overall conception of the the latter. 
The adequacy of competing accounts of narrow mind, extended the Twin Earth argument from language to content, and even whether there is a notion of content that thought, and showed that the conflict between individualism meets the constraint of individualism, are matters of con- and folk psychology did not turn on a perhaps controversial tinuing debate. claim about the semantics of natural kind terms. Together Second, little of the debate over individualism in cogni- with Burge’s (1986) argument that DAVID MARR’s cele- tive science has reflected the increasing prominence of neu- brated computational theory of vision was not individualis- roscientific perspectives on cognition. Although one might tic, Burge’s early arguments have posed the deepest and assume that the various neurosciences must be individualis- most troubling challenges to individualism (cf. Fodor 1987: tic, this issue remains largely unexplored. chap. 2; 1994). Finally, largely unasked questions about the relationship Individualism is motivated by several clusters of power- between individualism in psychology and other “individual- ful intuitions. A “Cartesian” cluster that goes most naturally istic” views—for example, in evolutionary biology (Will- with the methodological solipsism formulation of individu- iams 1966; cf. Wilson and Sober 1994) and in the social alism revolves around the idea that an organism’s mental sciences (e.g., Elster 1986)—beckon discussion as the states could be just as they are even were its environment boundaries and focus of traditional cognitive science are radically different. Perhaps the most extreme case is that of challenged. 
Answering such questions will require discus- a brain-in-a-vat: were you a brain-in-a-vat, not an embodied sion of the place of the cognitive sciences among the sci- person actively interacting with the world, you could have ences more generally, and of general issues in the just the same psychological states that you have now, pro- philosophy of science. vided that you were supplied with just the right stimulation See also COMPUTATIONAL THEORY OF MIND; PHYSICAL- and feedback from the “vat” that replaces the world you are ISM; SITUATEDNESS/EMBEDDEDNESS actually in. A second, “physicalist” cluster that comple- —Robert A. Wilson ments the principle of autonomy formulation appeals to the idea that physically identical individuals must have the same References psychological states—again, no matter how different their environments. Burge, T. (1979). Individualism and the mental. In P. French, T. So much for the intuitions; what about the arguments? Uehling Jr., and H. Wettstein, Eds., Midwest Studies in Philoso- Empirical and methodological (rather than a priori) argu- phy, vol. 4 (Metaphysics). Minneapolis: University of Minne- ments for or against individualism are perhaps of most inter- sota Press. est to cognitive scientists (Wilson 1995: chaps. 2–5). Burge, T. (1986). Individualism and Psychology. Philosophical Because of the centrality of computation to cognitive sci- Review 95: 3–45. Induction 399 Chomsky, N. (1986). Knowledge of Language. New York: Praeger. Devitt, M. (1990). A narrow representational theory of mind. In W. Chomsky, N. (1995). Language and nature. Mind 104: 1–61. Lycan, Ed., Mind and Cognition: A Reader. New York: Black- Cosmides, L., and J. Tooby. (1994). Foreword to S. Baron-Cohen, well. Mindblindness. Cambridge, MA: MIT Press. Fodor, J. A. (1982). Cognitive science and the Twin Earth problem. Donald, M. (1991). Origins of the Modern Mind. Cambridge, MA: Notre Dame Journal of Formal Logic 23: 98–118. Harvard University Press. Houghton, D. 
(1997). Mental content and external representation. Egan, F. (1992). Individualism, computation, and perceptual con- Philosophical Quarterly 47: 159–177 tent. Mind 101: 443–459. Jacob, P. (1997). What Minds Can Do: Intentionality in a Non- Elster, J. (1986). An Introduction to Karl Marx. New York: Cam- Intentional World. New York: Cambridge University Press. bridge University Press. Kitcher, Pat. (1985). Narrow taxonomy and wide functionalism. Fodor, J. A. (1980). Methodological solipsism considered as a Philosophy of Science 52: 78–97. research strategy in cognitive psychology. Reprinted in his Rep- Lewis, D. L. (1994). Reduction of mind. In S. Guttenplan, Ed., Companion to the Philosophy of Mind. New York: Blackwell. resentations. Sussex: Harvester Press, 1981. Marr, D. (1982). Vision: A Computational Approach. San Fran- Fodor, J. A. (1987). Psychosemantics. Cambridge, MA: MIT cisco: Freeman. Press. McGinn, C. (1989). Mental Content. New York: Blackwell. Fodor, J. A. (1994). The Elm and the Expert. Cambridge, MA: Millikan, R. G. (1993). White Queen Psychology and Other Essays MIT Press. for Alice. Cambridge, MA: MIT Press. Hutchins, E. (1995). Cognition in the Wild. Cambridge, MA: MIT Patterson, S. (1991). Individualism and semantic development. Press. Philosophy of Science 58: 15–35. Jackendoff, R. (1991). The problem of reality. Reprinted in his Pylyshyn, Z. (1984). Computation and Cognition. Cambridge, Languages of the Mind. Cambridge, MA: MIT Press, 1992. MA: MIT Press. McClamrock, R. (1995). Existential Cognition: Computational Segal, G. (1991). Defense of a reasonable individualism. Mind Minds in the World. Chicago: University of Chicago Press. Putnam, H. (1975). The meaning of “meaning.” Reprinted in his 100: 485–494. Mind, Language, and Reality. New York: Cambridge Univer- Stalnaker, R. C. (1989). On what’s in the head. In J. Tomberlin, sity Press. Ed., Philosophical Perspectives 3. Atascadero, CA: Ridgeview. Segal, G. (1989). Seeing what is not there. 
Philosophical Review Walsh, D. (1998). Wide content individualism. Mind 107: 625–651 98: 189–214. White, S. (1991). The Unity of the Self. Cambridge, MA: MIT Shapiro, L. (1997). A clearer vision. Philosophy of Science 64: Press. 131–153. Woodfield, A., Ed. (1982). Thought and Object: Essays on Inten- Stich, S. (1978). Autonomous psychology and the belief-desire tionality. Oxford: Oxford University Press. thesis. Monist 61: 573–591. Stich, S. (1983). From Folk Psychology to Cognitive Science. Cam- Induction bridge, MA: MIT Press. van Gulick, R. (1989). Metaphysical arguments for internalism and why they don’t work. In S. Silvers, Ed., Rerepresentation. Dor- Induction is a kind of inference that introduces uncertainty, in drecht: Kluwer. contrast to DEDUCTIVE REASONING in which the truth of a Williams, G. (1966). Adaptation and Natural Selection. Princeton, conclusion follows necessarily from the truth of the premises. NJ: Princeton University Press. The term induction sometimes has a narrower meaning to Wilson, D. S., and E. Sober. (1994). Re-introducing group selec- describe a particular kind of inference to a generalization, for tion to the human behavioral sciences. Behavioral and Brain example from “All the cognitive scientists I’ve met are intel- Sciences 17: 585–608. ligent” to “All cognitive scientists are intelligent.” In the Wilson, R. A. (1994). Wide computationalism. Mind 103: 351– broader sense, induction includes all kinds of nondeductive 372. LEARNING, including concept formation, ANALOGY, and the Wilson, R. A. (1995). Cartesian psychology and physical minds: individualism and the sciences of the mind. New York: Cam- generation and acceptance of hypotheses (abduction). bridge University Press. The traditional philosophical problem of induction is Wilson, R. A. (1999). The mind beyond itself. In D. Sperber, Ed., whether inductive inference is legitimate. Because induction Metarepresentation. New York: Oxford University Press. 
involves uncertainty, it may introduce error: no matter how many intelligent cognitive scientists I have encountered, one Further Readings still might turn up who is not intelligent. In the eighteenth century, DAVID HUME asked how people could be justified in Adams, F., D. Drebushenko, G. Fuller, and R. Stecker. (1990). Nar- believing that the future will be like the past and concluded row content: Fodor’s folly. Mind and Language 5: 213–229 that they cannot: we use induction out of habit but with no Block, N. (1986). Advertisement for a semantics for psychology. In P. French, T. Uehling, Jr., and H. Wettsten, Eds., Midwest legitimate basis. Because no deductive justification of Studies in Philosophy, vol. 10: Philosophy of Mind. Minneapo- induction is available, and any inductive justification would lis: University of Minnesota Press. be circular, it seems that induction is a dubious source of Burge, T. (1988). Individualism and self-knowledge. Journal of knowledge. Rescher (1980) offers a pragmatic justification Philosophy 85: 649–663. of induction, arguing that it is the best available means for Burge, T. (1989). Individuation and causation in psychology. accomplishing our cognitive ends. Induction usually works, Pacific Philosophical Quarterly 70: 303–322. and even when it leads us into error, there is no method of Crane, T. (1991). All the difference in the world. Philosophical thinking that would work better. Quarterly 41: 1–25. In the 1950s, Nelson Goodman (1983) dissolved the tradi- Davies, M. (1991). Individualism and perceptual content. Mind tional problem of induction by pointing out that the validity 100: 461–484. 400 Inductive Logic Programming of deduction consists in conformity to valid deductive princi- eses about why this happened. Pick the best hypothesis, which might be done probabilistically or qualitatively by ples at the same time that deductive principles are evaluated considering which hypothesis is the best explanation. 
according to deductive practice. Justification is then just a Forming and evaluating explanatory hypotheses is a kind matter of finding a coherent fit between inferential practice of CAUSAL REASONING. Medical diagnosis is one kind of and inferential rules. Similarly, inductive inference does not hypothesis formation. need any general justification but is a matter of finding a set 4. Analogical inference. To solve a given target problem, of inductive principles that fit well with inductive practice look for a similar problem that can be adapted to infer a after a process of improving principles to fit with practice possible solution to the target problem. ANALOGY and and improving practice to fit with principles. Instead of the hypothesis formation are often particularly risky kinds of old problem of coming up with an absolute justification of induction, inasmuch as they both tend to involve sub- induction, we have a new problem of compiling a set of good stantial leaps beyond the information given and intro- duce much uncertainty; alternative analogies and inductive principles. hypotheses will always be possible. Nevertheless, these The task that philosophers of induction face is therefore risky kinds of induction are immensely valuable to very similar to the projects of researchers in psychology and everyday and scientific thought, because they can bring artificial intelligence who are concerned with learning in new creative insights that induction of rules and concepts humans and machines. Psychologists have investigated a from examples could never provide. wide array of inductive behavior, including rule learning in rats, category formation, formation of models of the social How much of human thinking is deductive and how much and physical worlds, generalization, learning inferential is inductive? No data are available to answer this question, rules, and analogy (Holland et al. 1986). 
AI researchers but if Harman (1986) is right that inference is always a mat- have developed computational models of many kinds of ter of coherence, then all inference is inductive. He points MACHINE LEARNING, including learning from examples and out that the deductive rule of modus ponens, from P and if P EXPLANATION-BASED LEARNING, which relies heavily on then Q to infer Q, does not tell us that we should infer Q background knowledge (Langley 1996; Mitchell 1997). from P and if P then Q; sometimes we should give up P or if Most philosophical work on induction, however, has P then Q instead, depending on how these beliefs and Q tended to ignore psychological and computational issues. cohere with our other beliefs. Many kinds of inductive infer- Following Carnap (1950), much research has been con- ence can be interpreted as maximizing coherence using par- cerned with applying and developing probability theory. For allel constraint satisfaction (Thagard and Verbeurgt 1998). example, Howson and Urbach (1989) use Bayesian proba- See also CATEGORIZATION; EPISTEMOLOGY AND COGNI- bility theory to describe and explain inductive inference in TION; JUSTIFICATION; LOGIC; PRODUCTION SYSTEMS science. Similarly, AI research has investigated inference in —Paul Thagard BAYESIAN NETWORKS (Pearl 1988). In contrast, Thagard (1992) offers a more psychologically oriented view of sci- References entific induction, viewing theory choice as a process of par- allel constraint satisfaction that can be modeled using Carnap, R. (1950). Logical Foundations of Probability. Chicago: connectionist networks. There is room for both psychologi- University of Chicago Press. Goodman, N. (1983). Fact, Fiction and Forecast. 4th ed. Indianap- cal and nonpsychological investigations of induction in phi- olis: Bobbs-Merrill. losophy and artificial intelligence: the former are concerned Harman, G. (1986). Change in View: Principles of Reasoning. 
with how people do induction, and the latter pursue the Cambridge, MA: MIT Press/Bradford Books. question of how probability theory and other mathematical Holland, J. H., K. J. Holyoak, R. E. Nisbett, and P. R. Thagard. methods can be used to perform differently and perhaps bet- (1986). Induction: Processes of Inference, Learning, and Dis- ter than people typically do. It is possible, however, that covery. Cambridge, MA: MIT Press/Bradford Books. psychologically motivated connectionist approaches to Howson, C., and P. Urbach. (1989). Scientific Reasoning: The learning may approximate to optimal reasoning. Bayesian Tradition. Lasalle, IL: Open Court. Here are some of the inductive tasks that need to be Langley, P. (1996). Elements of Machine Learning. San Francisco: understood from a combination of philosophical, psycho- Morgan Kaufmann. Mitchell, T. (1997). Machine Learning. New York: McGraw-Hill. logical, and computational perspectives: Pearl, J. (1988). Probabilistic Reasoning in Intelligent Systems. San Mateo: Morgan Kaufman. 1. Concept learning. Given a set of examples and a set of Rescher, N. (1980). Induction. Oxford: Blackwell. prior concepts, formulate new CONCEPTS that effectively Thagard, P. (1992). Conceptual Revolutions. Princeton: Princeton describe the examples. A student entering the university, University Press. for example, needs to form new concepts that describe Thagard, P., and K. Verbeurgt. (1998). Coherence as constraint sat- kinds of courses, professors, and students. isfaction. Cognitive Science 22: 1–24. 2. Rule learning. Given a set of examples and a set of prior rules, formulate new rules that improve problem solving. Inductive Logic Programming For example, a student might generalize that early morn- ing classes are hard to get to. According to some lin- guists, LANGUAGE ACQUISITION is essentially a matter of Inductive logic programming (ILP) is the area of computer learning rules. 
3. Hypothesis formation. Given a puzzling occurrence such as a friend’s not showing up for a date, generate hypoth-

science involved with the automatic synthesis and revision of logic programs from partial specifications. The word “inductive” is used in the sense of philosophical rather than mathematical induction. In his Posterior Analytics Aristotle introduced philosophical induction (in Greek epagogue) as the study of the derivation of general statements from specific instances (see INDUCTION). This can be contrasted with deduction, which involves the derivation of specific statements from more general ones. For instance, induction might involve the conjecture that (a) “all swans are white” from the observation (b) “all the swans in that pond are white,” where (b) can be derived deductively from (a). In the Principles of Science the nineteenth-century philosopher and economist William Stanley Jevons gave some simple demonstrations that inductive inference could be carried out by reversal of deductive rules of inference. The idea of reversing deduction has turned out to be one of the strong lines of investigation within ILP.

pendently chosen and associated with the label true or false depending on whether they are or are not examples of the chosen concept. From an inductive agent’s point of view, prior to receipt of any examples the probability that any particular hypothesis H will fit the data is p(H), the probability of H being chosen by the teacher. Likewise, p(E) denotes the probability that the teacher will provide the example sequence E, p(H | E) is the probability the hypothesis chosen was H given that the example sequence was E and p(E | H) is the probability that the example sequence was E given that the hypothesis chosen was H. According to Bayes’s theorem

$$p(H \mid E) = \frac{p(H)\,p(E \mid H)}{p(E)}.$$
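Bayes’s theorem as stated above can be illustrated numerically. The following is a minimal sketch under assumptions that are not in the original: three hypothetical hypotheses `h1`, `h2`, `h3` with made-up priors p(H) and made-up likelihoods p(E | H) for one fixed example sequence E. The function name `posterior` is likewise illustrative.

```python
from fractions import Fraction

# Hypothetical priors p(H) for three candidate hypotheses (assumed values).
priors = {"h1": Fraction(1, 2), "h2": Fraction(1, 4), "h3": Fraction(1, 4)}
# Hypothetical likelihoods p(E | H) of one fixed example sequence E (assumed values).
likelihoods = {"h1": Fraction(1, 10), "h2": Fraction(1, 2), "h3": Fraction(0)}


def posterior(priors, likelihoods):
    """Return p(H | E) for every hypothesis H using Bayes's theorem."""
    # p(E) is obtained by summing p(H) * p(E | H) over all hypotheses.
    p_e = sum(priors[h] * likelihoods[h] for h in priors)
    if p_e == 0:
        raise ValueError("p(E) is zero, so the posterior is undefined")
    return {h: priors[h] * likelihoods[h] / p_e for h in priors}


post = posterior(priors, likelihoods)
# The hypothesis with the largest posterior probability.
best = max(post, key=post.get)
```

With these assumed numbers the posterior favors `h2` even though `h1` has the larger prior, because E is much more probable under `h2`. A hypothesis that assigns E probability zero, such as `h3`, is ruled out entirely.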
The most likely hypothesis H given the example E is the one that maximizes p(H | E). In fact it is sufficient to maximize p(H) p(E | H) because p(E) is common to all candidate hypotheses. As with all Bayesian approaches the main issue is how to choose the inductive agent’s prior probabilities over hypotheses. In common with other forms of machine learning, this is usually done within ILP systems using a MINIMUM DESCRIPTION LENGTH (MDL) approach, that is, a probability distribution that assigns higher probabilities to textually simple hypotheses.

Within any computational framework such as ILP, a key question involves the efficiency with which the inductive agent converges on the “correct” solution. Here a number of ILP researchers have taken the approach of COMPUTATIONAL LEARNING THEORY, which studies the numbers of examples required for an inductive algorithm to return with

In ILP the general and specific statements are in both cases logic programs. A logic program is a set of Horn clauses (see LOGIC PROGRAMMING). Each Horn clause has the form Head ← Body. Thus the definite clause active(X) ← charged(X), polar(X) states that any X that is charged and polar is also active. In this case, active, charged, and polar are called “predicates,” and they are properties that are either true or false of X. Within deductive logic programming the inference rule of resolution is used to derive consequences from logic programs. According to resolution, given atom a and clauses C, D, from the clauses a ∨ C and D ∨ ¬a the clause C ∨ D can be concluded. A connected set of resolutions is called a proof or deductive derivation.

The inductive derivations in ILP are sometimes thought of as inversions of resolution-based proofs (see “inverse resolution” in Muggleton and de Raedt 1994).
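The resolution rule described above can be sketched for the propositional case. This is a minimal illustration rather than a fragment of any ILP system: clauses are modeled as frozensets of string literals, a leading `~` is an assumed notation for negation, and the function names `negate` and `resolve` are hypothetical.

```python
def negate(literal):
    """Return the complement of a propositional literal ('a' <-> '~a')."""
    return literal[1:] if literal.startswith("~") else "~" + literal


def resolve(clause1, clause2, atom):
    """Resolve two clauses on `atom`.

    clause1 must contain `atom` and clause2 must contain its negation.
    The resolvent is the union of the remaining literals, which is the
    clause C v D in the notation of the text.
    """
    if atom not in clause1 or negate(atom) not in clause2:
        raise ValueError("clauses do not clash on the given atom")
    return frozenset(clause1 - {atom}) | frozenset(clause2 - {negate(atom)})


# From (a v c) and (d v ~a) conclude (c v d).
resolvent = resolve(frozenset({"a", "c"}), frozenset({"d", "~a"}), "a")

# A ground instance of active <- charged, polar is the disjunction
# (active v ~charged v ~polar). Resolving it with the fact `charged`
# leaves (active v ~polar).
rule = frozenset({"active", "~charged", "~polar"})
step = resolve(frozenset({"charged"}), rule, "charged")
```

Chaining a second step with the fact `polar` would leave the single literal `active`, which is the kind of connected sequence of resolutions the text calls a deductive derivation.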
high probability an hypothesis which has a given degree of accuracy (this is Valiant’s Probably Approximately Correct [PAC] model). One interesting result of these investigations has been that though it has been shown that logic programs cannot be PAC-learned as a class, time-bound logic programs (those in which the derivation of instances are of bounded length) are efficiently learnable within certain Bayesian settings in which the probability of hypotheses decays rapidly (e.g., exponentially or with the inverse square) relative to their size.

One of the hard and still largely unsolved problems within ILP is that of predicate invention. This is the process by which predicates are added to the background knowledge in order to provide compact representations for foreground concepts. Thus suppose you were trying to induce a phrase structured grammar for English, and you already have background descriptions for noun, verb, and verb phrase, but no definition for a noun phrase. “Inventing” a

Following Plotkin’s work in the 1970s, the logical specification of an ILP problem is thought of as consisting of three primary components: B, the background knowledge (e.g., things cannot be completely black and completely white), E, the examples (e.g., the first swan on the pond is white) and H, the hypothesis (e.g., all swans are white). The primary relation between these three components is that the background together with the hypothesis should allow the derivation of the examples. This can be written in logical notation as follows:

$$B, H \vdash E$$

Given only such logical specifications, it has long been known that induction is not sound. That is to say, H is not a necessary conclusion from knowing B and E.

A sound treatment is possible by viewing inductive inference as the derivation of statements with associated probabilities.
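The requirement that the background knowledge together with the hypothesis should allow the derivation of the examples can be checked mechanically in simple cases. The sketch below assumes ground (variable-free) definite clauses and naive forward chaining. The molecule name `m1` and the helper names `consequences` and `covers` are hypothetical, and actual ILP systems work with first-order clauses rather than this simplified form.

```python
def consequences(clauses):
    """Forward-chain over ground definite clauses given as (head, body) pairs.

    A fact is a clause with an empty body. Returns every derivable atom.
    """
    derived = set()
    changed = True
    while changed:
        changed = False
        for head, body in clauses:
            if head not in derived and all(atom in derived for atom in body):
                derived.add(head)
                changed = True
    return derived


def covers(background, hypothesis, examples):
    """True if background plus hypothesis derives every example atom."""
    return set(examples) <= consequences(list(background) + list(hypothesis))


# Background B: ground facts about a hypothetical molecule m1.
background = [("charged(m1)", ()), ("polar(m1)", ())]
# Hypothesis H: a ground instance of active(X) <- charged(X), polar(X).
hypothesis = [("active(m1)", ("charged(m1)", "polar(m1)"))]
# Examples E.
examples = ["active(m1)"]

with_h = covers(background, hypothesis, examples)   # B and H together
without_h = covers(background, [], examples)        # B alone
```

Here the example is derivable only once the hypothesis is added to the background. The check says nothing about whether H is the only or the best hypothesis with this property, which is the sense in which induction is not sound.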
This approach was advocated by the philosopher definition for noun phrase would considerably simplify the Carnap in the 1950s and has been taken up more recently in overall hypothesized descriptions. However, the space of a revised form within ILP. The framework typically chosen possible new predicates that could be invented is clearly is that of Bayesian inference (see BAYESIAN LEARNING). large, and its topology not clearly understood. This is a probabilistic framework that allows calculations of ILP has found powerful applications in areas of scientific the probability of certain events’ happening given that other discovery in which the expressive power of logic programs is events have happened. Thus suppose we imagine that necessary for representing the concepts involved. Most nota- nature, or a teacher, randomly and independently chooses a bly, ILP techniques have been used to discover constraints on series of concepts to be learned, where each concept has an the molecular structures of certain biological molecules. associated probability of coming up. Suppose also that for These semiautomated scientific “discoveries” include a new each concept a series of instances are randomly and inde- 402 Infant Cognition structural alert for mutagenesis and the suggestion of a new ones: By what processes does knowledge grow, how does it binding site for an HIV protease inhibitor. ILP techniques change, and how variable are its developmental paths and have also been demonstrated capable of building large gram- endpoints? mars automatically from example sentences. Studies of cognition in infancy have long been viewed as The philosopher of science Gillies (1996) has made a a potential source of answers to these questions, but they careful comparison of techniques used in ILP with Bacon face a problem: How does one find out what infants know? and Popper’s conception of scientific induction. 
Gillies Before the twentieth century, most studies of human knowl- concludes that ILP techniques combine elements from edge depended either on the ability to reflect on one’s Bacon’s “pure” knowledge-free notion of induction and knowledge or on the ability to follow instructions and per- Popper’s falsificationist approach. ILP has helped clarify a form simple tasks with focused attention. Infants obviously number of issues in the theory, implementation, and appli- are unfit subjects for these studies, and so questions about cation of inductive inference within a computational logic their knowledge were deemed unanswerable by scientists as framework. different as HERMAN VON HELMHOLTZ and Edward Brad- ford Titchener. See also DEDUCTIVE REASONING; PROBABILITY, FOUN- The study of cognition in infancy nevertheless began in DATIONS OF earnest in the 1920s, with the research of JEAN PIAGET. —Stephen Muggleton Piaget observed his own infants’ spontaneous, naturally occurring actions under systematically varying conditions. References and Further Readings Some of his most famous observations centered on infants’ reaching for objects under different conditions of Bratko, I., and S. Muggleton. (1995). Applications of Inductive visibility and accessibility. Infants under nine months of Logic Programming. Communications of the ACM 38(11): 65– age, who showed intense interest in visible objects, either 70. Carnap, R. (1952). The Continuum of Inductive Methods. Chicago: failed to reach for objects or reached to inappropriate University of Chicago Press. locations when the objects were occluded. Search failures Gillies, D. A. (1996). Artificial Intelligence and Scientific Method. and errors declined with age, a change that Piaget attrib- Oxford: Oxford University Press. uted to the emergence of abilities to represent objects as Jevons, W. S. (1874). The Principles of Science: a Treatise on enduring, mechanical bodies. He proposed a domain-general, Logic and Scientific Method. 
London: Macmillan. Republished constructivist theory of cognitive development, according in 1986 by IBIS publishing. to which the development of object representations was King R., S. Muggleton, A. Srinivasan, and M. Sternberg. (1996). just one manifestation of a more general change in cogni- Structure-activity relationships derived by machine learning: tive functioning over the period from birth to eighteen the use of atoms and their bond connectives to predict muta- months. genicity by inductive logic programming. Proceedings of the National Academy of Sciences 93: 438–442. Later investigators have confirmed Piaget’s central Lavrac, N., and S. Dzeroski. (1994). Inductive Logic Program- observations but questioned his conclusions. Studies of ming. Ellis Horwood. motor development suggest that developmental changes in Muggleton, S. (1995). Inverse entailment and Progol. New Genera- infants’ search for hidden objects stem in part from devel- tion Computing Journal 13: 245–286. oping abilities to reach around obstacles, manipulate two Muggleton, S., and L. de Raedt. (1994). Inductive Logic Program- objects in relation to one another, and inhibit prepotent ming: theory and methods. Journal of Logic Programming actions. When these abilities are not required (for exam- 19(20): 629–679. ple, when infants are presented with an object that is Plotkin, G. (1971). A further note on inductive generalisation. In obscured by darkness rather than by occlusion), successful Machine Intelligence 6. Edinburgh: Edinburgh University search occurs at younger ages. The causes of developmen- Press. tal changes in search are still disputed, however, with Sternberg, M., R. King, R. Lewis, and S. Muggleton. (1994). different accounts emphasizing changes in action, ATTEN- Application of machine learning to structural molecular biol- ogy. Philosophical Transactions of the Royal Society B 344: TION, MEMORY, object representations, and physical 365–371. knowledge. Zelle, J. M., and R. J. 
Mooney. (1996). Comparative results on Recent studies of cognition in infancy have tended to using inductive logic programming for corpus-based parser focus on early-developing actions such as kicking, sucking, construction. In Connectionist, Statistical and Symbolic and looking. Experiments have shown that even newborn Approaches to Learning for Natural Language Processing. infants learn to modify their actions so as to produce or Berlin: Springer, pp. 355–369. change a perceived event: for example, babies will suck on a pacifier with increased frequency or pressure if the action is followed by a sound, and they will suck harder or longer for Infant Cognition some sounds than for others. Studies using this method pro- vide evidence that newborn infants recognize their parents’ Questions about the origins and development of human voices (they suck harder to hear the voice of their mother knowledge have been posed for millennia. What do new- than the voice of a different woman) and their native lan- born infants know about their new surroundings, and what guage (they suck harder to produce speech in their own do they learn as they observe events, play with objects, or community’s language). Both abilities likely depend on interact with people? Behind these questions are deeper auditory perception and learning before birth. Infant Cognition 403 A variant of this method is based on the finding that other objects) than to the inertial properties of object infants’ sucking declines over time when followed by the motions (they fail to represent objects as moving at constant same sound and then increases if the sound changes. This or smoothly changing velocities in the absence of obsta- pattern is the basis of studies of infants’ auditory discrimi- cles). 
Very young infants also have been shown to detect nation, CATEGORIZATION, and memory, and it reveals and discriminate different numbers of objects in visible and remarkably acute capacities for SPEECH PERCEPTION in the partly occluded displays when numbers are small or numer- ical differences are large. With large set sizes and small dif- first days of life. Indeed, young infants are more sensitive ferences, in contrast, infants fail to respond reliably to than adults to speech contrasts outside their native language. number. Studies of cognition in infancy are most revealing Studies of older infants, using similar procedures and a where they show contrasting patterns of success and failure, headturn response, reveal abilities to recognize the sounds as in these examples, because the patterns provide insight of individual words and predictable sequences of syllables into the nature of the cognitive systems underlying their per- well before speech begins. The relation between these early- formance. developing abilities and later LANGUAGE ACQUISITION is an Where infants have shown visual preferences for events open question guiding much current research. that adults judge to be unnatural, controversy has arisen Since the middle of the twentieth century, many studies concerning the interpretation of infants’ looking patterns. of cognition in infancy have used some aspect of visual For example, Baillargeon (1993) has proposed that the pat- attention as a window on the development of knowledge. terns provide evidence for early-developing, explicit knowl- Even newborn infants show systematic differences in look- edge of objects; Karmiloff-Smith (1992) has proposed that ing time to different displays, preferring patterned to homo- the patterns provide evidence for an initial system of object geneous pictures, moving to stationary objects, and familiar representation not unlike early-developing perceptual sys- people to strangers. 
Like sucking, looking time declines tems; and Haith (Haith and Bensen 1998) has proposed that when a single display is repeated and increases when a new preferential looking to unnatural events depends on sensory display appears. Both intrinsic preferences and preferences or motor systems attuned to subtle, superficial properties of for novelty provide investigators with measures of detection the events. These contrasting possibilities animate current and discrimination not unlike those used by traditional psy- research. chophysicists. They have produced quite a rich body of Alongside these studies is a rich tradition of research on knowledge about early PERCEPTUAL DEVELOPMENT. Investi- infants’ social development, providing further insight into gators now know, for example, that one-week-old infants their cognitive capacities. Newborn infants attend to human perceive depth and the constant sizes of objects over varying faces, recognize familiar people, and even imitate some distances, that two-month-old infants have begun to per- facial gestures and expressions in a rudimentary way. By six ceive the stability of objects over self motion and conse- months, infants follow people’s gaze and attend to objects quent image displacements in the visual field, and that on which people have acted. By nine months, infants repro- three-month-old infants perceive both similarities among duce other people’s actions on objects, and they communi- animals within a single species and differences across dif- cate about objects with gestures such as pointing. These ferent species. patterns suggest that infants have considerable abilities to Perhaps the most intriguing, and controversial, studies learn from other people, and they testify to early-developing using preferential looking methods have focused on more knowledge about human action. Studies probing the nature central aspects of cognitive development in infancy. 
Return- of this knowledge, using methods parallel to the preferential ing to knowledge of objects, experiments have shown that looking methods just described, reveal interesting differ- infants as young as three months look systematically longer ences. Whereas infants represent inanimate object motions at certain events that adults find unnatural or unexpected, as initiated on contact, they represent human actions as relative to superficially similar events that adults find natu- directed to goals; whereas continuity of motion provides the ral. In one series of studies, for example, three-month-old strongest information for object identity, constancy of prop- infants viewed an object that was initially fully visible on a erties such as facial features provides stronger information horizontal surface, an opaque screen in front of the object for personal identity. Evidence for these differences has rotated upward and occluded the object, and then the screen been obtained only recently and much remains to be either stopped at the location of the object (expected for learned, but research already suggests that distinct systems adults) or rotated a full half turn, passing through the space of knowledge underlie infants’ reasoning about persons and that the object had occupied (unexpected). Infants looked inanimate objects. longer at the latter event, despite the absence of any intrinsic In sum, the descriptive enterprise of characterizing preference for the longer rotation. Infants’ looking patterns infants’ developing knowledge is well under way, both in the suggested that they represented the object’s continuous preceding domains and in others not mentioned. In contrast, existence, stable location, and solidity, and that they reacted the deeper and more important enterprise of explaining early with interest or surprise when these properties were vio- cognitive development has hardly begun. Most investigators lated. 
agree that knowledge is organized into domain-specific sys- In further investigations of these abilities, the limits of tems at a very early age, but they differ in their characteriza- early-developing object knowledge have been explored. tions of those systems and their explanations for each Thus, four-month-old infants have been found to be more system’s emergence and growth. Elman et al. (1996) suggest sensitive to the contact relations among object motions that infants are endowed with a collection of connectionist (they represent objects as initiating motion on contact with 404 Inference learning systems whose differing architectures and process- Spelke, E. S., and E. L. Newport. (Forthcoming). Nativism, empir- icism, and the development of knowledge. In R. Lerner and W. ing characteristics predispose them to treat information from Damon, Eds., Handbook of Child Psychology, vol. 1: Theoreti- different domains. Spelke and others suggest that infants are cal Models of Human Development. 5th ed. New York: Wiley. endowed with systems of core knowledge that remain cen- Thelen, E., and L. B. Smith. (1994). A Dynamical Systems tral to humans as adults. Carey (1991) proposes that infants Approach to the Development of Cognition and Action. Cam- are endowed with modular systems for processing percep- bridge, MA: Bradford Books/MIT Press. tual information, but CONCEPTUAL CHANGE occurs as these Wynn, K. (1995). Infants possess a system of numerical knowl- systems are partly superceded over development by more edge. Current Directions in Psychological Science 4: 172–176. central systems of representation. As research on infants’ learning, knowledge, and perception progresses, these views Inference and others will become more amenable to empirical test. References to the experiments discussed earlier can be See found in Bertenthal (1996), Haith and Bensen (1998), Man- DEDUCTIVE REASONING; INDUCTION; LOGIC; LOGICAL dler (1998), and Spelke and Newport (1998). 
Discussions of REASONING SYSTEMS infant cognition from diverse theoretical perspectives are Influence Diagrams listed in the references. See also COGNITIVE DEVELOPMENT; INTERSUBJECTIVITY; NAIVE PHYSICS; NATIVISM; PHONOLOGY, ACQUISITION OF See BAYESIAN NETWORKS —Elizabeth S. Spelke Information Processing References See INTRODUCTION: COMPUTATIONAL INTELLIGENCE; IN- Baillargeon, R. (1993). The object concept revisited: New direc- TRODUCTION: PSYCHOLOGY tions in the study of infants’ physical knowledge. In C. Granrud, Ed., Perception and Cognition in Infancy. Hillsdale, Information Theory NJ: Erlbaum. Bertenthal, B. I. (1996). Origins and early development of percep- tion, action, and representation. Annual Review of Psychology Information theory is a branch of mathematics that deals 47: 431–459. with measures of information and their application to the Carey, S. (1991). Knowledge acquisition: enrichment or concep- tual change? In S. Carey and R. Gelman, Eds., Epigenesis of study of communication, statistics, and complexity. It origi- Mind: Essays on Biology and Cognition. Hillsdale, NJ: nally arose out of communication theory and is sometimes Erlbaum, pp. 1257–1291. used to mean the mathematical theory that underlies com- Elman, J., E. Bates, M. H. Johnson, A. Karmiloff-Smith, D. Parisi, munication systems. Based on the pioneering work of and K. Plunkett. (1996). Rethinking Innateness: A Connection- Claude Shannon (1948), information theory establishes the ist Perspective on Cognitive Development. Cambridge, MA: limits to the shortest description of an information source MIT Press. and the limits to the rate at which information can be sent Haith, M. M., and J. B. Bensen. (1998). Infant cognition. In D. over a communication channel. The results of information Kuhn and R. Siegler, Eds., Handbook of Child Psychology, vol. theory are in terms of fundamental quantities like entropy, 3: Cognition, Perception, and Language. 5th ed. 
New York: Wiley.
Jusczyk, P. (1997). The Discovery of Spoken Language. Cambridge, MA: MIT Press.
Karmiloff-Smith, A. (1992). Beyond Modularity: A Developmental Perspective on Cognitive Science. Cambridge, MA: MIT Press.
Kellman, P. J., and M. Arterberry. (Forthcoming). The Cradle of Knowledge: Development of Perception in Infancy. Cambridge, MA: MIT Press.
Leslie, A. M. (1988). The necessity of illusion: perception and thought in infancy. In L. Weiskrantz, Ed., Thought Without Language. Oxford: Clarendon Press.
Mandler, J. M. (1998). Representation. In D. Kuhn and R. Siegler, Eds., Handbook of Child Psychology, vol. 3: Cognition, Perception, and Language. 5th ed. New York: Wiley.
Munakata, Y., J. L. McClelland, M. H. Johnson, and R. S. Siegler. (1997). Rethinking infant knowledge: toward an adaptive process account of successes and failures in object permanence tasks. Psychological Review 104: 686–713.
Piaget, J. (1954). The Construction of Reality in the Child. New York: Basic Books.
Spelke, E. S., K. Breinlinger, J. Macomber, and K. Jacobson. (1992). Origins of knowledge. Psychological Review 99: 605–632.

Informational Semantics

Informational semantics is an attempt to ground meaning (as this is understood in the study of both language and mind) in an objective, mind (and language) independent, notion of information. This effort is often part of a larger effort to naturalize INTENTIONALITY and thereby exhibit semantic (and, more generally, mental) phenomena as an aspect of our more familiar (at least better understood)

relative entropy, and mutual information, which are defined using a probabilistic model for a communication system. These quantities have also found application to a number of other areas, including statistics, computer science, complexity and economics. In this article, we will describe these basic quantities and some of their applications. Terms like information and entropy are richly evocative with multiple meanings in everyday usage; information theory captures only some of the many facets of the notion of information. Strictly speaking, information theory is a branch of mathematics, and care should be taken in applying its concepts and tools to other areas.

Information theory relies on the theory of PROBABILITY to model information sources and communication channels. A source of information produces a message out of a set of possible messages. The difficulty of communication or storage of the message depends only on the length of the representation of the message and can be isolated from the meaning of the message. If there is only one possible message, then no information is transmitted by sending that message. The amount of information obtained from a message is related to its UNCERTAINTY: if something is very likely, not much information is obtained when it occurs. The simplest case occurs when there are two equally likely messages: the messages can then be represented by the symbols 0 and 1, and the amount of information transmitted by such a message is one bit. If there are four equally likely messages, the messages can be represented by 00, 01, 10, and 11, and each message thus requires two bits for its representation. For equally likely messages, the number of bits required grows logarithmically with the number of possibilities. When the messages are not equally likely, we can model the message source by a random variable $X$, which takes on values $1, 2, \ldots$ with probabilities $p_1, p_2, \ldots$, with associated entropy defined as

\[ H(X) = -\sum_i p_i \log_2 p_i \quad \text{bits.} \]

The entropy is a measure of the average uncertainty of the random variable. It can be shown (Cover and Thomas 1991) that the entropy of a random variable is a lower bound to the average length of any uniquely decodable representation of the random variable. For a uniformly distributed random variable taking on $2^k$ possible values, the entropy of the random variable is $k$ bits, and it is easy to see how an outcome could be represented by a $k$ bit number. When the random variable is not uniformly distributed, it is possible to get a lower average length description by using fewer bits to describe the most frequently occurring outcomes, and more bits to describe the less frequent outcomes. For example, if a random variable takes on the values A, B, C, and D with probabilities 0.5, 0.25, 0.125, and 0.125 respectively, then we can use a code 0, 10, 110, 111 to represent these four outcomes with an average length of 1.75 bits, which is less than the two bits required for the equal length code. Note that in this example, the average codeword length is equal to the entropy. In general, the entropy is a lower bound on the average length of any uniquely decodable code, and there exists a code that achieves an average length that is within one bit of the entropy.

Traditional information theory was developed using probabilistic models, but it was extended to arbitrary strings using notions of program length by the work of Kolmogorov (1965), Chaitin (1966), and Solomonoff (1964). Suppose one has to send a billion bits of π to a person on the moon. Instead of sending the raw bits, one could instead send a program to calculate π and let the receiver reconstruct the bits. Because the length of the program would be much shorter than the raw bits, we compress the string using this approach. This motivates the following definition for Kolmogorov complexity or algorithmic complexity: The Kolmogorov complexity $K_U(x)$ is the length of the shortest program for a universal TURING machine $U$ (see AUTOMATA) that prints out the string $x$ and halts. Using the fact that any universal Turing machine can simulate any other universal Turing machine using a fixed length simulation program, it is easy to see that the Kolmogorov complexity with respect to two different Turing machines differs by at most a constant (the length of the simulation program). Thus the notion of Kolmogorov complexity is universal up to a constant. For random strings, it can be shown that with high probability, the Kolmogorov complexity is equivalent to the entropy rate of the random process. However, due to the halting problem, it is not always possible to discover the shortest program for a string, and thus it is not always possible to determine the Kolmogorov complexity of a string. But Kolmogorov or algorithmic complexity provides a natural way to think about complexity and data compression and has been developed into a rich and deep theory (Li and Vitanyi 1992) with many applications (see MINIMUM DESCRIPTION LENGTH and COMPUTATIONAL COMPLEXITY).

Transmission of information over a communication channel is subject to noise and interference from other senders. A fundamental concept of information theory is the notion of channel capacity, which plays a role very similar to the capacity of a pipe carrying water: information is like an incompressible fluid that can be sent reliably at any rate below capacity. It is not possible to send information reliably at a rate above capacity. A communication channel is described by a probability transition function $p(y \mid x)$, which models the probability of a particular output message $y$ when input signal $x$ is sent. The capacity $C$ of the channel can then be calculated as

\[ C = \max_{p(x)} \sum_x \sum_y p(x)\, p(y \mid x) \log \frac{p(x)\, p(y \mid x)}{p(x) \sum_{x'} p(x')\, p(y \mid x')} \tag{4} \]

A key result of information theory is that if the entropy rate of a source is less than the capacity of a channel, then the source can be reproduced at the output of the channel with negligible error, whereas if the entropy rate is greater than the capacity, the error cannot be made arbitrarily small. This result, due to Shannon (1948), created a sensation when it first appeared. Before Shannon, communication engineers believed that the only way to increase reliability in the presence of noise was to reduce the rate by repeating messages, and that to get arbitrarily high reliability one would need to send at a vanishingly small rate. Shannon proved that this was not necessary; at any rate below the capacity of the channel, it is possible to send information with arbitrary reliability by appropriate coding and decoding. The methods used by Shannon were not explicit, but subsequent research has developed practical codes that allow communication at rates close to capacity. These codes are used in combating errors and noise in most current digital systems, for example, in CD players and modems.

Another fundamental quantity used in information theory and in statistics is the relative entropy or Kullback-Leibler distance $D(p \| q)$ between probability mass functions $p$ and $q$, which is defined as

\[ D(p \| q) = \sum_i p_i \log \frac{p_i}{q_i} \tag{5} \]

The relative entropy is a measure of the difference between two distributions. It is always nonnegative, and is zero if and only if the two distributions are the same. However, it is not a true distance measure, because it is not symmetric. The relative entropy plays a key role in large deviation theory, where it is the exponent in the probability that data drawn according to one distribution looks like data from the other distribution. Thus if an experimenter observes $n$ samples of data and wants to decide if the data that he has observed is drawn from the distribution $p$ or the distribution $q$, then the probability that he will think that the distribution is $p$ when the data is actually drawn from $q$ is approximately $2^{-nD(p \| q)}$.

We have defined entropy and relative entropy for single discrete random variables.
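As a rough illustration of these two definitions, the following Python sketch computes the entropy and the average codeword length for the four-outcome example in the text (probabilities 0.5, 0.25, 0.125, 0.125 with the code 0, 10, 110, 111), and then the relative entropy in both directions. The function names and the comparison distribution `q` are illustrative assumptions and do not come from the article; logarithms are taken to base 2 so that all results are in bits.

```python
import math


def entropy_bits(p):
    """Entropy H(X) = -sum_i p_i log2 p_i in bits; terms with p_i = 0 contribute 0."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)


def average_code_length(p, codewords):
    """Expected number of code symbols per outcome for the given codewords."""
    return sum(pi * len(word) for pi, word in zip(p, codewords))


def relative_entropy_bits(p, q):
    """Relative entropy D(p || q) = sum_i p_i log2(p_i / q_i) in bits."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue          # 0 * log(0 / q_i) is taken to be 0
        if qi == 0:
            return math.inf   # p puts mass on an outcome that q rules out
        total += pi * math.log2(pi / qi)
    return total


# Distribution and code from the text (outcomes A, B, C, D).
p = [0.5, 0.25, 0.125, 0.125]
code = ["0", "10", "110", "111"]

# Hypothetical comparison distribution (an assumption, not from the text).
q = [0.7, 0.1, 0.1, 0.1]

h = entropy_bits(p)
avg_len = average_code_length(p, code)
d_pq = relative_entropy_bits(p, q)
d_qp = relative_entropy_bits(q, p)

print(f"H(p) = {h} bits, average codeword length = {avg_len} bits")
print(f"D(p||q) = {d_pq:.4f} bits, D(q||p) = {d_qp:.4f} bits")

# Approximate chance of mistaking n samples drawn from q for samples from p,
# following the 2^(-n D(p||q)) expression in the text.
n = 20
print(f"2^(-n D(p||q)) for n = {n}: {2 ** (-n * d_pq):.3e}")
```

For this example the entropy and the average codeword length both come out to 1.75 bits, in line with the remark that the code meets the entropy bound. The two relative entropies differ, which is one way to see why relative entropy is not a true distance.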
The definitions can be extended to continuous random variables and random processes as well. Because a real number cannot be represented fully with a finite number of bits, a branch of information theory called rate distortion theory characterizes the tradeoff between the accuracy of the representation (the distortion) and the length of the description (the rate). The theory of communication has also been extended to networks with multiple senders and receivers and to information recording (which can be considered as information transmission over time as opposed to information transmission over space).

Information theory has had a profound impact on the technology of the information age. The fact that most information now is stored and transmitted in digital form can be considered a result of one of the fundamental insights of the theory, that the problem of data compression can be separated from the problem of optimal transmission over a communication channel without any loss in achievable performance. Over the half century since Shannon’s original work, complex source coding and channel coding schemes have been developed that have come close to fulfilling the bounds that are derived in the theory. These fundamental bounds on description length and communication rates apply to all communication systems, including complex ones like the nervous system or the brain (Arbib 1995). The quantities defined by information theory have also found application in many other fields, including statistics, computer science, and economics.

See also ALGORITHM; COMPUTATION; ECONOMICS AND COGNITIVE SCIENCE; INFORMATIONAL SEMANTICS; LANGUAGE AND COMMUNICATION; WIENER, NORBERT

Thomas M. Cover and Joy A. Thomas

References

Arbib, M. A. (1995). The Handbook of Brain Theory and Neural Networks. Cambridge, MA: Bradford Books/MIT Press.
Chaitin, G. J. (1966). On the length of programs for computing binary sequences. J. Assoc. Comp. Mach. 13: 547–569.
Cover, T. M., and J. A. Thomas. (1991). Elements of Information Theory. New York: Wiley.
Kolmogorov, A. N. (1965). Three approaches to the quantitative definition of information. Problems of Information Transmission 1: 4–7.
Li, M., and P. Vitanyi. (1992). Introduction to Kolmogorov Complexity and its Applications. New York: Springer.
Shannon, C. E. (1948). A mathematical theory of communication. Bell Sys. Tech. Journal 27: 379–423, 623–656.
Solomonoff, R. J. (1964). A formal theory of inductive inference. Information and Control 7: 1–22, 224–254.

material world.

Informational semantics locates the primary source of meaning in symbol-world relations (the symbols in question can occur either in the language of thought or in a public language). The symbol-world relations are sometimes described in information-theoretic terms (source, receiver, signal, etc.) and sometimes in more general causal terms. In either case, the resulting semantics is to be contrasted with conceptual role (also called procedural) semantics, which locates meaning in the relations symbols have to one another (or, more broadly, the way they are related to one another, sensory input, and motor output). Because on some interpretations of information, the information a signal carries is what it indicates about a source, informational semantics is sometimes referred to as indicator semantics. The concept of information involved is inspired by, but is only distantly related to, the statistical construct in INFORMATION THEORY (Dretske 1981).

The word “meaning” is multiply ambiguous. Two of its possible meanings (Grice 1989) are: (1) nonnatural meaning, the sense in which the word “fire” stands for or means fire; and (2) natural meaning, the way in which smoke means (is a sign of, indicates) fire. Nonnatural meaning has no necessary connection with truth: “Jim has the measles” means that Jim has the measles whether or not he has the measles. Natural meaning, on the other hand, requires the existence of the condition meant: if Jim doesn’t have the measles, the red spots on his face do not mean (indicate) that he has the measles. Perhaps all they mean is that he has been eating too much candy. Natural meaning, what one event indicates about another, is taken to be a relation between sign and signified that does not depend on anyone recognizing or identifying what is meant. Tracks in the snow can mean there are deer in the woods even if no one identifies them that way, even if they do not mean that to anyone.

Information, as this is used in informational semantics, is akin to natural meaning. It is an objective (mind-independent) relation between a sign or signal (tracks in the snow, for instance) and what that sign or signal indicates (deer in the woods). The information a signal carries about a source is what that signal indicates (means in a natural way) about that source. Informational semantics, then, is an effort to understand nonnatural meaning (the kind of meaning characteristic of thought and language) as arising out of and having its source in natural meaning. The word “meaning” will hereafter be used to refer to nonnatural meaning; “information” and “indication” will be reserved for natural meaning.

Informational semantics takes the primary home of meaning to be in the mind, as the meaning or content of a thought or intention. Sounds and marks of natural language derive their meaning from the communicative intentions of the agents who use them. As a result, the information of primary importance to informational semantics is that occurring in the brains of conscious agents. Thus, for informational semantics, the very existence of thought and, thus, the possibility of language depends on the capacity of systems to transform information (normally supplied by perception) into meaning.

Not all information-processing systems have this capacity. Artifacts (measuring instruments and computers) do not. To achieve this conversion, two things are required. First, because meaning is fine grained (even though 3 is ∛27, thinking or saying that x = 3 is not the same as thinking or saying that x = ∛27) and information is coarse grained (a signal that carries the information that x = 3 necessarily carries the information that x = ∛27), a theory of meaning must specify how coarse grained information is converted into fine grained meaning. Which of the many pieces of information an event (normally) carries is to be identified as its meaning? Second, in order to account for the fact that something (e.g., a thought) can mean (have the content) that x = 3 when x ≠ 3, a way must be found to “detach” information from the events that normally carry it so that something can mean that x = 3 when it does not carry this information (because x ≠ 3).

One of the strategies used by some (e.g., Dretske 1981, 1986; Stampe 1977, 1986) to achieve these results is to identify meaning with the environmental condition with which a state is, or is supposed to be, correlated. For instance, the meaning of a state might be the condition about which it is supposed to carry information where the “supposed to” is understood in terms of the state’s teleofunction. Others, for example Fodor (1990), reject teleology altogether and identify meaning with the sort of causal antecedents of an event on which other causes of that event depend. Still others, for example Millikan (1984), embrace the teleology but reject the idea that the relevant functions are informational. For Millikan a state can mean M without having the function of carrying this information.

By combining teleology with information, informational semantics holds out the promise of satisfying the desiderata described in the last paragraph. Just as the pointer reading on a measuring instrument (a speedometer, for example) can misrepresent the speed of the car because there is something (viz., the speed of the car) it is supposed to indicate that it can fail to indicate, so various events in the brain can misrepresent the state of the world by failing to carry information it is their function to carry. In the case of the nervous system, of course, the information-carrying functions are not (as with artifacts) assigned by designers or users. They come, in the first instance, from a specific evolutionary (selectional) history (the same place the heart and kidneys get their function) and, in the second, from certain forms of learning. Not only do information-carrying functions give perceptual organs and the central nervous system the capacity to misrepresent the world (thus solving the second of the above two problems), they also help solve the grain problem. Of the many things the heart does, only one, pumping blood, is its (biological) function. So too, perhaps, an internal state has the function of indicating only one of the many things it carries information about. Only one piece of information is it supposed to carry. According to informational semantics, this would be its meaning.

Informational semantics, as well as any other theory of meaning, has the problem of saying of what relevance the meaning of internal states is to the behavior of the systems in which it occurs. Of what relevance is meaning to a science of intentional systems? Is not the behavior of systems completely explained by the nonsemantic (e.g., neurobiological or, in the case of computers, electrical and mechanical) properties of internal events? This question is sometimes put by asking whether, in addition to syntactic engines, there are (or could be) semantic engines. The difficulty of trying to find an explanatory role for meaning in the behavior of intentional (i.e., semantic) systems has led some to abandon meaning (and with it the mind) as a legitimate scientific construct (ELIMINATIVE MATERIALISM), others to regard meaning as legitimate in only an instrumental sense (Dennett 1987), and still others (e.g., Burge 1989; Davidson 1980; Dretske 1988; Fodor 1987; Kim 1996) to propose indirect (but nonetheless quite real) ways meaning figures in the explanation of behavior.

See also MEANING; MENTAL REPRESENTATION

Fred Dretske

References

Burge, T. (1989). Individuation and causation in psychology. Pacific Philosophical Quarterly 70: 303–322.
Davidson, D. (1980). Essays on Actions and Events. Oxford: Oxford University Press.
Dennett, D. (1987). The Intentional Stance. Cambridge, MA: MIT Press.
Dretske, F. (1981). Knowledge and the Flow of Information. Cambridge, MA: MIT Press.
Dretske, F. (1986). Misrepresentation. In R. Bogdan, Ed., Belief. Oxford: Oxford University Press, pp. 17–36.
Dretske, F. (1988). Explaining Behavior. Cambridge, MA: MIT Press.
Fodor, J. (1987). Psychosemantics: The Problem of Meaning in the Philosophy of Mind. Cambridge, MA: MIT Press.
Fodor, J. (1990). A Theory of Content and Other Essays. Cambridge, MA: MIT Press.
Grice, P. (1989). Studies in the Way of Words. Cambridge, MA: Harvard University Press.
Kim, J. (1996). Philosophy of Mind. Boulder, CO: Westview Press.
Millikan, R. (1984). Language, Thought, and Other Biological Categories. Cambridge, MA: MIT Press.
Stampe, D. (1977). Towards a causal theory of linguistic representation. In P. French, T. Uehling, and H. Wettstein, Eds., Midwest Studies in Philosophy 2. Minneapolis: University of Minnesota Press, pp. 42–63.
Stampe, D. (1986). Verificationism and a causal account of meaning. Synthèse 69: 107–137.

Further Readings

Barwise, J., and J. Perry. (1983). Situations and Attitudes. Cambridge, MA: MIT Press.
Block, N. (1986). Advertisement for a semantics for psychology. In Midwest Studies in Philosophy, vol. 10. Minneapolis: University of Minnesota Press, pp. 615–678.
Fodor, J. (1984). Semantics, Wisconsin style. Synthèse 59: 231–250.
Israel, D., and J. Perry. (1990). What is information? In P. Hanson, Ed., Information, Language, and Cognition. Vancouver: University of British Columbia Press, pp. 1–19.
Lepore, E., and B. Loewer. (1987). Dual aspect semantics. In E. Lepore, Ed., New Directions in Semantics. London: Academic Press.
Papineau, D. (1987). Reality and Representation. Oxford: Blackwell.

Inheritance

See FRAME-BASED SYSTEMS; LOGICAL REASONING SYSTEMS

Innateness of Language

Although the idea has a large philosophical tradition (especially in the work of the Continental rationalists; see RATIONALISM VS. EMPIRICISM), modern ideas concerning the innateness of language originated in the work of Chomsky (1965 etc.) and the concomitant development of GENERATIVE GRAMMAR. Chomsky’s hypothesis is that many aspects of the formal structure of language are encoded in the genome. The hypothesis then becomes an empirical hypothesis, to be accepted or validated according to standard empirical methods.

As with any other hypothesis in the natural sciences, the innateness hypothesis (that there exist genetically specified aspects of language) has to be evaluated alongside competing hypotheses. Clearly, the competing hypothesis is that there are no genetically specified aspects of language. If one accepts that any genetically specified aspects of language exist, then there is no more debate about a general innateness hypothesis, but only a debate about exactly which aspects of language are innate. This debate, in fact, is central to current research in linguistics and PSYCHOLINGUISTICS.

There are many arguments for the innateness hypothesis. But the most significant one in Chomsky’s writings, and the one that has most affected the field, is the argument from the POVERTY OF THE STIMULUS (APS; see also Wexler 1991). As Chomsky points out, this argument in the study of language is a modern version of DESCARTES’s argument concerning human knowledge of CONCEPTS. The basic thrust of the argument goes as follows:

(1) Human language has the following complex form: G (for Grammar)
(2) The nature of the information about G available to the learner is the following: I (for Input/Information, called Primary Linguistic Data in Chomsky 1965).
(3) No learner could take the information in I and transform it into G.

In other words, the argument from the poverty of the stimulus is that the information in the environment is not rich enough to allow a human learner to attain adult competence.

The arguments to support APS in linguistic theory usually involve linguistic structures that do not seem to be encoded in environmental events. Thus (4a,b) seem to have identical surface structures, yet in (4a) Mary is the subject of please (she will do the pleasing) and in (4b) Mary is the object of please (she will be pleased). How will a learner learn this, inasmuch as it seems that the information is not directly provided to the learner in the surface form of the sentence?

(4) a. Mary is eager to please.
    b. Mary is easy to please.

The field of learnability theory (Wexler and Hamburger 1973; Wexler and Culicover 1980) developed as an attempt to provide mathematical preciseness to the APS, and to derive exact consequences. Learnability theory provides an exact characterization of the class of possible grammars, the nature of the input information, and the learning mechanisms, and asks whether the learning mechanism can in fact learn any possible grammar. The basic results of the field include the formal, mathematical demonstration that without serious constraints on the nature of human grammar, no possible learning mechanism can in fact learn the class of human grammars. In some cases it is possible to derive from learnability considerations the existence of specific constraints on the nature of human grammar. These predictions can then be tested empirically.

The strongest, most central arguments for innateness thus continue to be the arguments from APS and learnability theory. This is recognized by critics of the Innateness Hypothesis (e.g., Elman et al. 1996; Quartz and Sejnowski 1997). The latter write (section 4.1):

“The best known characterization of a developing system’s learning properties comes from language acquisition: what syntactic properties a child could learn, what in the environment could serve as evidence for that learning, and ultimately, what must be prespecified by the child’s genetic endowment. From these questions, 30 years of research have provided mainly negative results. . . . In the end, theorists concluded that the child must bring most of its syntactic knowledge, in the form of a universal grammar, to the problem in advance. . . . The perception that this striking view of syntax acquisition is based primarily on rigorous results in formal learning theory makes it especially compelling. Indeed, above all, it is this formal feature that has prompted its generalization from syntax to the view of the entire mind as a collection of innately specified, specialized modules. . . . It is probably no overstatement to suggest that much of cognitive science is still dominated by Chomsky’s nativist view of the mind.”

In addition to the APS, there are a number of other arguments for the innateness hypothesis. These include (1) the similarity of languages around the world on a wide array of abstract features, even when the languages are not in contact, and the features do not have an obvious functional motivation; (2) the rapid and uniform acquisition of language by most children, without instruction (see PHONOLOGY, ACQUISITION OF; SEMANTICS, ACQUISITION OF; SYNTAX, ACQUISITION OF) whereas many other tasks (e.g., problem solving of various kinds) need instruction and are not uniformly attained by the entire population.

Over the years there have been many attempts (currently fashionable ones include Elman et al. 1996; Quartz and Sejnowski 1997) to suggest that perhaps learning could explain LANGUAGE ACQUISITION after all. However, no arguments have been given that overcome the central learnability argumentation for the innateness hypothesis. For example, neither Elman et al. nor Quartz and Sejnowski explain via “learning” any of the properties of universal grammar or how they are attained. Quartz and Sejnowski attempt to critique learnability theory, but their critique does not apply to actual studies in learnability theory. For example, they characterize learnability theory as assuming that learners must enumerate every possible language in the class as part of the learning procedure. This is false of Gold (1967) and explicitly argued against in Wexler and Culicover (1980). Wexler and Culicover, who are greatly concerned with the psychological plausibility of the learning procedure, derive their results under some quite severe restrictions, making their learning procedure much more empirically adequate on psychological grounds than the procedures that are considered in so-called learning accounts. (See Gibson and Wexler 1994 for an analysis of psychologically plausible learning mechanisms in the principles and parameters framework in which there is much innate knowledge.) One simply has to say that CONNECTIONIST APPROACHES TO LANGUAGE and its acquisition are simply programmatic statements without any kind of theoretical or empirical support. To be taken seriously as a competitor to the innateness hypothesis, these approaches will have to attain real results.

See also COGNITIVE DEVELOPMENT; CONNECTIONISM, PHILOSOPHICAL ISSUES; LANGUAGE AND CULTURE; MODULARITY AND LANGUAGE; NATIVISM; NATIVISM, HISTORY OF

Kenneth Wexler

References

Chomsky, N. (1965). Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
Elman, J. L., E. A. Bates, M. H. Johnson, A. Karmiloff-Smith, D. Parisi, and K. Plunkett. (1996). Rethinking Innateness: A Connectionist Perspective on Development. Cambridge, MA: MIT Press.
Gibson, E., and K. Wexler. (1994). Triggers. Linguistic Inquiry 25(3): 407–454.
Gold, E. M. (1967). Language identification in the limit. Information and Control 10: 447–474.
Quartz, S. R., and T. J. Sejnowski. (1997). The neural basis of cognitive development: a constructivist manifesto. Behavioral and Brain Sciences 20(4): 537–596.
Wexler, K. (1991). On the argument from the poverty of the stimulus. In A. Kasher, Ed., The Chomskyan Turn. Cambridge: Blackwell, pp. 253–270.
Wexler, K., and P. Culicover. (1980). Formal Principles of Language Acquisition. Cambridge, MA: MIT Press.
Wexler, K., and H. Hamburger. (1973). On the insufficiency of surface data for the learning of transformational languages. In K. J. J. Hintikka, J. M. E. Moravcsik, and P. Suppes, Eds., Approaches to Natural Language. Proceedings of the 1970 Stanford Workshop on Grammar and Semantics. Dordrecht: Reidel, pp. 167–179.

Inner Sense

See INTROSPECTION; SELF

Integration

See MULTISENSORY INTEGRATION

Intelligence

Intelligence may be defined as the ability to adapt to, shape, and select environments, although over the years many definitions of intelligence have been offered (e.g., see symposia in Journal of Educational Psychology 1921; Sternberg and Detterman 1986). Various approaches have been proposed in attempts to understand it (see Sternberg 1990). The emphasis here will be on cognitive-scientific approaches.

Historically, two major competing approaches to understanding intelligence were offered, respectively, by Sir Francis Galton in England and Alfred Binet in France. Galton (1883) sought to understand (and measure) intelligence in terms of psychophysical skills, such as an individual’s just noticeable difference (JND) for discriminating weights or the distance on the skin two points needed to be separated in order for them to be felt as having occurred in distinct locations. Binet and Simon (1916), in contrast, conceptualized intelligence in terms of complex judgmental abilities. Binet believed that three cognitive abilities are key to intelligence: (1) direction (knowing what has to be done and how it should be done), (2) adaptation (selection and monitoring of one’s strategies for task performance), and (3) control (the ability to criticize one’s own thoughts and judgments). The “metacognitive” emphasis in this conception is apparent. Binet’s views have had more impact, both because his theory seemed better to capture intuitive notions of intelligence and because Binet devised a test of intelligence that successfully predicted children’s performance in school.

Charles Spearman (1923) was a forerunner of contemporary cognitive approaches to intelligence in suggesting three information processes underlying intelligence: (1) apprehension of experience, (2) eduction of relations, and (3) eduction of correlates. Spearman used the four-term ANALOGY problem (A : B :: C : D) as a basis for illustrating these processes, whereby the first process involved encoding the terms; the second, inferring the relation between A and B; and the third, applying that relation from C to D.

The early part of the twentieth century was dominated by psychometric approaches to intelligence, which emphasized the measurement of individual differences but had relatively less to say about the cognitive processing underlying intelligence (see Sternberg 1990 for a review). These approaches for the most part used factor analysis, a statistical technique for discovering possible structures underlying correlational data. For example, Spearman (1927) believed that a single factor, g (general ability), captured most of what is important about intelligence, whereas Thurstone (1938) believed in a need for seven primary factors. More recently, Carroll (1993) has proposed a three-tier hierarchical model that is psychometrically derived, but is expressed in information-processing terms, with g at the top and successively more narrow cognitive skills at each lower level of the hierarchy.

A change in the field occurred when Estes (1970) and Hunt, Frost, and Lunneborg (1973) proposed what has come to be called the cognitive-correlates approach to intelligence, whereby relatively simple information-processing tasks used in the laboratories of cognitive psychologists were related to scores on conventional psychometric tests of intelligence. Hunt and his colleagues found correlations of roughly -.3 between parameters of rate of information processing in tasks such as a letter-identification task (Posner and Mitchell 1967), where participants had to say whether letter pairs like A A, A a, or A b were the same either physically or in name, and scores on psychometric tests of verbal abilities. This approach continues actively today, with investigators proposing new tasks that they believe to be key to intelligence, such as the inspection time task, whereby individuals are assessed psychophysically for the time it takes them accurately to discern which of two lines is longer than the other (e.g., Deary and Stough 1996).

An alternative, cognitive-components approach was proposed by Sternberg (1977), who suggested that intelligence could be understood in terms of the information-processing components underlying complex reasoning and problem-solving tasks such as analogies and syllogisms. Sternberg used information-processing and mathematical modeling to decompose cognitive task performance into its elementary components and strategies. Some theorists, such as Hunt (1974) and Carpenter, Just, and Shell (1990), have used computer-simulation methodology in order to identify such components and strategies in complex tasks, such as the Raven progressive matrices.

Building on his earlier work, Sternberg (1985) proposed a triarchic theory of intelligence, according to which these information-processing components are applied to experience to adapt to, shape, and select environments. Intelligence is best understood in terms of performance on either relatively novel cognitive tasks or in terms of automatization of performance on familiar tasks. Sternberg argued that intelligence comprises three major aspects: analytical, creative, and practical thinking.

Howard Gardner (1983, 1995), in contrast, has suggested that intelligence is not unitary, but rather comprises eight distinct multiple intelligences: linguistic, logical-mathematical, spatial, musical, bodily-kinesthetic, interpersonal, intrapersonal, and naturalist. Each of these intelligences is a distinct module in the brain and operates more or less independently of the others. Gardner has offered a variety of kinds of evidence to support his theory, including cognitive-scientific research, although

some question as to the replicability of the findings (Wickett and Vernon 1994).

The field of intelligence has many applied offshoots. For example, a number of cognitive tests have been proposed to measure intelligence (see Sternberg 1993), and a number of different programs have been developed, based on cognitive theory, to modify intelligence (see Nickerson 1994). Some investigators have also argued that there are various kinds of intelligence, such as practical intelligence (Sternberg et al. 1995) and emotional intelligence (Goleman 1995; Salovey and Mayer 1990). The field is an active one today, and it promises to change rapidly as new theories are proposed and new data collected. The goal is not to choose among alternative paradigms, but rather for them to work together ultimately to help us produce a unified understanding of intellectual phenomena.

See also CREATIVITY; MACHIAVELLIAN INTELLIGENCE HYPOTHESIS; PROBLEM SOLVING; PSYCHOPHYSICS

Robert J. Sternberg

References

Binet, A., and T. Simon. (1916). The Development of Intelligence in Children. Baltimore: Williams and Wilkins. (Originally published in 1905.)
Carpenter, P. A., M. A. Just, and P. Shell. (1990). What one intelligence test measures: a theoretical account of the processing in the Raven Progressive Matrices Test. Psychological Review 97: 404–431.
Carroll, J. B. (1993). Human Cognitive Abilities: A Survey of Factor-Analytic Studies. New York: Cambridge University Press.
Deary, I. J., and C. Stough. (1996). Intelligence and inspection time: achievements, prospects, and problems. American Psychologist 51(6): 599–608.
Estes, W. F. (1970). Learning Theory and Mental Development. New York: Academic Press.
Galton, F. (1883). Inquiry into Human Faculty and its Development. London: Macmillan.
Gardner, H. (1983). Frames of Mind: The Theory of Multiple Intelligences. New York: Basic.
Gardner, H. (1995). Reflections on multiple intelligences: myths and messages. Phi Delta Kappan 77: 200–203, 206–209.
Goleman, D. (1995). Emotional Intelligence. New York: Bantam.
Haier, R. J., K. H. Nuechterlein, E. Hazlett, J. C. Wu, J. Pack, H. L. Browning, and M. S. Buchsbaum. (1988). Cortical glucose metabolic rate correlates of abstract reasoning and attention studied with positron emission tomography. Intelligence 12: 199–217.
Haier, R. J., B. Siegel, C. Tang, L. Abel, and M. S. Buchsbaum.
he has not conducted research directly to test his model. (1992). Intelligence and changes in regional cerebral glucose Other theorists have tried directly to link information metabolic rate following learning. Intelligence 16: 415–426. Hunt, E. (1974). Quote the raven? Nevermore! In L. W. Gregg, Ed., processing to physiological processes in the brain. For Knowledge and Cognition. Hillsdale, NJ: Erlbaum, pp. 129–157. example, Haier and his colleagues (Haier et al. 1988; Haier Hunt, E., N. Frost, and C. Lunneborg. (1973). Individual differ- et al. 1992) have shown via POSITRON EMISSION TOMOGRA- ences in cognition: a new approach to intelligence. In G. PHY (PET) scans that brains of intelligent individuals gener- Bower, Ed., The Psychology of Learning and Motivation, vol. ally consume less glucose in doing complex tasks such as 7. New York: Academic Press, pp. 87–122. Raven matrices or the game of TETRIS, suggesting that the “Intelligence and its measurement”: A symposium. (1921). Journal greater expertise of intelligent people enables them to of Educational Psychology 12: 123–147, 195–216, 271–275. expend less effort on the tasks. Vernon and Mori (1992), Nickerson, R. S. (1994). The teaching of thinking and problem among others, have attempted directly to link measured solving. In R. J. Sternberg, Ed., Thinking and Problem Solving. speed of neural conduction to intelligence, although there is San Diego: Academic Press, pp. 409–449. Intelligent Agent Architecture 411 acting to achieve desired results. Clearly, when designing a Posner, M. I., and R. F. Mitchell. (1967). Chronometric analysis of classification. Psychological Review 74: 392–409. particular agent, many domain-specific features of the envi- Salovey, P., and J. D. Mayer. (1990). Emotional intelligence. Imag- ronment must be reflected in the detailed design of the ination, Cognition, and Personality 9: 185–211. agent. Still, the general form of the subsystems underlying Spearman, C. (1923). 
The Nature of “Intelligence” and the Princi- intelligent interaction with the environment may carry over ples of Cognition. 2nd ed. London: Macmillan. (1923 edition from domain to domain. Intelligent agent architectures reprinted in 1973 by Arno Press, New York.) attempt to capture these general forms and to enforce basic Spearman, C. (1927). The Abilities of Man. London: Macmillan. system properties such as soundness of reasoning, effi- Sternberg, R. J. (1977). Intelligence, Information Processing, and ciency of response, or interruptibility. Many architectures Analogical Reasoning: The Componential Analysis of Human have been proposed that emphasize one or another of these Abilities. Hillsdale, NJ: Erlbaum. properties, and these architectures can be usefully grouped Sternberg, R. J. (1985). Beyond IQ: A Triarchic Theory of Human Intelligence. New York: Cambridge University Press. into three broad categories: the deliberative, the reactive, or Sternberg, R. J. (1990). Metaphors of Mind: Conceptions of the the distributed. Nature of Intelligence. New York: Cambridge University Press. The deliberative approach, inspired in part by FOLK PSY- Sternberg, R. J. (1993). Sternberg Triarchic Abilities Test. Unpub- CHOLOGY, models agents as symbolic reasoning systems. In lished test. this approach, an agent is decomposed into data subsystems Sternberg, R. J., and D. K. Detterman, Eds. (1986). What is Intelli- that store symbolic, propositional representations, often cor- gence? Contemporary Viewpoints on its Nature and Definition. responding to commonsense beliefs, desires, and intentions, Norwood, NJ: Ablex. and processing subsystems responsible for perception, rea- Sternberg, R. J., R. K. Wagner, W. M. Williams, and J. A. Horvath. soning, planning, and execution. Some variants of this (1995). Testing common sense. American Psychologist 50(11): approach (Genesereth 1983; Russell 1991) emphasize for- 912–927. Thurstone, L. L. (1938). Primary mental abilities. 
Psychometric mal methods and resemble approaches from formal philoso- Monographs 1. phy of mind and action, especially with regard to soundness Vernon, P. A., and M. Mori. (1992). Intelligence, reaction times, of logical reasoning, KNOWLEDGE REPRESENTATION, and and peripheral nerve conduction velocity. Intelligence 8: 273– RATIONAL DECISION MAKING. Others (Newell 1990) empha- 288. size memory mechanisms, general PROBLEM SOLVING, and Wickett, J. C., and P. A. Vernon. (1994). Peripheral nerve conduc- search. Deliberative architectures go beyond folk psychol- tion velocity, reaction time, and intelligence: an attempt to rep- ogy and formal philosophy by giving concrete computa- licate Vernon and Mori. Intelligence 18: 127–132. tional interpretations to abstract processes of representation and reasoning. Ironically, the literal-minded interpretation Intelligent Agent Architecture of mental objects has also been a source of difficulty in building practical agents: symbolic reasoning typically Intelligent agent architecture is a model of an intelligent involves substantial search and is of high COMPUTATIONAL information-processing system defining its major sub- COMPLEXITY, and capturing extensive commonsense knowl- systems, their functional roles, and the flow of information edge in machine-usable form has proved difficult as well. and control among them. These problems represent significant challenges to the Many complex systems are made up of specialized deliberative approach and have stimulated researchers to subsystems that interact in circumscribed ways. In the investigate other paradigms that might address or sidestep biological world, for example, organisms have modular sub- them. 
systems, such as the circulatory and digestive systems, The reactive approach to intelligent-agent design, for presumably because nature can improve subsystems more example, begins with the intuition that although symbolic easily when interactions among them are limited (see, for reasoning may be a good model for certain cognitive pro- example, Simon 1969). These considerations apply as well cesses, it does not characterize well the information process- to artificial systems: vehicles have fuel, electrical, and ing involved in routine behavior such as driving, cooking, suspension subsystems; computers have central-processing, taking a walk, or manipulating everyday objects. These abil- mass-storage, and input-output subsystems; and so on. ities, simple for humans, remain distant goals for robotics When variants of a system share a common organization and seem to impose hard real-time requirements on an into subsystems, it is often useful to characterize abstractly agent. Although these requirements are not in principle the elements shared by all variants. For example, a family of inconsistent with deliberative architectures (Georgeff and integrated circuits might vary in clock speed or specialized Lansky 1987), neither are they guaranteed, and in practice data operations, while sharing a basic instruction set and they have not been easily satisfied. Proponents of the reac- memory model. In the engineering disciplines, the term tive approach, therefore, have argued for architectures that architecture has come to refer to generic models of shared insure real-time behavior as part of their fundamental structure. Architectures serve as templates, allowing design. Drawing on the mathematical and engineering tradi- designers to develop, refine, test, and maintain complex tion of feedback control, advocates of reactive architectures systems in a disciplined way. 
model agent and environment as coupled dynamic systems, The benefits of architectures apply to the design of intel- the inputs of each being the outputs of the other. The agent ligent agents as well. An intelligent agent is a device that contains behavioral modules that are self-contained feed- interacts with its environment in flexible, goal-directed back-control systems, each responsible for detecting states ways, recognizing important states of the environment and of the environment based on sensory data and generating 412 Intentional Stance appropriate output. The key is for state-estimation and out- Brooks, R. A. (1986). A robust layered control system for a mobile robot. IEEE Trans. Rob. Autom. 2: 14–23. put calculations to be performed fast enough to keep up with Genesereth, M. R. (1983). An overview of metalevel architecture. the sampling rates of the system. There is an extensive liter- Proceedings AAAI 83: 119–123. ature on how to build such behaviors (control systems) when Georgeff, M., and A. Lansky. (1987). Reactive reasoning and plan- a mathematical description of the environment is available ning. Proceedings AAAI 87. and is of the proper form; reactive architectures advance Kaelbling, L. (1988). Goals as parallel program specification. Pro- these traditional control methods by describing how com- ceedings AAAI 88. plex behaviors might be built out of simpler ones (Brooks Miller, G., E. Galanter, and K. H. Pribram. (1960). Plans and the 1986), either by switching among a fixed set of qualitatively Structure of Behavior. New York: Henry Holt and Company. different behaviors based on sensed conditions (see Miller, Newell, A. (1990). Unified Theories of Cognition. Cambridge, Galanter, and Pribram 1960 for precursors), by the hierar- MA: Harvard University Press. chical arrangement of behaviors (Albus 1992), or by some Russell, S., and E. Wefald. (1991). Do the Right Thing. Cambridge, more intricate principle of composition. 
Techniques have MA: MIT Press. Simon, H. A. (1969). The Sciences of the Artificial. Cambridge, also been proposed (Kaelbling 1988) that use off-line sym- bolic reasoning to derive reactive behavior modules with MA: MIT Press. guaranteed real-time on-line performance. A third architectural paradigm, explored by researchers Intentional Stance in distributed artificial intelligence, is motivated by the fol- lowing observation. A local subsystem integrating sensory The intentional stance is the strategy of interpreting the data or generating potential actions may have incomplete, behavior of an entity (person, animal, artifact, or the like) by uncertain, or erroneous information about what is happen- treating it as if it were a rational agent that governed its ing in the environment or what should be done. But if there “choice” of “action” by a “consideration” of its “beliefs” are many such local nodes, the information may in fact be and “desires.” The distinctive features of the intentional present, in the aggregate, to assess a situation correctly or stance can best be seen by contrasting it with two more basic select an appropriate global action policy. The distributed stances or strategies of prediction, the physical stance and approach attempts to exploit this observation by decompos- the design stance. The physical stance is simply the standard ing an intelligent agent into a network of cooperating, com- laborious method of the physical sciences, in which we use municating subagents, each with the ability to process whatever we know about the laws of physics and the physi- inputs, produce appropriate outputs, and store intermediate cal constitution of the things in question to devise our pre- states. The intelligence of the system as a whole arises from diction. When I predict that a stone released from my hand the interactions of all the system’s subagents. This approach will fall to the ground, I am using the physical stance. 
For gains plausibility from the success of groups of natural things that are neither alive nor artifacts, the physical stance intelligent agents, for example, communities of humans, is the only available strategy. Every physical thing, whether who decompose problems and then reassemble the solu- designed or alive or not, is subject to the laws of physics and tions, and from the parallel, distributed nature of neural hence behaves in ways that can be explained and predicted computation in biological organisms. Although it may be from the physical stance. If the thing I release from my hand stretching the agent metaphor to view an individual neuron is an alarm clock or a goldfish, I make the same prediction as an intelligent agent, the idea that a collection of units about its downward trajectory, on the same basis. might solve one subproblem while other collections solve Alarm clocks, being designed objects (unlike the rock), others has been an attractive and persistent theme in agent are also amenable to a fancier style of prediction—predic- design. tion from the design stance. Suppose I categorize a novel Intelligent-agent research is a dynamic activity and is object as an alarm clock: I can quickly reason that if I much influenced by new trends in cognitive science and depress a few buttons just so, then some hours later the computing; developments can be anticipated across a broad alarm clock will make a loud noise. I do not need to work front. Theoretical work continues on the formal semantics out the specific physical laws that explain this marvelous of MENTAL REPRESENTATION, models of behavior composi- regularity; I simply assume that it has a particular design— tion, and distributed problem solving. Practical advances the design we call an alarm clock—and that it will function can be expected in programming tools for building agents, properly, as designed. 
Design-stance predictions are riskier as well as in applications (spurred largely by developments than physical-stance predictions, because of the extra in computer and communications technology) involving assumptions I have to take on board: that an entity is intelligent agents in robotics and software. designed as I suppose it to be, and that it will operate accord- See also BEHAVIOR-BASED ROBOTICS; COGNITIVE ARCHI- ing to that design—that is, it will not malfunction. Designed TECTURE; FUNCTIONAL DECOMPOSITION; MODULARITY OF things are occasionally misdesigned, and sometimes they MIND; MULTIAGENT SYSTEMS break. But this moderate price I pay in riskiness is more than —Stanley J. Rosenschein compensated for by the tremendous ease of prediction. An even riskier and swifter stance is the intentional References stance, a subspecies of the design stance, in which the designed thing is an agent of sorts. An alarm clock is so Albus, J. S. (1992). RCS: A reference model architecture for intel- simple that this fanciful anthropomorphism is, strictly ligent control. IEEE Comput. 25(5): 56–59. Intentionality 413 instrumentalist strategy, not a theory of real or genuine speaking, unnecessary for our understanding of why it does belief, this common misapprehension has been extensively what it does, but adoption of the intentional stance is more discussed and rebutted in subsequent accounts (Dennett useful—indeed, well-nigh obligatory—when the artifact in 1987, 1991, 1996). question is much more complicated than an alarm clock. Consider chess-playing computers, which all succumb See also COGNITIVE DEVELOPMENT; COGNITIVE ETHOL- neatly to the same simple strategy of interpretation: just OGY; FOLK PSYCHOLOGY; INTENTIONALITY; PROPOSITIONAL think of them as rational agents that want to win, and that ATTITUDES; RATIONAL AGENCY; REALISM AND ANTIREALISM know the rules and principles of chess and the positions of the pieces on the board. 
Instantly your problem of predict- —Daniel Dennett ing and interpreting their behavior is made vastly easier than it would be if you tried to use the physical or the design References stance. At any moment in the chess game, simply look at the Baron-Cohen, S. (1995). Mindblindness: An Essay on Autism and chessboard and draw up a list of all the legal moves avail- Theory of Mind. Cambridge, MA: MIT Press. able to the computer when it is its turn to play (there will Byrne, R., and A. Whiten. (1991). Machiavellian Intelligence: usually be several dozen candidates). Now rank the legal Social Expertise and the Evolution of Intellect in Monkeys, moves from best (wisest, most rational) to worst (stupidest, Apes and Humans. New York: Oxford University Press. most self-defeating), and make your prediction: the com- Dennett, D. (1971). Intentional systems. Journal of Philosophy 68: puter will make the best move. You may well not be sure 87–106. what the best move is (the computer may “appreciate” the Dennett, D. (1983). Intentional systems in cognitive ethology: the situation better than you do!), but you can almost always “panglossian paradigm” defended. Behavioral and Brain Sci- ences 6: 343–390. eliminate all but four or five candidate moves, which still Dennett, D. (1987). The Intentional Stance. Cambridge, MA: gives you tremendous predictive leverage. Bradford Books/MIT Press. The intentional stance works (when it does) whether or Dennett, D. (1991). Real patterns. Journal of Philosophy 87: 27– not the attributed goals are genuine or natural or “really 51. appreciated” by the so-called agent, and this tolerance is Dennett, D. (1996). Kinds of Minds. New York: Basic Books. crucial to understanding how genuine goal-seeking could be Leslie, A. (1991). The theory of mind impairment in autism: evi- established in the first place. Does the macromolecule really dence for a modular mechanism of development? In A. Whiten, want to replicate itself? 
The intentional stance explains what Ed., Natural Theories of Mind. Oxford: Blackwell. is going on, regardless of how we answer that question. Consider a simple organism—say a planarian or an amoeba Intentionality —moving nonrandomly across the bottom of a laboratory dish, always heading to the nutrient-rich end of the dish, or The term intentional is used by philosophers, not as apply- away from the toxic end. This organism is seeking the good, ing primarily to actions, but to mean “directed upon an or shunning the bad—its own good and bad, not those of object.” More colloquially, for a thing to be intentional is some human artifact-user. Seeking one’s own good is a fun- for it to be about something. Paradigmatically, mental states damental feature of any rational agent, but are these simple and events are intentional in this technical sense (which organisms seeking or just “seeking”? We do not need to originated with the scholastics and was reintroduced in answer that question. The organism is a predictable inten- modern times by FRANZ BRENTANO). For instance, beliefs tional system in either case. and desires and regrets are about things, or have “intentional By exploiting this deep similarity between the sim- objects”: I have beliefs about Boris Yeltsin, I want a beer plest—one might as well say mindless—intentional systems and world peace, and I regret agreeing to write so many and the most complex (ourselves), the intentional stance encyclopedia articles. also provides a relatively neutral perspective from which to A mental state can have as intentional object an individ- investigate the differences between our minds and simpler ual (John loves Marsha), a state of affairs (Marsha thinks minds. For instance, it has permitted the design of a host of that it’s going to be a long day) or both at once (John wishes experiments shedding light on whether other species, or Marsha were happier). 
Perception is intentional: I see John, young children, are capable of adopting the intentional and that John is writing Marsha’s name in his copy of Ver- stance—and hence are higher-order intentional systems. bal Behavior. The computational states and representations Although imaginative hypotheses about “theory of mind posited by cognitive psychology and other cognitive sci- modules” (Leslie 1991) and other internal mechanisms ences are intentional also, inasmuch as in the course of com- (e.g., Baron-Cohen 1995) to account for these competences putation something gets computed and something gets have been advanced, the evidence for the higher-order com- represented. (An exception here may be states of NEURAL petences themselves must be adduced and analyzed inde- NETWORKS, which have computational values but arguably pendently of these proposals, and this has been done by not representata.) cognitive ethologists (Dennett 1983; Byrne and Whiten What is at once most distinctive and most philosophi- 1991) and developmental psychologists, among others, cally troublesome about intentionality is its indifference to using the intentional stance to generate the attributions that reality. An intentional object need not actually exist or in turn generate testable predictions of behavior. obtain: the Greeks worshiped Zeus; a friend of mine Although the earliest definition of the intentional stance believes that corks grow on trees; and even if I get the beer, (Dennett 1971) suggested to many that it was merely an 414 Intentionality my desire for world peace is probably going to go unful- Fodor (1975, 1981) have argued that intentional states are filled. just physical states that have semantical properties, and the Brentano argued both (A) that this reality-neutral feature existent-or-nonexistent states of affairs that are their objects of intentionality makes it the distinguishing mark of the are just representational contents. 
mental, in that all and only mental things are intentional in The main difficulty for this representationalist account is that sense, and (B) that purely physical or material objects that of saying exactly how a physical item’s representational cannot have intentional properties—for how could any content is determined; in virtue of what does a neurophysio- purely physical entity or state have the property of being logical state represent precisely that the Republican candi- “directed upon” or about a nonexistent state of affairs? (A) date will win? An answer to that general question is what and (B) together imply the Cartesian dualist thesis that no Fodor has called a “psychosemantics”; the question itself mental thing is also physical. And each is controversial in has also been called the “symbol grounding problem.” Sev- its own right. eral attempts have been made on it (Devitt 1981; Millikan Thesis (A) is controversial because it is hardly obvious 1984; Block 1986; Dretske 1988; Fodor 1987, 1990). that every mental state has a possibly nonexistent inten- One serious complication is that, surprisingly, ordinary tional object; bodily sensations such as itches and tickles do propositional attitude contents do not seem to be determined not seem to, and free-floating anxiety is notorious in this by the states of their subjects’ nervous systems, not even by regard. Also, there seem to be things other than mental the total state of their subjects’ entire bodies. Putnam’s states and events that “aim at” possibly nonexistent objects. (1975) TWIN EARTH and indexical examples are widely taken Linguistic items such as the name “Santa Claus” are an to show that, surprising as it may seem, two human beings obvious example; paintings and statues portray fictional could be molecule-for-molecule alike and still differ in their characters; and one might ignorantly build a unicorn trap. 
beliefs and desires, depending on various factors in their spa- More significantly, behavior as usually described is inten- tial and historical environments. (For dissent, however, see tional also: I reach for the beer; John sends a letter to Mar- Searle 1983.) Thus we can distinguish between “narrow” sha; Marsha throws the letter at the cat; Macbeth tries to properties, those that are determined by a subject’s intrinsic clutch the dagger he sees. (Though some philosophers, such physical composition, and “wide” properties, those that are as Chisholm 1958 and Searle 1983, argue that the aboutness not so determined, and representational contents are wide. of such nonmental things as linguistic entities and behavior So it seems an adequate psychosemantics cannot limit its is second-rate because it invariably derives from the more resources to narrow properties such as internal functional or fundamental intentionality of someone’s mental state.) computational roles; it must specify some scientifically Dualism and immaterialism about the mind are unpopu- accessible relations between brain and environment. lar both in philosophy and in psychology—certainly cogni- (Though some theorists continue to maintain that a narrow tive psychologists do not suppose that the computational notion of content—see NARROW CONTENT—and accordingly and representational states they posit are states of anything a narrow psychosemantics are needed and will suffice for but the brain—so we have strong motives for rejecting the- cognitive science; see Winograd 1972; Johnson-Laird 1977; sis (B) and finding a way of explaining how a purely physi- and Fodor 1987. A few maintain the same for the everyday cal organism can have intentional states. (Though some propositional attitudes; see Loar 1988; Devitt 1990.) 
behaviorists in psychology and eliminative materialists in A second and perhaps more serious obstacle to the repre- philosophy have taken the bolder step of simply denying sentational view of thinking is that the objects of thought that people do in fact ever have intentional states; see need not be in the environment at all. They may be abstract; BEHAVIORISM and ELIMINATIVE MATERIALISM.) The taxon- one can think about a number, or about an abstruse theolog- omy of such explanations is now fairly rich. It divides first ical property, and as always they may be entirely unreal. between theories that ascribe intentionality to presumed (The same things are true of representations posited by cog- particular states of the brain and those that attribute inten- nitive psychology.) An adequate psychosemantics must deal tional states only to the whole subject. just as thoroughly with Arthur’s illiterate belief that the Many theorists, especially those influenced by cognitive number of the Fates was six, and with a visual system’s hal- science, do believe that not only the intentionality of cogni- lucinatory detection of an edge that isn’t really there, as tive computational states but also that of everyday inten- much as with a real person’s seeing and wanting to eat a tional attitudes such as beliefs and desires (also called muffin that is right in front of her. PROPOSITIONAL ATTITUDES) inhere in states of the brain. On In view of the foregoing troubles and for other reasons as this view, all intentionality is at bottom MENTAL REPRESEN- well, other philosophers have declined to ascribe intention- TATION, and propositional attitudes have Brentano’s feature ality to particular states of subjects, and they insist that because the internal physical states and events that realize ascriptions of commonsense intentional attitudes, at least, them represent actual or possible states of affairs. 
Some evi- are not about inner states at all, much less about internal dence for this is that intentional features are semantical fea- causes of behavior. Some such theories maintain just that tures: Like undisputed cases of representation, beliefs are the attitudes are states, presumably physical states, of a true or false; they entail or imply other beliefs; they are (it whole person (Strawson 1959; McDowell 1994; Baker seems) composed of concepts and depend for their truth on 1995; Lewis 1995). Others are overtly instrumentalist: Phi- a match between their internal structures and the way the losophers influenced by W.V. Quine (1960) or by continen- world is; and so it is natural to regard their aboutness as a tal hermeneuticists maintain that what a subject believes or matter of mental referring or designation. Sellars (1963) and desires is entirely a matter of how that person is interpreted Intersubjectivity 415 or translated into someone else’s preferred idiom for one Loar, B. (1988). Social content and psychological content. In R. Grimm and D. Merrill, Eds., Contents of Thought. Tucson: Uni- purpose or another, there being no antecedent or inner fact versity of Arizona Press, pp. 99–110. of the matter. A distinctive version of this view is that of McDowell, J. (1994). Mind and World. Cambridge, MA: Harvard Donald Davidson (1970) and D. C. Dennett (1978, 1987), University Press. who hold that intentional ascriptions express nonfactual, Millikan, R. G. (1984). Language, Thought, and Other Biological normative calculations that help to predict behavior but not Categories. Cambridge, MA: MIT Press/Bradford Books. in the same way as the positing of inner mechanisms does— Putnam, H. (1975). The meaning of “meaning.” In K. Gunderson, in particular, not causally (see INTENTIONAL STANCE). Such Ed., Minnesota Studies in the Philosophy of Science, vol. 7: views are usually defended epistemologically, by reference Language, Mind and Knowledge. 
Minneapolis: University of to the sorts of evidence we use in ascribing propositional Minnesota Press. attitudes. Quine, W. V. (1960). Word and Object. Cambridge, MA: MIT Press. Perhaps suspiciously, the instrumentalist views are not Searle, J. R. (1983). Intentionality. Cambridge: Cambridge Univer- usually extrapolated to the aboutness of perceptual states or sity Press. of representations posited by cognitive scientists; they are Sellars, W. (1963). Science, Perception, and Reality. London: Rou- restricted to commonsense beliefs and desires. They do shed tledge and Kegan Paul. the burden of psychosemantics, that is, of explaining how a Strawson, P. F. (1959). Individuals. London: Methuen and Co. particular brain state can have a particular content, but they Winograd, T. (1972). Understanding Natural Language. New do no better than did the representationalist views in York: Academic Press. explaining how thoughts can be about abstracta or about nonexistents. Further Readings See also INFORMATIONAL SEMANTICS; MENTAL CAUSA- Chisholm, R. M. (1967). Intentionality. In P. Edwards, Ed., Ency- TION; MIND-BODY PROBLEM; PHYSICALISM clopedia of Philosophy. London: Macmillan. Lycan, W. G. (1988). Judgement and Justification, Part 1. Cam- —William Lycan bridge: Cambridge University Press. Perry, J. (1995). Intentionality (2). In S. Guttenplan, Ed., A Com- References panion to the Philosophy of Mind. Oxford: Blackwell, pp. 386– 395. Baker, L. R. (1995). Explaining Attitudes. Cambridge: Cambridge Searle, J. (1995). Intentionality (1). In S. Guttenplan, Ed., A Com- University Press. panion to the Philosophy of Mind. Oxford: Blackwell, pp. 379– Block, N. J. (1986). Advertisement for a semantics for psychology. 386. In P. French, T. E. Uehling, and H. Wettstein, Eds., Midwest Sterelny, K. (1990). The Representational Theory of Mind: An Studies, vol. 10: Studies in the Philosophy of Mind. Minneapo- Introduction. Oxford: Blackwell. lis: University of Minnesota Press, pp. 615–678. 
Chisholm, R. M. (1958). Sentences about believing. In H. Feigl, M. Scriven, and G. Maxell, Eds., Minnesota Studies in the Phi- Internalism losophy of Science, vol. 2. Minneapolis: University of Minne- sota Press, pp. 510–520. Davidson, D. (1970). Mental events. In L. Foster and J. W. Swan- See INDIVIDUALISM son, Eds., Experience and Theory. Amherst: University of Mas- sachusetts Press, pp. 79–101. Interpretation Dennett, D. C. (1978). Brainstorms. Montgomery, VT: Bradford Books. Dennett, D. C. (1987). The Intentional Stance. Cambridge, MA: See DISCOURSE; PRAGMATICS; RADICAL INTERPRETATION; Bradford Books/MIT Press. SENTENCE PROCESSING Devitt, M. (1981). Designation. New York: Columbia University Press. Intersubjectivity Devitt, M. (1990). A narrow representational theory of the mind. In W. G. Lycan, Ed., Mind and Cognition. Oxford: Blackwell, pp. 371–398. Intersubjectivity is the process in which mental activity— Dretske, F. (1988). Explaining Behavior. Cambridge, MA: Brad- including conscious awareness, motives and intentions, cog- ford Books/MIT Press. nitions, and emotions—is transferred between minds. ANI- Fodor, J. A. (1975). The Language of Thought. Hassocks, England: MAL COMMUNICATION and cooperative social life require Harvester Press. Fodor, J. A. (1981). RePresentations. Cambridge, MA: Bradford intersubjective signaling (Marler, Evans, and Hauser 1992). Books/MIT Press. Individuals must perceive and selectively respond to the Fodor, J. A. (1987). Psychosemantics. Cambridge, MA: Bradford motives, interests, and emotions behind perceived move- Books/MIT Press. ment in bodies of other animals, especially in conspecifics. Fodor, J. A. (1990). A Theory of Content and Other Essays. Cam- Such communication has attained a new level of complexity bridge, MA: Bradford Books/MIT Press. in human communities, with their consciousness of collec- Johnson-Laird, P. (1977). Procedural semantics. Cognition 5: 189– tively discovered cultural meanings. 214. 
Human intersubectivity manifests itself as an immediate Lewis, D. (1995). Lewis, David: reduction of mind. In S. Gutten- sympathetic awareness of feelings and conscious, purposeful plan, Ed., A Companion to the Philosophy of Mind. Oxford: intelligence in others. It is transmitted by body movements Blackwell, pp. 412–431. 416 Intersubjectivity (especially of face, vocal tract, and hands) that are adapted to environmental AFFORDANCES and objects of purposeful give instantaneous visual, auditory, or tactile information action (Trevarthen and Hubley 1978). Before they possess about purposes, interests, and emotions and symbolic ideas any verbalizable THEORY OF MIND, children share purposes active in subjects’ minds. On it depends cultural learning, and their consequences through direct other-awareness of and the creation of a “social reality,” of conventional beliefs, persons’ interests and moods (Reddy et al. 1997). Acquired languages, rituals, and technologies. Education of children is beliefs and concepts of a young child are redescriptions of rooted in preverbal, mimetic intersubjectivity. Human lin- narrativelike patterns of intention and consciousness that guistic dialogue also rests on intersubjective awareness, as can be shared, without verbal or rational analysis, in a famil- do the phenomena of “self-awareness” in society. A psychol- iar “common sense” world. Narrative expression by rhyth- ogy of intersubjectivity concerns itself with analysis of this mic posturing and gesturing with prosodic or melodic innate capacity for intimate and efficient intermental cou- vocalisation, or “mimesis,” may have been a step in the pre- pling, and attempts to assess what must be learned, through linguistic evolution of hominid communication and cogni- imitation or instruction, to advance intelligent cooperation. tion (Donald 1991). 
Pretense and socially demonstrated Research on communication with infants and young chil- METACOGNITION is natural in infant play, and imitation of dren proves the existence in the developing human brain of pretend actions and attitudes is essential in the development emotional and cognitive regulators for companionship in of imaginative representational play in toddlers of modern thought and purposeful action (Aitken and Trevarthen Homo sapiens. 1997). The theory of innate intersubjectivity (Trevarthen Language and other symbolic conventions enrich inter- 1998), like the theory of the virtual other (Braten 1988), subjectivity, generating and storing limitless common mean- invites new concepts of language and thinking, as well as of ing and strategies of thought, but they do not constitute the music and all temporal arts, and it requires deep examina- basis for interpersonal awareness. Rather, as Wittgenstein tion of cognitive processing models of CONSCIOUSNESS. perceived, the reverse is the case—all language develops Infants demonstrate that they perceive persons as essen- from experience negotiated, with emotion, in intersubjectiv- tially different “objects” from anything nonliving and non- ity, whatever innate predispositions there may be to acquire human (Legerstee 1992; Trevarthen 1998). They are language qua language (see PRAGMATICS). The acquisition acutely sensitive to time patterns in human movement of syntax is derived from expressive sequences that are per- (manifestations of TIME IN THE MIND), and can react in syn- ceived as emotional narratives in game rituals (Bruner chrony, or with complementary “attunement” of motives 1983). Word meaning is acquired by imitation in narrative and feelings (Stern 1985, 1993; Trevarthen, Kokkinaki, and exchanges modulated by dynamic affects and expressions of Fiamenghi 1998). 
Dynamic forms of vocal, facial, and ges- interest, intention, and feeling deployed by the child and tural emotional expression are recognized and employed in companions in a familiar, common world (Locke 1993; see interactions with other persons from birth, before inten- WORD MEANING, ACQUISITION OF). tional use of objects is effective. Scientific research into the Intersubjective sympathy is shown when persons syn- earliest orientations, preferences, and intentional actions of chronize or alternate their motor impulses, demonstrating newborns when they encounter evidence of a person, and “resonance” of motives that have matching temporospatial their capacities for IMITATION, prove that the newborn dimensions (rhythms and morphology or “embodiment”) in human is ready for, and needs, mutually regulated intersub- all individuals, and that, consequently, can be perceived by jective transactions (Kugiumutzakis 1998). Infants do not one to forecast the other’s acts and their perceptual conse- acquire intersubjective powers in “pseudo dialogues” by quences (Trevarthen 1998). Psychophysical and physiologi- being treated “as if” they wish to express intentions, cal research proves that cognitive processes of perceptual thoughts, and feelings (Kaye 1982), but they do find satis- information uptake, thoughts, and memories are organized fying communication only with partners who accept that by intrinsically generated “motor images” or “dynamic they have such powers, because such acceptance releases forms” (Jeannerod 1994). The same motor forms are dem- intuitive patterns of “parenting” behavior that infants can onstrated in communication. The discovery of “mirror neu- be aware of, and with which they can enter into dialogue rons” in the ventral premotor cortex of monkeys, which (Papoušek and Papoušek 1987). 
Infants’ emotional well- discharge both when the monkey grasps or manipulates being depends on a mutual regulation of consciousness something and when the human experimenter makes similar with affectionate companions (Tronick and Weinberg manipulations, indicates how “self” and “other,” or observer 1997). Events in the infant-adult “dynamic system” (Fogel and actor, can be matched, and it helps explain how the and Thelen 1987) are constrained by intrinsic human psy- essential intersubjective neural mechanisms of communica- chological motives on both sides (Aitken and Trevarthen tion by language may have evolved (Rizzolatti and Arbib 1997). These intrinsic constraints are psychogenic adapta- 1998). Conscious monitoring of intersubjective motives is tions for cultural learning. asymmetric; in normal circumstances we are more aware of Human knowledge begins in exchange and combination others’ feelings and intentions than of our own inner states. of purposes between the young child and more experienced However, we do not have to be in the presence of others to companions—in “joint attention” (Tomasello 1988). A “pri- have them in mind. As Adam Smith explained, the moral mary intersubjectivity” is active, in “protoconversational” sense, built around representations of an innate sympathy, play, soon after birth (Bateson 1975; Trevarthen 1979), and takes the form of an “other-awareness” or “conscience.” In this develops by the end of the first year into “secondary his words, “When I endeavour to examine my own conduct, intersubjectivity”—sympathetic intention toward shared I divide myself, as it were, into two persons. The first is the Intersubjectivity 417 spectator . . . 
the second is the agent, the person whom I Psychoanalysis, though interested in rational representa- properly call myself, and of whose conduct, under the char- tions of self and others by introjection and projection, or acter of the spectator, I was endeavouring to form an opin- transference and countertransference, seeks intersubjective ion.” (Smith 1759: 113, 6). explanations for psychopathology, and for the relation of The aptitude of human minds for imitation does not mean, disorders in child development to the acquisition of an as social learning theory asserts, that a SELF-KNOWLEDGE is emotionally communicated self-awareness (Stern 1985). Theories of childhood AUTISM, a challenging but highly possible only as a result of learning how others see us, and how to talk about oneself. Impulses of purpose and interest, instructive pathology, necessarily confront hypotheses con- with the emotions that evaluate their prospects and urgency, cerning how engagement with mind states is normally pos- have, as motives, similar status for the self as for others, and sible between individuals from infancy (Hobson 1993). independently of culture. This consequence of immediate, As the political offspring of the subjective first position, innate intersubjective sympathy is overlooked by empiri- individualism sees a society as animated by stressful en- cists, including theorists of ECOLOGICAL PSYCHOLOGY, in counters between competitors who survive and prosper by their accounts of how perceptual information is taken up by Machiavellian deceit. Positive relationships or attachments behaving subjects, individually. There is a historical reason merely serve to build alliances that increase chances of suc- for the idea of the separate self. cess for their members in competition with the rest of the The Western philosophical tradition (exemplified by RENÉ group. 
Like the misnamed “social Darwinism,” SOCIOBIOL- DESCARTES and IMMANUEL KANT) generally assumes that OGY is founded on the tacit belief that there is no natural human minds are inherently separate in their purposes and ALTRUISM, no capacity to link purposes for collective goals experiences, seeking rational clarity, autonomous skills, and that are valued because they are products of intuitive sympa- self-betterment. Subjects take in information for conscious- thy and cooperative awareness. Closer study of animal ness of reality or truth, become aware of other people as behaviors shows contrary evidence (De Waal 1996). The objects that have particular properties and affordances. They evidence that infants learn by emotional referencing to eval- construct an awareness of the SELF in society, but remain sin- uate experiences through attunement with motives of famil- gle subjectivities. Interpersonal life and collective under- iar companions, for whom they have, and from whom they standing result from individuals communicating thoughts in receive, affectionate regard, proves that it is the sense of language, the grammar of which is derived from innate ratio- individuality in society that is the derived state of mind, nal processes. The rational individual has both self-serving developed in contrast to more fundamental intersubjective emotions and instinctive “biological” reactions to others, needs and obligations. which must be regulated by conventional rules of socially —Colwyn Trevarthen acceptable behavior. With good management of education and social government, individuals learn how to negotiate References with social partners and converge in awareness of transcen- dental universals in their individual consciousness and pur- Aitken, K. J., and C. Trevarthen. (1997). Self-other organization in poses, to their mutual benefit. We will call this view of human psychological development. 
Development and Psycho- intelligent and civilized cooperation as an artificial acquisi- pathology 9: 651–675. tion the “extrinsic intersubjectivity” or “subjective first” Bateson, M. C. (1975). Mother-infant exchanges: the epigenesis position. Psychology and cognitive science analyze aware- of conversational interaction. In D. Aaronson and R. W. Rie- ness, thinking, learning, and skill in action from this position, ber, Eds., Developmental Psycholinguistics and Communica- which must find immediate intersubjectivity difficult to tion Disorders (Annals of the New York Academy of Sciences, vol. 263. New York: New York Academy of Sciences, pp. 101– explain. 113. A different conception of human consciousness, exhibit- Bråten, S. (1988). Dialogic mind: the infant and adult in protocon- ing affinity with some philosophies of religious experience, versation. In M. Cavallo, Ed., Nature, Cognition and System. perceives interpersonal awareness, cooperative action in Dordrecht: Kluwer Academic, pp. 187–205. society, and cultural learning as manifestations of innate Bruner, J. S. (1983). Child’s Talk: Learning to Use Language. New motives for sympathy in purposes, interests, and feelings— York: Norton. that is, that a human mind is equipped with needs for dia- De Waal, F. (1996). Good Natured: The Origins of Right and logic, intermental engagement with other similar minds Wrong in Human and Other Animals. Cambridge, MA: Har- (Jopling 1993). Notions of moral obligation are conceived vard University Press. as fundamental to the growth of consciousness of meaning Donald, M. (1991). Origins of the Modern Mind. Cambridge and London: Harvard University Press. in every human being. Language is neither just a learned Fogel, A., and E. Thelen. (1987). Development of early expressive skill, nor the product of an instinct for generating grammati- action from a dynamic systems approach. Developmental Psy- cal structures to model subjective agency and to categorize chology 23: 747–761. 
awareness of objects. It emerges as an extension of the natu- Hobson, P. (1993). Autism and the Development of Mind. Hills- ral growth of a sympathy in purposes and experiences that is dale, NJ: Erlbaum. already clearly demonstrated in the mimetic narratives of an Jeannerod, M. (1994). The representing brain: Neural correlates of infant’s play in reciprocal attention with an affectionate motor intention and imagery. Behavioral and Brain Sciences companion. We will call this view of how human coopera- 17: 187–245. tive awareness arises the “intrinsic intersubjectivity” or Jopling, D. (1993). Cognitive science, other minds, and the phi- “intersubjective first” position. losophy of dialogue. In U. Neisser, Ed., The Perceived Self: 418 Intersubjectivity Ecological and Interpersonal Sources of Self-Knowledge. New Barnes, B. (1995). The Elements of Social Theory. London: UCL York: Cambridge University Press, pp. 290–309. Press. Kaye, K. (1982). The Mental and Social Life of Babies. Chicago: Beebe, B., J. Jaffe, S. Feldstein, K. Mays, and D. Alson. (1985). University of Chicago Press. Inter-personal timing: the application of an adult dialogue Kugiumutzakis, G. (1998). Neonatal imitation in the intersubjec- model to mother-infant vocal and kinesic interactions. In F. M. tive companion space. In S. Bråten, Ed., Intersubjective Com- Field and N. A. Fox, Eds., Social Perception in Infants. Nor- munication and Emotion in Early Ontogeny. Cambridge: wood, NJ: Ablex, 217–248. Cambridge University Press. Bretherton, I., S. McNew, and M. Beeghley-Smith. (1981). Early Legerstee, M. (1992). A review of the animate-inanimate distinc- person knowledge as expressed in gestural and verbal commu- tion in infancy: Implications for models of social and cognitive nication: when do infants acquire “theory of mind”? In M. E. knowing. Early Development and Parenting 1: 59–67. Lamb and L. R. Sherrod, Eds., Infant Social Cognition. Hills- Locke, J. L. (1993). The Child’s Path to Spoken Language. 
Cam- dale, NJ: Erlbaum. bridge, MA: Harvard University Press. Bruner, J. S. (1996). The Culture of Education. Cambridge, MA: Marler, P., C. S. Evans, and M. C. Hauser. (1992). Animal signals: Harvard University Press. motivational, referential or both? In H. Papoušek, U. Jurgens, Donaldson, M. (1978). Children’s Minds. Glasgow: Fontana/Col- and M. Papoušek, Eds., Nonverbal Vocal Communication: lins. Comparative and Developmental Aspects. Cambridge: Cam- Halliday, M. A. K. (1975). Learning How to Mean: Explorations bridge University Press, pp. 66–86. in the Development of Language. London: Edward Arnold. Papoušek, H., and M. Papoušek. (1987). Intuitive parenting: a dia- Hondereich, T., Ed. (1995). The Oxford Companion to Philosophy. lectic counterpart to the infant’s integrative competence. In J. Oxford and New York: Oxford University Press. D. Osofsky, Ed., Handbook of Infant Development. 2nd ed. Leslie, A. (1987). Pretense and representation: the origins of “the- New York: Wiley, pp. 669–720. ory of mind.” Psychological Review 94: 412–426. Reddy, V., D. Hay, L. Murray, and C. Trevarthen. (1997) Commu- Meltzoff, A. N. (1985). The roots of social and cognitive develop- nication in infancy: mutual regulation of affect and attention. In ment: models of man’s original nature. In T. M. Field and N. A. G. Bremner, A. Slater, and G. Butterworth, Eds., Infant Devel- Fox, Eds., Social Perception in Infants. Norwood, NJ: Ablex, opment: Recent Advances. Hillsdale, NJ: Erlbaum, pp. 247–273. pp. 1–30. Rizzolatti, A., and M. A. Arbib. (1998). Language within our Nadel, J., and A. Pezé. (1993). Immediate imitation as a basis for grasp. Trends in the Neurosciences 21: 188–194. primary communication in toddlers and autistic children. In J. Smith, A. (1759). Theory of Moral Sentiments. Edinburgh. Modern Nadel and L. Camioni, Eds., New Perspectives in Early Com- edition: D. D. Raphael and A. L. Macfie, Gen. Eds., Glasgow municative Development. London: Routledge, 139–156. Edition. 
Oxford: Clarendon, 1976. Reprint, Indianapolis: Lib- Neisser, U. (1993). The self perceived. In U. Neisser, Ed., The Per- erty Fund, 1984. ceived Self: Ecological and Interpersonal Sources of Self- Stern, D. N. (1985). The Interpersonal World of the Infant: A View Knowledge. New York: Cambridge University Press, pp. 3–21. from Psychoanalysis and Developmental Psychology. New Rogoff, B. (1990). Apprenticeship in Thinking: Cognitive Develop- York: Basic Books. ment in Social Context. New York: Oxford University Press. Stern, D. N. (1993). The role of feelings for an interpersonal self. Ryan, J. (1974). Early language development: towards a communi- In U. Neisser, Ed., The Perceived Self: Ecological and Interper- cational analysis. In M. P. M. Richards, Ed., The Integration of sonal Sources of Self-Knowledge. New York: Cambridge Uni- a Child into a Social World. London: Cambridge University versity Press, pp. 205–215. Press, pp. 185–213. Tomasello, M. (1988). The role of joint attentional processes in Schore, A. (1994). Affect Regulation and the Origin of the Self: early language development. Language Sciences 10: 69–88. The Neurobiology of Emotional Development. Hillsdale, NJ: Trevarthen, C. (1979). Communication and cooperation in early Erlbaum. infancy. A description of primary intersubjectivity. In M. Searle, J. R. (1995). The Construction of Social Reality. New York: Bullowa, Ed., Before Speech: The Beginning of Human Com- Simon and Schuster. munication. London: Cambridge University Press, pp. 321–347. Sluga, H. (1996). Ludwig Wittgenstein: life and work. An introduc- Trevarthen, C. (1998). The concept and foundations of infant inter- tion. In H. Sluga and D. G. Stern, Eds., The Cambridge Compan- subjectivity. In S. Bråten, Ed., Intersubjective Communication ion to Wittgenstein. Cambridge: Cambridge University Press. and Emotion in Early Ontogeny. Cambridge: Cambridge Uni- Tomasello, M., A. C. Kruger, and H. H. Ratner. (1993). Cultural versity Press, pp. 15–46. 
learning. Behavioral and Brain Sciences 16(3): 495–552. Trehub, S. E., L. J. Trainor, and A. M. Unyk. (1993). Music and Trevarthen, C., and P. Hubley. (1978). Secondary intersubjectivity: speech processing in the first year of life. Advances in Child confidence, confiding and acts of meaning in the first year. In Development and Behaviour 24: 1–35. A. Lock, Ed., Action, Gesture and Symbol. London: Academic Trevarthen, C. (1980). The foundations of intersubjectivity: devel- Press, pp. 183–229. opment of interpersonal and cooperative understanding of Trevarthen, C., T. Kokkinaki, and G. A. Fiamenghi Jr. (1998). infants. In D. Olson, Ed., The Social Foundations of Language What infants’ imitations communicate: With mothers, with and Thought: Essays in Honor of J. S. Bruner. New York: fathers and with peers. In J. Nadel and G. Butterworth, Eds., W. W. Norton, pp. 316–342. Imitation in Infancy. Cambridge: Cambridge University Press. Trevarthen, C. (1986). Development of intersubjective motor con- Tronick, E. Z., and M. K. Weinberg. (1997). Depressed mothers trol in infants. In M. G. Wade and H. T. A. Whiting, Eds., and infants: failure to form dyadic states of consciousness. In L. Motor Development in Children: Aspects of Coordination and Murray and P. J. Cooper, Eds., Postpartum Depression and Control. Dordrecht: Martinus Nijhof, pp. 209–261. Child Development. New York: Guilford Press, pp. 54–81. Trevarthen, C. (1993). The self born in intersubjectivity: the psy- chology of an infant communicating. In U. Neisser, Ed., The Further Readings Perceived Self: Ecological and Interpersonal Sources of Self- Knowledge. New York: Cambridge University Press, pp. 121– Bakhtine, M. M. (1981). The Dialogic Imagination. Austin: Uni- 173. versity of Texas Press. Introspection 419 and hence whether introspection, properly so-called, exists. Trevarthen, C. (1994). Infant semiosis. In W. Noth, Ed., Origins of Semiosis. Berlin: Mouton de Gruyter, pp. 219–252. 
According to Gilbert Ryle (1949) and William Lyons Trevarthen, C. (1995). Contracts of mutual understanding: negoti- (1986), what we loosely describe as attending to current ating meaning and moral sentiments with infants. In P. Wohl- perceptions is really just perceiving in an attentive manner. muth, Ed., The Crisis of Text: Issues in the Constitution of But perceiving attentively itself sometimes involves attend- Authority. San Diego: University of San Diego School of Law, ing to the perceiving, as when one is explicitly aware of Journal of Contemporary Legal Issues, Spring 1995, vol. 6: visually concentrating on something. Moreover, when we 373–407. report what mental states we are in, those reports express higher-order mental representations of the states we report; Intonation Ryle’s denial that remarks such as “I am in pain” are liter- ally about one’s mental states is groundless. It is often held that introspection involves some “inner See PROSODY AND INTONATION; PROSODY AND INTONA- sense” by which we perceive our own mental states. The TION, PROCESSING ISSUES seemingly spontaneous and unmediated character of per- ceiving generally would then explain why introspection Introspection itself seems spontaneous and immediate. This model could, in addition, appeal to mechanisms of perceptual attention to Introspection is a process by which people come to be atten- explain how we come to focus attentively on our concurrent tively conscious of mental states they are currently in. This mental states. focused CONSCIOUSNESS of one’s concurrent mental states is But introspection cannot be a form of perceiving. Per- distinct from the relatively casual, fleeting, diffuse way we ception invariably involves sensory qualities, and no quali- are ordinarily conscious of many of our mental states. 
ties ever occur in introspection other than those of the “Introspection” is occasionally applied to both ways of sensations and perceptions we introspect; the introspecting being conscious of one’s mental states (e.g., Armstrong itself produces no additional qualities. Moreover, speech 1968/1993), but is most often used, as in what follows, for acts generally express not perceptions, but thoughts and the attentive way only. other intentional states (see INTENTIONALITY). So intro- Introspection involves both the mental states intro- spective reports express intentional states about the mental spected and some mental representation of those very states states we introspect, and introspective representations of (as suggested by the etymology, from the Latin spicere concurrent mental states involve assertive intentional states, “look” and intra “within”; looking involves mental repre- or thoughts. Introspection is deliberate and attentive sentations of what is seen). Because it involves higher- because these higher-order intentional states are themselves order mental representations of introspected states, intro- attentive and deliberate. And our introspecting seems spon- spection is a kind of conscious METAREPRESENTATION or taneous and unmediated presumably because we remain unaware of any mental processes that might lead to these METACOGNITION. (1911/1912) held that introspection higher-order intentional states. Introspection consists in WILHELM WUNDT provides an experimental method for psychology, and relied conscious, attentively focused, higher-order thoughts about on it in setting up, in 1879 in Leipzig, the first experimental our concurrent mental states. psychology laboratory. Some challenged this introspection- Despite Comte’s claim that attention cannot be divided, ist method, following Auguste Comte’s (1830–42) denial people can with a little effort attend to more than one thing. 
that a single mind can be both the agent and object of intro- And attentive consciousness of concurrent mental states spection. This, Comte had held, would divide attention could in any case occur whenever the target mental state between the act and object of introspecting, which he was not itself an attentive state. thought impossible. These concerns led WILLIAM JAMES A related concern is that attending to concurrent mental (1890) and others to propound instead a method of immedi- states may distort their character. But it is unclear why that ate retrospection. should happen, inasmuch as attention does not generally Introspectionist psychology foundered mainly not for alter the properties of its object. Introspection itself cannot these reasons, but because results from different introspec- show that distortion occurs, because even if it seems to, that tionist laboratories frequently conflicted. Still, experimental appearance might be due not to the distorting effect of intro- procedures in psychology continue to rely on subjects’ spection, but to introspection’s making us aware of more of access to their current mental states, though the theoretical a state’s properties or of a different range of properties. Sim- warrant for this reliance is seldom discussed. ilarly for the idea that introspective attention might actually The phenomenological movement in philosophy, pio- bring the introspected state into existence (Hill 1991: chap. neered by Wundt’s contemporary Edmund Husserl (1913/ 5). That may well happen, but it may instead be that, when 1980), held that introspection, by “bracketing” consciousness that seems to happen, introspection simply makes one newly from its object, enables us to describe and analyze conscious- aware of a state that already existed. ness, and thereby solve many traditional philosophical prob- Work by John H. Flavell (1993) has raised doubt about lems. 
This methodology encountered difficulties similar to whether children five and younger have introspective access those that faced introspectionist psychology. to their mental states. Four- and five-year-olds describe Some have questioned whether higher-order mental rep- themselves and others as thinking, feeling, and experienc- resentations of concurrent mental states ever actually occur ing. But they also describe people while awake as going for 420 Introspection significant periods without thinking or feeling anything Flavell, J. H. (1993). Young children’s understanding of thinking and consciousness. Current Directions in Psychological Sci- whatever. Doubtless these children themselves have, when ence 2(2): 40–43. awake, normal streams of consciousness. But they seem not Hill, C. S. (1991). Sensations: A Defense of Type Materialism. to think of themselves in that way and, hence, not to intro- Cambridge: Cambridge University Press. spect their streams of consciousness. Flavell also reports Husserl, E. (1913/1980). Ideas Pertaining to a Pure Phenomenol- that these children determine what people attend to and ogy and to a Phenomenological Philosophy, vol. 1. Trans. T. E. think about solely on the basis of behavioral cues and envi- Klein and W. E. Pohl. The Hague and Boston: M. Nijhoff. ronmental stimulation. So perhaps their inability to intro- James, W. (1890). The Principles of Psychology. 2 vols. New York: spect results from their simply not conceiving of thoughts Henry Holt. and experiences as states that are sometimes conscious (see Lashley, K. S. (1958). Cerebral organization and behavior. In H. C. Solomon, S. Cobb, and W. Penfield, Eds., The Brain and THEORY OF MIND). Human Behavior, vol. 36. Association for Research in Nervous Some have held that introspective access to one’s men- and Mental Diseases, Research Publications. Baltimore: Will- tal states cannot be erroneous or, at least, that it overrides iams and Wilkins, pp. 1–18. 
all other evidence (see SELF-KNOWLEDGE). RENÉ DESCARTES (1641/1984) famously noted that one cannot, when thinking, doubt that one is thinking. But this hardly shows that when one is thinking one always knows one is, much less that one is invariably right about which thoughts one has. In a similar spirit, Sydney Shoemaker (1996) has urged that when one has a belief one always knows one does, because a rational person’s believing something itself involves cognitive dispositions that constitute that person’s knowing about the belief. But the relevant rationality often fails to accompany our beliefs and other first-order mental states. Indeed, psychological research reveals many such lapses of rationality. In addition to the misrepresentations of one’s own mental states discovered by SIGMUND FREUD, other work (e.g., Nisbett and Wilson 1977) shows that introspective judgments frequently result from confabulation. People literally invent mental states to explain their own behavior in ways that are expected or acceptable.

Lyons, W. (1986). The Disappearance of Introspection. Cambridge, MA: MIT Press/Bradford Books.
Nisbett, R. E., and T. DeCamp Wilson. (1977). Telling more than we can know: verbal reports on mental processes. Psychological Review 84 (3): 231–259.
Ryle, G. (1949). The Concept of Mind. London: Hutchinson.
Shoemaker, S. (1996). The First-Person Perspective and Other Essays. Cambridge: Cambridge University Press.
Wundt, W. (1911/1912). An Introduction to Psychology. Trans. Rudolf Pintner. London: George Allen and Unwin.

Further Readings

Brentano, F. (1874/1973). Psychology from an Empirical Standpoint. O. Kraus, Ed., L. L. McAlister, Eng. ed. Trans. A. C. Rancurello, D. B. Terrell, and L. L. McAlister. London: Routledge and Kegan Paul.
Broad, C. D. (1925). The Mind and Its Place in Nature. London: Routledge and Kegan Paul.
Burge, T. (1988). Individualism and self-knowledge. The Journal of Philosophy 85 (11): 649–663.
Cassam, Q. (1995). Introspection and bodily self-ascription. In J. Luis Bermudez, A. Marcel, and N. Eilan, Eds., The Body and the Self. Cambridge, MA: MIT Press/Bradford Books.
Churchland, P. M. (1985). Reduction, qualia, and the direct introspection of brain states. The Journal of Philosophy 82 (1): 8–28.
Dretske, F. (1994/95). Introspection. Proceedings of the Aristotelian Society CXV: 263–278.
Goldman, A. I. (1993). The psychology of folk psychology. The Behavioral and Brain Sciences 16(1): 15–28. Open peer commentary, 29–90; author’s response: Functionalism, the theory-theory and phenomenology, 101–113.
Gopnik, A. (1993). How do we know our minds: the illusion of first-person knowledge of intentionality. The Behavioral and Brain Sciences 16(1): 1–14. Open peer commentary, 29–90; author’s response: Theories and illusion, 90–100.
Locke, J. (1700/1975). Essay Concerning Human Understanding. Nidditch, P. H., Ed. Oxford: Oxford University Press.
Lycan, W. (1996). Consciousness and Experience. Cambridge, MA: MIT Press/Bradford Books.
Metcalfe, J., and A. P. Shimamura, Eds. (1994). Metacognition: Knowing About Knowing.

Daniel Dennett (1991) in effect seeks to generalize this finding by arguing that all introspective reports can be treated as reports of useful fictions.

Introspection not only misrepresents our mental states, but it also fails to reveal many concurrent states, both in ordinary and exotic situations (see BLINDSIGHT and IMPLICIT VS. EXPLICIT MEMORY). And it is likely that introspection seldom if ever reveals all the mental properties of target states. Many, moreover, would endorse KARL LASHLEY’s (1958) dictum that introspection never makes mental processes accessible, only their results. At best, introspection is one tool among many for learning about the mind.

See also ATTENTION; INTERSUBJECTIVITY; SELF

-- David M. Rosenthal

References

Armstrong, D. M. (1968/1993). A Materialist Theory of the Mind.
New York: Humanities Press. Rev. ed., London: Routledge and Kegan Paul.
Comte, A. (1830–42). Cours de Philosophie Positive. 6 vols. Paris: Bachelier.
Dennett, D. (1991). Consciousness Explained. Boston: Little, Brown.
Descartes, R. (1641/1984). Meditations on first philosophy. In The Philosophical Writings of Descartes, vol. 2. Trans. J. Cottingham, R. Stoothoff, and D. Murdoch. Cambridge: Cambridge University Press.

Cambridge, MA: MIT Press/Bradford Books.
Nelson, T. O., Ed. (1992). Metacognition: Core Readings. Boston: Allyn and Bacon.
Nelson, T. O. (1996). Consciousness and metacognition. American Psychologist 51(2): 102–116.
Rosenthal, D. M. (1997). A theory of consciousness. In N. Block, O. Flanagan, and G. Güzeldere, Eds., The Nature of Consciousness: Philosophical Debates. Cambridge, MA: MIT Press, pp. 729–753.
Rosenthal, D. M. (Forthcoming). Consciousness and metacognition. In D. Sperber, Ed., Metarepresentation: Proceedings of the Tenth Vancouver Cognitive Science Conference. New York: Oxford University Press.
Titchener, E. B. (1909). A Text-Book of Psychology. New York: Macmillan.
Uleman, J. S., and J. A. Bargh, Eds. (1989). Unintended Thought. New York: Guilford Press.
Weinert, F. E., and R. H. Kluwe, Eds. (1987). Metacognition, Motivation, and Understanding. Hillsdale, NJ: Erlbaum.
Weiskrantz, L. (1997).

discussions of the foundations of literary theory; research into formal devices of literature, including pioneering inquiries into the principles governing meter in poetry; analyses of poetic texts in many languages; philological investigations into literary monuments; studies of Slavic folklore and comparative mythology; inquiries into the cultural history of the Slavs; and some of the finest literary criticism of modern Russian poetry. A bibliography of Jakobson’s writing is Jakobson (1990).

Jakobson’s chief contribution to linguistics concerns the
nature of the speech sounds (phonemes). It has been accepted for well over a century that the phonemes that make up the words in all languages differ fundamentally from other, acoustically similar sounds that humans produce with the lips, tongue and larynx, for example yawns, burps, or coughs. (See, e.g., chap. 5 of Sievers 1901, the standard phonetics text of the time.) It was also understood that the sounds of a given language are not a random collection, but are made up of various intersecting groups of sounds; for example, [p t k] or [p b f v m] or [m n]. What was not understood was the basis on which speech and nonspeech sounds are differentiated, and how, short of listing, phonemes can be assigned to groups. It was Jakobson who proposed an answer to these fundamental questions.

In a 1928 paper written by Jakobson and co-signed by the Russian linguists N. S. Trubetzkoy (1890–1938) and S. Karcevskij (1884–1955), Jakobson proposed that the phonemes of all languages are complexes of a small number of DISTINCTIVE FEATURES such as nasal, labial, voicing, fricative, and so on. Although many features were used by phoneticians, they were viewed as somewhat accidental attributes of sounds. By contrast, for Jakobson the features are the raw material of which the phonemes, and only phonemes, are made.

Consciousness Lost and Found: A Neuropsychological Exploration. Oxford: Oxford University Press.
White, P. A. (1988). Knowing more than we can tell: “introspective access” and causal report accuracy 10 years later. British Journal of Psychology 79(1): 13–45.
Wilson, T. D., S. D. Hodges, and S. J. LaFleur. (1995). Effects of introspecting about reasons: inferring attitudes from accessible thoughts. Journal of Personality and Social Psychology 69: 16–28.

Intuitive Biology

See FOLK BIOLOGY

Intuitive Mathematics

See NAIVE MATHEMATICS

Intuitive Physics

See NAIVE PHYSICS

Intuitive Psychology
See FOLK PSYCHOLOGY

Intuitive Sociology

See NAIVE SOCIOLOGY

ILP

See INDUCTIVE LOGIC PROGRAMMING

Jakobson, Roman

Roman Jakobson (1896–1982), one of the foremost students of linguistics, literature, and culture of the twentieth century, was born and educated in Moscow. In 1920 he moved to Czechoslovakia, where he remained until the Nazi occupation in 1939. During 1939–1941 Jakobson was in Scandinavia, and in 1941 he emigrated to the United States, where he served on the faculties of Columbia University (1946–1949), Harvard (1949–1966), and MIT (1958–1982).

The fact that only phonemes, but no other sounds, are feature complexes differentiates phonemes from the rest, and this fact also explains the grouping of phonemes into intersecting sets, each defined by one or more shared features; for example, [m n] are nasal, [p b f v m] are labial, and [p t k] are voiceless. This conception of phonemes as feature bundles is fundamental to subsequent developments in PHONOLOGY.

In exploring the consequences of this proposal, Jakobson was joined by Trubetzkoy, whose important Grundzüge der Phonologie (1939) summarizes many results of these early investigations. Jakobson’s own contributions to feature theory are reflected in such papers as Jakobson (1929), where the evolution of the phoneme system of modern Russian is reviewed in light of the feature concept; Jakobson (1939), where it is shown, contra Trubetzkoy (1939), that such apparently multivalued features as “place of articulation” can be reanalyzed in terms of binary features, and that this makes it possible to analyze both vowels and consonants with the same set of features; Jakobson (1941), where facts of phoneme acquisition by children, phoneme loss in aphasia, phoneme distribution in different languages, and other phonological phenomena are reviewed in feature terms; and
Jakobson, Fant, and Halle (1952), where the acoustic correlates of individual features were first described and many consequences of these new facts discussed. Jakobson also made enduring contributions to the phonological study of individual languages, for instance Russian (Jakobson 1948), Slovak (Jakobson 1931), Arabic (Jakobson 1957a), and Gilyak (Jakobson 1957b).

In addition to phonology, Jakobson’s major contributions to linguistics were in the area of MORPHOLOGY, the study of the form of words and their minimal syntactic constituents, the morphemes. Among the new ideas in these studies (collected in Jakobson 1984) is Jakobson’s attempt to extend the feature analysis to morphemes, which, like its phonological counterpart, has been adopted in subsequent work.

See also BLOOMFIELD, LEONARD; PHONOLOGY, ACQUISITION OF

-- Morris Halle

The eight published volumes of Jakobson’s Selected Writings reflect the impressive breadth and variety of his contributions. These include, in addition to linguistics proper,

After teaching anatomy and physiology at Harvard for two years, James began teaching physiological psychology in 1875. In 1879 James gave his first lectures in philosophy at Harvard. As he himself put it, these lectures of his own were the first lectures in philosophy that he had ever heard. In 1884 James helped found the American Society for Psychical Research and, in the following year, was appointed professor of philosophy at Harvard. In 1890, after some twelve years of labor on the project, James published his magnum opus, the two volumes of The Principles of Psychology. In that same year, James established the psychological laboratory at Dane Hall in Harvard, one of the first such laboratories to be set up in America. In 1892 James wrote a textbook of psychology for students derived from The Principles, entitled Textbook of Psychology: Briefer Course.
References

Jakobson, R. (1929). Remarques sur l’évolution phonologique du russe comparée à celle des autres langues slaves. SW 1: 7–116.
Jakobson, R. (1931). Phonemic notes on standard Slovak. SW 1: 221–230.
Jakobson, R. (1939). Observations sur le classement phonologique des consonnes. SW 1: 272–279.
Jakobson, R. (1941). Kindersprache, Aphasie, und allgemeine Lautgesetze. SW 1: 328–401.
Jakobson, R. (1948). Russian conjugation. SW 2: 119–129.
Jakobson, R. (1957a). Mufaxxama: the “emphatic” phonemes in Arabic. SW 1: 510–522.
Jakobson, R. (1957b). Notes on Gilyak. SW 2: 72–97.
Jakobson, R. (1962–88). Selected Writings. (SW) 8 vols. Berlin: Mouton de Gruyter.
Jakobson, R. (1984). Russian and Slavic Grammar. L. R. Waugh and M. Halle, Eds. Berlin: Mouton de Gruyter.
Jakobson, R. (1990). A Complete Bibliography of His Writings. Ed. S. Rudy. Berlin: Mouton de Gruyter.
Jakobson, R., C. G. M. Fant, and M. Halle. (1952).

The next twenty years saw a rapid succession of books: The Will to Believe, and Other Essays in Popular Philosophy (1897), Human Immortality: Two Supposed Objections to the Doctrine (1898), Talks to Teachers on Psychology: and to Students on Some of Life’s Ideals (1899), The Varieties of Religious Experience (1902; his Gifford lectures at Edinburgh), Pragmatism (1907; lectures delivered at the Lowell Institute in Boston and at Columbia University), A Pluralistic Universe (1909; his Hibbert lectures at Oxford) and The Meaning of Truth: A Sequel to Pragmatism (1909). These unusually accessible books of philosophy and psychology achieved a wide readership and acclaim. After a long period of illness, James died at his summer home in Chocorua, New Hampshire, in 1910.

At his death James left behind an uncompleted work that was published as Some Problems of Philosophy: A Beginning of an Introduction to Philosophy (1911). This work was followed by a number of other posthumous volumes,
mainly collections of his essays and reviews: Memories and Studies (1911), Essays in Radical Empiricism (1912), Selected Papers in Philosophy (1917) and Collected Essays and Reviews (1920). A collection of his letters was published in 1920, and a reconstruction of his 1896 Lowell lectures on “exceptional mental states” was issued in 1982.

James’s best known contributions can most readily be understood when seen against the background of the temperament of his thought. One central strand of this temperament was his desire always to emphasize the practical, particular, and concrete over the theoretical, abstract and metaphysical. Thus his doctrine of pragmatism, which he shared with Charles Sanders Peirce and John Dewey, was a plea that an abstract concept of modern philosophy, MEANING, was best understood in terms of the practical effects of the words or the concepts embodying it. The meaning of a word or concept was only discernible in the practical effects of its employment, and its meaningfulness was a function of its success (or failure) in practice.

Preliminaries to speech analysis. SW 8: 583–660.
Jakobson, R., S. Karcevskij, and N. S. Trubetzkoy. (1928). Quelles sont les méthodes les mieux appropriées à un exposé complet et pratique de la phonologie d’une langue quelconque? SW 1: 3–6.
Sievers, E. (1901). Grundzüge der Phonetik. 5th ed. Leipzig: Breitkopf und Härtel.
Trubetzkoy, N. S. (1939). Grundzüge der Phonologie. Travaux du Cercle linguistique de Prague 7.

James, William

William James (1842–1910) was born in New York City into a cultivated, liberal, financially comfortable and deeply religious middle-class family. It was also a very literary family. His father wrote theological works, his brother Henry became famous as a novelist, and his sister Alice acquired a literary reputation on the posthumous publication of her diaries. As his parents took to traveling
extensively in Europe, William James was educated at home and in various parts of Europe by a succession of private tutors and through brief attendance at whatever school was at hand. After an unsuccessful attempt to become a painter in Newport, Rhode Island, James began to study comparative anatomy at the Lawrence Scientific School of Harvard University. After a few years, James moved to the Harvard Medical School, graduating in medicine in 1869.

Most famously, or in some quarters, notoriously, this account of meaning was clearly illustrated by James’s account of truth. For James there was no “final, complete fact of the matter,” no truth with a capital T. Truth was simply a word that we applied to a “work-in-progress belief,” that is, a belief which we held and have continued to hold because it enables us to make our way in the world in the long run. A true belief, then, is one that is useful and one that has survived, in Darwinian fashion, the pressures of its environment. Such “truths” are always revisable. As James himself put it, “The true is the name of whatever proves itself to be good in the way of belief, and good, too, for definite, assignable reasons” (Pragmatism).

James has sometimes been referred to as the first American phenomenologist, though he himself preferred to call this aspect of his thinking radical empiricism. Both these labels have some cash value, because another strand of James’s temperament was his evangelical holism in regard to all experiences. By experience, James meant any subject’s current stream of consciousness.

behavior and physiological effects that we usually associate with a particular emotion. “Common sense says, we lose our fortune, are sorry and weep. . . . The hypothesis here to be defended says. . . that we feel sorry because we cry.”

See also DESCARTES, RENÉ; EMOTIONS; INTROSPECTION; SELF

-- William Lyons

References

Bird, G. (1986). William James. London: Routledge.
James, W. The Works of William James. Cambridge, MA: Harvard University Press.
Myers, G. E. (1986). William James: His Life and Thought. New Haven, CT: Yale University Press.
Perry, R. B. (1935). The Thought and Character of William James, vols. 1 and 2. Boston: Little, Brown.
Taylor, E. (1982). William James on Exceptional Mental States. New York: Charles Scribners Sons.

Judgment Heuristics

People sometimes need to know quantities that they can neither look up nor calculate. Those quantities might include the probability of rain, the size of a crowd, the future price of a stock, or the time needed to complete an assignment. One coping strategy is to use a heuristic, or rule of thumb, to produce an approximate answer. That answer might be used directly, as the best estimate of the desired quantity, or adjusted for suspected biases. Insofar as heuristics are, by definition, imperfect rules, it is essential to know how much confidence to place in them.

Heuristics are common practice in many domains. For example, skilled tradespeople have rules for bidding contracts, arbitrageurs have ones for making deals, and operations researchers have ones for predicting the behavior of complex processes. These rules may be more or less explicit; they may be computed on paper or in the head. The errors that they produce are the associated “bias.”

Heuristics attained prominence in cognitive psychology through a series of seminal articles by Amos TVERSKY and Daniel Kahneman (1974), then at the Hebrew University of Jerusalem. They observed that judgments under conditions of uncertainty often call for heuristic solutions. The precise answers are unknown or inaccessible. People lack the training needed to compute appropriate estimates.

Such a stream was always experienced as a seamless flux. It alone was what was real for any subject. Any concepts, categories, or distinctions that we might refer to in regard to CONSCIOUSNESS, indeed even to speak of consciousness in the traditional way as inner, subjective, and mental, were, strictly speaking, artificial conceptual matrices that we have placed over the flow of experience for particular pragmatic purposes. Although James’s fascination with consciousness has led many to refer to him as a Cartesian or as an introspectionist like WILHELM WUNDT, it is more fruitful to see him in relation to Heraclitus and Henri Bergson.

James’s evangelical holism regarding the nature of experience should be coupled with what might be called an evangelical pluralism about its scope. For James believed that all experiences, whether mystical, psychical, or “normal,” were of equal value. They were properly to be differentiated only in terms of the purposes for which these experiences had been deployed. So James’s lectures on The Varieties of Religious Experience, his championing of psychical research, and his interest in “exceptional mental states” associated with cases of multiple personality and other mental illnesses, were in harmony with his radical empiricism.

James connected his radical empiricism with mainstream psychology and physiology by taking as the central task of The Principles of Psychology “the empirical correlation of the various sorts of thought or feeling [as known in consciousness] with definite conditions of the brain.” In the Principles James also connected his radical empiricism with his unyielding advocacy of the freedom of the will by analyzing this freedom as involving the momentary endorsement of a particular thought in our stream of consciousness such that this thought would thereby become the cause of some appropriate behavior.

In the Principles, as well as in some other texts, James
also made important contributions to the psychology of MEMORY, self-identity, habit, instinct, the subliminal and religious experiences. For example, his distinction between primary and secondary memory was the precursor of the modern distinction between short-term and long-term memory, and his work on ATTENTION has influenced recent work on the human capacity for dividing attention between two or more tasks. However, the most influential part of the Principles has been the theory of emotion that James developed in parallel with the Danish physiologist Carl Lange. As James himself put it, “bodily changes follow directly the perception of the exciting fact, . . . [so that] our feeling of the same changes as they occur IS the emotion.” That is, an emotional state is the feeling in our stream of consciousness of the

Even those with training may not have the intuitions needed to apply their textbook learning outside of textbook situations.

The first of these articles (Tversky and Kahneman 1971) proposed that people expect future observations of uncertain processes to be much like past ones, even when they have few past observations to rely on. Such people might be said to apply a “law of small numbers,” which captures some properties of the statistical “law of large numbers” but is insufficiently sensitive to sample size. The heuristic of expecting past observations to predict future ones is useful but leads to predictable problems, unless one happens to have a large sample. Tversky and Kahneman demonstrated these problems, with quantitative psychologists as subjects. For example, their scientist subjects overestimated the probability that small samples would affirm research hypotheses, leading them to propose study designs with surprisingly low statistical power. Of course, well-trained scientists can calculate the correct value for power analyses. However, to do so, they must realize that their heuristic judgment is faulty. Systematic reviews have found a high rate of published studies with low statistical power, suggesting that practicing scientists often lack this intuition (Cohen 1962).

Kahneman and Tversky (1972) subsequently subsumed this tendency under the more general representativeness heuristic. Users of this rule assess the likelihood of an event by how well it captures the salient properties of the process producing it. Although sometimes useful, this heuristic will produce biases whenever features that determine likelihood are insufficiently salient (or when irrelevant features capture people’s attention). As a result, predicting the behavior of people relying on representativeness requires both a substantive understanding of how they judge salience and a normative understanding of what features really matter. Bias

These seminal papers have produced a large research literature (Kahneman, Slovic, and Tversky 1982). Their influence can be traced to several converging factors (Dawes 1997; Jungermann 1983), including: (1) The initial demonstrations have proven quite robust, facilitating replications in new domains and the exploration of boundary conditions (e.g., Plous 1993). (2) The effects can be described in piquant ways, which present readers and investigators in a flattering light (able to catch others making mistakes). (3) The perspective fits the cognitive revolution’s subtext of tracing human failures to unintended side effects of generally adaptive processes. (4) The heuristics operationalize Simon’s (1957) notions of BOUNDED RATIONALITY in ways subject to experimental manipulation.

The heuristics-and-biases metaphor also provides an organizing theme for the broader literature on failures of human DECISION MAKING. For example, many studies have found people to be insensitive to the extent of their own knowledge (Keren 1991; Yates 1990).
arises when the two are misaligned, or when people apply appropriate rules ineffectively.

Sample size is one normatively relevant feature that tends to be neglected. A second is the population frequency of a behavior, when making predictions for a specific individual. People feel that the observed properties of the individual (sometimes called “individuating” or “case-specific” information) need to be represented in future events, even when those observations are not that robust (e.g., small sample, unreliable source).

Bias can also arise when normatively relevant features are recognized but misunderstood. Thus, people know that random processes should show variability, but expect too much of it. One familiar expression is the “gambler’s fallacy,” leading people to expect, say, a “head” coin flip after four tails, but not after four alternations of head-tail. An

When this trend emerges as overconfidence, one contributor is the tendency to look for reasons supporting favored beliefs (Koriat, Lichtenstein, and Fischhoff 1980). Although that search is a sensible part of hypothesis testing, it can produce bias when done without a complementary sensitivity to disconfirming evidence (Fischhoff and Beyth-Marom 1983). Other studies have examined hindsight bias, the tendency to exaggerate the predictability of past events (or reported facts; Fischhoff 1975). One apparent source of that bias is automatically making sense of new information as it arrives. Such rapid updating should facilitate learning, at the price of obscuring how much has been learned. Underestimating what one had to learn may mean underestimating what one still has to learn, thereby promoting overconfidence.

Scientists working within this tradition have, naturally, worried about the generality of these behavioral patterns.
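The neglect of population frequency described in this entry can be made concrete with Bayes’ rule. The following is only a sketch: the function name `posterior_probability` and every number in it are hypothetical and are not data from the studies cited.

```python
def posterior_probability(base_rate: float, hit_rate: float, false_alarm_rate: float) -> float:
    """P(behavior | evidence) by Bayes' rule.

    base_rate        -- population frequency of the behavior
    hit_rate         -- P(evidence | behavior)
    false_alarm_rate -- P(evidence | no behavior)
    """
    evidence_with_behavior = base_rate * hit_rate
    evidence_without_behavior = (1 - base_rate) * false_alarm_rate
    return evidence_with_behavior / (evidence_with_behavior + evidence_without_behavior)


if __name__ == "__main__":
    # Hypothetical case: the behavior occurs in 5% of the population, and the
    # individuating evidence is seen in 80% of those who show the behavior
    # but also in 20% of those who do not.
    print(round(posterior_probability(0.05, 0.80, 0.20), 3))
```

With these made-up numbers the probability is about 0.17. A judgment driven only by how well the individual fits the evidence would land much closer to 0.8, which is the kind of gap that ignoring the population frequency produces.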
One central concern has been whether laboratory results extend to high-stakes decisions, especially ones with experts working on familiar tasks. Unfortunately, it is not that easy to provide significant positive stakes (or threaten significant losses) or to create appropriate tasks for experts. Those studies that have been conducted suggest that stakes alone do not eliminate bias nor lead people, even experts, to abandon faulty judgments (Camerer 1995).

In addition to experimental evidence, there are anecdotal reports and systematic observations of real-world expert performance showing biases that can be attributed to using heuristics (Gilovich 1991; Mowen 1993). For example, overconfidence has been observed in the confidence assessments of particle physicists, demographers, and economists (Henrion and Fischhoff 1986). A noteworthy exception is weather forecasters, whose assessments of the probability

engaging example is the unwarranted perception that basketball players have a “hot hand,” caused by not realizing how often such (unrandom-looking) streaks arise by chance (Gilovich, Vallone, and Tversky 1985). In a sense, representativeness is a metaheuristic, a very general rule from which more specific ones are derived for particular situations. As a result, researchers need to predict how a heuristic will be used in order to generate testable predictions for people’s judgments. Where those predictions fail, it may be that the heuristic was not used at all or that it was not used in that particular way.

Two other (meta)heuristics are availability and anchoring and adjustment. Reliance on availability means judging an event as likely to the extent that one can remember examples or imagine it happening. It can lead one astray when instances of an event are disproportionately (un)available in MEMORY.
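How often streaks arise by chance, the point behind the “hot hand” and gambler’s fallacy examples in this entry, can be computed exactly for an idealized case. The sketch below assumes independent flips of a fair coin, which is a simplification and not a model of basketball shooting; the function name `prob_streak_at_least` is hypothetical.

```python
from fractions import Fraction


def prob_streak_at_least(n_flips: int, streak_len: int) -> Fraction:
    """Exact probability that n fair coin flips contain a run of at least
    `streak_len` identical outcomes (heads or tails)."""
    if n_flips < 1 or streak_len < 2:
        raise ValueError("need n_flips >= 1 and streak_len >= 2")
    # counts[r] = number of streak-free sequences whose current run has length r + 1
    counts = [2] + [0] * (streak_len - 2)
    for _ in range(n_flips - 1):
        switched = sum(counts)             # next flip differs: run length resets to 1
        counts = [switched] + counts[:-1]  # next flip repeats: run grows by 1;
                                           # runs that would reach streak_len drop out
    streak_free = sum(counts)
    return 1 - Fraction(streak_free, 2 ** n_flips)


if __name__ == "__main__":
    print(float(prob_streak_at_least(20, 4)))
```

Under the fair-coin assumption, a sequence of 20 flips contains a run of four or more identical outcomes roughly 77 percent of the time, so a streak of that length is weak evidence that the process is anything other than random.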
Reliance on anchoring and adjustment means estimating a quantity by thinking of why it might be larger or smaller than some initial value. Typically, people adjust too little, leaving them unduly “anchored” in that initial value, however arbitrarily it has been selected. Obviously, there are many ways in which examples can be produced, anchors selected, and adjustments made. The better these processes are understood, the sharper the predictions that can be made for heuristic-based judgments.

of precipitation are remarkably accurate (e.g., it rains 70 percent of the times that they forecast a 70 percent chance of rain; Murphy and Winkler 1992). These experts make many judgments under conditions conducive to LEARNING: prompt, unambiguous feedback that rewards them for accuracy (rather than, say, for bravado or hedging). Thus, these judgments may be a learnable cognitive skill. That process may involve using conventional heuristics more effectively or acquiring better ones.

Given the applied interests of decision-making researchers, other explorations of the boundary conditions on suboptimal performance have focused on practical procedures for reducing bias. Given the variety of biases and potential interventions, no simple summary can be comprehensive (Kahneman, Slovic, and Tversky 1982; von Winterfeldt and Edwards 1986). One general trend is that merely warning about bias is not very useful. Nor is teaching statistics, unless direct contact can be made with people’s intuitions.

Mowen, J. C. (1993). Judgment Calls. New York: Simon and Schuster.
Murphy, A. H., and R. L. Winkler. (1992). Approach verification of probability forecasts. International Journal of Forecasting 7: 435–455.
Nisbett, R., Ed. (1993). Rules for Reasoning. Hillsdale, NJ: Erlbaum.
Plous, S. (1993). The Psychology of Judgment and Decision Making. New York: McGraw Hill.
Poulton, E. C. (1995). Behavioral Decision Making.
Hillsdale, NJ: Erlbaum.
Simon, H. (1957). Models of Man: Social and Rational. New York: Wiley.
Svenson, O. (1996). Decision making and the search for fundamental psychological regularities. Organizational Behavior and Human Performance 65: 252–267.
Tversky, A., and D. Kahneman. (1974). Judgment under uncertainty: Heuristics and biases. Science 185: 1124–1131.
Tversky, A., and D. Kahneman. (1971). Belief in the “law of small numbers.” Psychological Bulletin 76: 105–110.
Tversky, A., and D. J. Koehler. (1994). Support theory. Psychological Review 101: 547–567.
Yates, J. F. (1990). Judgement and Decision Making. New York: Wiley.
von Winterfeldt, D., and W. Edwards. (1986). Decision Making and Behavioral Research. New York: Cambridge University Press.

Further Readings

Baron, J. (1994). Thinking and Deciding. 2nd ed. New York: Cambridge University Press.
Bazerman, M., and M. Neale. (1992). Negotiating Rationally. New York: The Free Press.
Berkeley, D., and P. C. Humphreys. (1982).

Making such contact requires an understanding of natural thought processes and plausible alternative ones. As a result, the practical goal of debiasing has fostered interest in basic cognitive processes, in areas such as reasoning, memory, METACOGNITION, and PSYCHOPHYSICS (Nisbett 1993; Svenson 1996; Tversky and Koehler 1994). For example, reliance on availability depends on how people encode and retrieve experiences; any quantitative judgment may draw on general strategies for extracting hints at the right answer from the details of experimental (or real-world) settings (Poulton 1995).

See also DEDUCTIVE REASONING; METAREASONING; RATIONAL DECISION MAKING

-- Baruch Fischhoff

References

Camerer, C. (1995). Individual decision making. In J. Kagel and A. Roth, Eds., The Handbook of Experimental Economics. Princeton, NJ: Princeton University Press.
Cohen, J. (1962). The statistical power of abnormal social psychological research. Journal of Abnormal and Social Psychology
Structuring decision 65: 145–153. problems and the “bias heuristic.” Acta Psychologica 50: Dawes, R. M. (1997). Behavioral decision making, judgment, and 201–252. inference. In D. Gilbert, S. Fiske, and G. Lindzey, Eds., The Dawes, R. (1988). Rational Choice in an Uncertain World. San Handbook of Social Psychology. Boston, MA: McGraw-Hill, Diego, CA: Harcourt Brace Jovanovich. pp. 497–548. Fischhoff, B. (1975). Hindsight ≠ foresight: The effect of outcome Hammond, K. R. (1996). Human Judgment and Social Policy. New York: Oxford University Press. knowledge on judgment under uncertainty. Journal of Experi- Morgan, M. G., and M. Henrion. (1990). Uncertainty. New York: mental Psychology: Human Perception and Performance 104: Cambridge University Press. 288–299. Nisbett, R., and L. Ross. (1980). Human Inference: Strategies and Fischhoff, B., and R. Beyth-Marom. (1983). Hypothesis evaluation Shortcomings of Social Judgment. Englewood Cliffs, NJ: from a Bayesian perspective. Psychological Review 90: 239–260. Prentice-Hall. Gilovich, T. (1991). How We Know What Isn’t So. New York: Free Thaler, R. H. (1992). The Winner’s Curse: Paradoxes and Anoma- Press. lies of Economic Life. New York: Free Press. Gilovich, T., R. Vallone, and A. Tversky. (1985). The hot hand in basketball: On the misperception of random sequences. Journal of Personality and Social Psychology 17: 295–314. Justification Henrion, M., and B. Fischhoff. (1986). Assessing uncertainty in physical constants. American Journal of Physics 54: 791–798. Jungermann, H. (1983). The two camps on rationality. In R. W. Philosophers distinguish between justified and unjustified Scholz, Ed., Decision Making Under Uncertainty. Amsterdam: beliefs. The former are beliefs a cognizer is entitled to hold Elsevier, pp. 63–86. by virtue of his or her evidence or cognitive operations. The Kahneman, D., and A. Tversky. (1972). Subjective probability: A judgment of representativeness. 
Cognitive Psychology 3: 430– latter are beliefs he or she is unwarranted in holding, for 454. example, beliefs based on sheer fantasy, popular superstition, Kahneman, D., P. Slovic, and A. Tversky, Eds. (1982). Judgments or sloppy thinking. Some justified beliefs are based on scien- Under Uncertainty: Heuristics and Biases. New York: Cam- tific findings, but scientific beliefs do not exhaust the class of bridge University Press. justified beliefs. Ordinary perceptual and memorial beliefs, Keren, G. (1991). Calibration and probability judgment. Acta Psy- such as “There is a telephone before me” and “Nixon chologica 77: 217–273. resigned from the Presidency,” are also normally justified. Koriat, A., S. Lichtenstein, and B. Fischhoff. (1980). Reasons for A belief’s justification is usually assumed to be some- confidence. Journal of Experimental Psychology: Human Lear- thing that makes it probable that the belief is true. But justi- ning and Memory 6: 107–118. 426 Justification visually detected components of a stimulus with the fied beliefs are not guaranteed to be true. Even sound component types associated with object categories, such as scientific evidence can support a hypothesis that is actually “cup,” “elephant,” or “airplane.” Object recognition is false. Almost all contemporary epistemologists accept this sometimes produced by partial matching, however, as when form of fallibilism. an occluded or degraded stimulus reveals only a few of its There are three principal approaches to the theory of jus- contours. When partial matching is unreliable, beliefs so tification: foundationalism, coherentism, and reliabilism. formed are unjustified. An example of unreliable reasoning The historical inspiration for foundationalism was RENÉ is the overuse of confirming information and the neglect of DESCARTES (1637), who launched the project of erecting his disconfirming information in drawing covariational con- system of beliefs on solid, indeed indubitable, foundations. 
clusions: “I am convinced you can cure cancer with positive Most contemporary foundationalists reject Descartes’s in- thinking because I know somebody who whipped the Big C sistence on indubitable or infallible foundations, because after practicing mental imagery.” People focus on instances they doubt that there are enough infallible propositions (if that confirm their hypothesis without attending to evidence any) to support the rest of our beliefs. On their view, the that might disconfirm it (Crocker 1982; Gilovich 1991). The core of foundationalism is the notion that justification has a search for evidence can also be biased by desire or vertical structure. Some beliefs are directly or immediately preference. If we prefer to believe that a political justified independently of inference from other beliefs, for assassination was a conspiracy, this may slant our evidence example, by virtue of current perceptual experience. These collection. In one study, subjects were led to believe that beliefs are called basic beliefs, and they comprise the foun- either introversion or extroversion was related to academic dations of a person’s justificational structure. Nonbasic jus- success. Those who were led to believe that introversion tified beliefs derive their justification via reasoning from was predictive of success (a preferred outcome) came to basic beliefs. It has been widely rumored that foundational- think of themselves as more introverted than those who ism is dead, perhaps because few people still believe in in- were led to believe that extroversion was associated with fallible foundations. But most epistemologists regard weak success (Kunda and Sanitioso 1989). versions of foundationalism—those that acknowledge falli- Proponents of foundationalism and coherentism often bility—as still viable. reject the relevance of experimental science to the theory of Coherentism rejects the entire idea of basic beliefs, and justification (e.g., Chisholm 1989). 
But even these types of the image of vertical support that begins at the foundational theories might benefit from psychological research. What level. Instead, beliefs coexist on the same level and provide makes a memory belief about a past event justified or unjus- mutual support for one another. Each member of a belief tified, according to foundationalism? It is the conscious system can be justified by meshing, or cohering, with the memory traces present at the time of belief. Exactly what remaining members of that system. Beliefs get to be justi- kinds of memory traces are available, however, and what fied not by their relationship with a small number of basic kinds of clues do they contain about the veridicality of an beliefs, but by their fit with the cognizer’s total corpus of apparent memory? This question can be illuminated by psy- beliefs. chology. Johnson and Raye (1981) suggest that memory Reliabilism, a theory of more recent vintage, holds (in traces can be rich or poor along a number of dimensions, its simplest form) that a belief is justified if it is produced such as their sensory attributes, the number of spatial and by a sequence of reliable psychological processes (Gold- temporal contextual attributes, and so forth. Johnson and man 1979). Reliable processes are ones that usually output Raye suggest that certain of these dimensions are evidence true beliefs, or at least output true beliefs when taking true that the trace originated from external sources (obtained beliefs as inputs. Perceptual processes are reliable if they through perception) and other of these dimensions are evi- typically output accurate beliefs about the environment, and dence of an internal origin (imagination or thought). A char- reasoning processes are (conditionally) reliable if they out- acterization of these dimensions could help epistemologists put beliefs in true conclusions when applied to beliefs in specify when a memory belief about an allegedly external true premises. 
Some types of belief-forming processes are past event is justified and when it is unjustified (Goldman unreliable, and they are responsible for unjustified beliefs. forthcoming). This might include beliefs formed by certain kinds of biases Epistemological theories of justification sometimes ig- or illegitimate inferential methods. nore computational considerations, which are essential from Because reliabilism’s account of justification explicitly a cognitivist perspective. Coherentism, for example, usually invokes psychological processes, its connection to cognitive requires of a justified belief that it be logically consistent science is fairly straightforward. To determine exactly which with the totality of one’s current beliefs. But is this logical of our beliefs are justified or unjustified, we should determine relationship computationally feasible? Cherniak (1986) which of the belief-forming practices that generate them are argues that even the apparently simple task of checking for reliable or unreliable, and that is a task for cognitive science truth-functional consistency would overwhelm the computa- (Goldman 1986). There is no special branch of cognitive sci- tional resources available to human beings. One way to check ence that addresses this topic, but it is sprinkled across a range for truth-functional consistency is to use the familiar truth- of cognitive scientific projects. table method. But even a supermachine that could check a Although most perceptual beliefs are presumably line in a truth-table in the time it takes a light ray to traverse justified, some applications of our visual object-recognition the diameter of a proton would require 20 billion years to system are fairly unreliable. Biederman’s (1987) account of check a belief system containing only 138 logically indepen- visual object recognition posits a process of matching Kant, Immanuel 427 dent propositions. 
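The scale of the truth-table argument attributed to Cherniak (1986) can be made concrete with a small sketch. The Python below is an illustration only, not Cherniak's own calculation: the function names, the encoding of beliefs as Python functions over a truth assignment, and the physical constants (a proton diameter of roughly 1.7e-15 m, giving a few times 1e-24 seconds per row) are assumptions introduced here. The estimate it prints should therefore be expected to agree with the cited figure only in order of magnitude.

```python
from itertools import product


def is_consistent(beliefs, atoms):
    """Truth-table method: True if some assignment makes every belief true."""
    for values in product([False, True], repeat=len(atoms)):
        world = dict(zip(atoms, values))
        if all(belief(world) for belief in beliefs):
            return True
    return False


def years_to_check(n_atoms, seconds_per_row):
    """Worst-case time to visit all 2**n_atoms rows of the truth table."""
    seconds_per_year = 365.25 * 24 * 3600
    return (2 ** n_atoms) * seconds_per_row / seconds_per_year


# Hypothetical toy belief systems over two independent propositions, p and q.
atoms = ["p", "q"]
consistent_beliefs = [
    lambda w: w["p"] or w["q"],
    lambda w: not w["p"],
]
inconsistent_beliefs = [
    lambda w: w["p"],
    lambda w: (not w["p"]) or w["q"],  # "if p then q"
    lambda w: not w["q"],
]

# Assumed constants (not taken from the source text).
PROTON_DIAMETER_M = 1.7e-15
SPEED_OF_LIGHT_M_PER_S = 2.998e8
SECONDS_PER_ROW = PROTON_DIAMETER_M / SPEED_OF_LIGHT_M_PER_S

if __name__ == "__main__":
    print(is_consistent(consistent_beliefs, atoms))
    print(is_consistent(inconsistent_beliefs, atoms))
    print(f"{years_to_check(138, SECONDS_PER_ROW):.1e} years for 138 propositions")
```

Each additional independent proposition doubles the number of rows, which is why exhaustive checking becomes infeasible long before a belief system reaches a realistic size, whatever the exact speed assumed for the machine.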
Thus, the coherence theory implicitly requires a humanly impossible feat, thereby rendering justification humanly unattainable (Kornblith 1989). Attention to computational feasibility, obviously, should be an important constraint on theories of justification.

See also EPISTEMOLOGY AND COGNITION; INDUCTION; JUDGMENT HEURISTICS; RATIONAL AGENCY

—Alvin I. Goldman

References

Biederman, I. (1987). Recognition-by-components: A theory of human image understanding. Psychological Review 94: 115–147.
Cherniak, C. (1986). Minimal Rationality. Cambridge, MA: MIT Press.
Chisholm, R. (1989). Theory of Knowledge. 3rd ed. Englewood Cliffs, NJ: Prentice Hall.
Crocker, J. (1982). Biased questions in judgment of covariation studies. Personality and Social Psychology Bulletin 8: 214–220.
Descartes, R. (1637). Discourse on Method.
Gilovich, T. (1991). How We Know What Isn’t So. New York: Free Press.
Goldman, A. (1979). What is justified belief? In G. Pappas, Ed., Justification and Knowledge. Dordrecht: Reidel. Reprinted in Goldman (1992).
Goldman, A. (1986). Epistemology and Cognition. Cambridge, MA: Harvard University Press.
Goldman, A. (Forthcoming). Internalism exposed. In M. Steup, Ed., Knowledge, Truth, and Obligation: Essays on Epistemic Responsibility and the Ethics of Belief. New York: Oxford University Press.
Johnson, M., and C. Raye. (1981). Reality monitoring. Psychological Review 88: 67–85.
Kornblith, H. (1989). The unattainability of coherence. In J. Bender, Ed., The Current State of the Coherence Theory. Dordrecht: Kluwer.
Kunda, Z., and R. Sanitioso. (1989). Motivated changes in the self-concept. Journal of Experimental Social Psychology 25: 272–285.

Further Readings

Alston, W. (1989). Epistemic Justification. Ithaca, NY: Cornell University Press.
BonJour, L. (1985). The Structure of Empirical Knowledge. Cambridge, MA: Harvard University Press.
Goldman, A. (1992). Liaisons: Philosophy Meets the Cognitive and Social Sciences. Cambridge, MA: MIT Press.
Harman, G. (1986). Change in View. Cambridge, MA: MIT Press.
Lehrer, K. (1990). Theory of Knowledge. Boulder, CO: Westview Press.
Pollock, J. (1986). Contemporary Theories of Knowledge. Totowa, NJ: Rowman and Littlefield.
Steup, M. (1996). An Introduction to Contemporary Epistemology. Upper Saddle River, NJ: Prentice Hall.

Kant, Immanuel

Immanuel Kant (1724–1804) is perhaps the single most influential figure in the pre-twentieth-century history of cognitive research. He was a devoutly religious man from a very humble background: his father was a saddlemaker. Though one-quarter Scottish (it is said that Kant is a Germanization of Candt), he lived his whole life in Königsberg (now Kaliningrad), just below Lithuania. By his death he was virtually the official philosopher of the German-speaking world.

Until middle age, he was a prominent rationalist in the tradition of Leibniz and Wolff. Then DAVID HUME, as he put it, “awoke me from my dogmatic slumbers.” The critical philosophy ensued. One of its fundamental questions was, what must we be like to have the experiences we have? The view of the mind that Kant developed to answer this question framed virtually all cognitive research until very recently.

Philosophy of mind and knowledge were by no means the only areas in which Kant made seminal contributions. He founded physical geography. (Fieldwork must not have been too important—he is said never to have traveled more than thirty-five miles from Königsberg in his whole life!) His work on political philosophy grounds modern liberal democratic theory. And his deontology put ethics on a new footing, one that remains influential to this day.

It is his view of the mind, however, that influenced cognitive research. Four things in particular shaped subsequent thought:

1. MENTAL REPRESENTATION requires both CONCEPTS and SENSATIONS (percepts). As Kant put it, “concepts without intuitions are empty, intuitions without concepts are blind.” To represent something, we require both acts of judgment and material from the senses to judge. Put another way, to discriminate, we need information; but we also need the ability to discriminate. This doctrine is now orthodoxy in cognitive science.

2. The method of transcendental argument. Kant’s central methodological innovation, transcendental arguments are inferences from phenomena of a certain kind to what must be the case for those phenomena to exist. Applied to mental representations, such arguments are about what must be true of the thing that has those representations. Because this move allows us to infer the unobservable psychological antecedents of observed behavior, it is now central to most experimental cognitive science.

3. The mind as a system of functions. Kant was the first theorist to think of the mind as a system of functions, conceptual functions transforming (“taking”) percepts into representations, at any rate of the modern era (Sellars 1968). (Aristotle may have had the same idea much earlier.) FUNCTIONALISM is by far the most influential philosophy of mind of cognitive science. Even the recent antisententialism of ELIMINATIVE MATERIALISM and CONNECTIONISM still sees the mind as a system of functions. Indeed, Kant’s notorious “noumenalism” about the mind might be simply an early expression of FUNCTIONALISM. Noumenalism is the idea that we cannot know what the mind is like, not even something as basic as whether it is simple or complex. Part of Kant’s argument is that we cannot infer how the mind is built from how it functions: function does not determine form. (The other part is an equally contemporary-sounding rejection of INTROSPECTION. Both arguments occur in his most important treatment of the mind, the chapter attacking rationalism’s paralogisms of pure reason in the Critique of Pure Reason.)

4. Faculties: Kant developed a theory of mental faculties that strongly anticipates Fodor’s (1983) well-known modularity account (see MODULARITY OF MIND).

Kant also developed important ideas about the mind that have not played much of a role in subsequent cognitive research, though perhaps they should have. Two of them concern mental unity.

5. Synthesis: Kant urged that to represent the world, we must perform two kinds of synthesis. First we must synthesize colors, edges, textures, and the like into representations of single objects. Then we must tie the various represented objects together into a single representation of a world. The first kind of synthesis is now studied under the name “binding” (e.g., Treisman and Gelade 1980). The second receives little attention, though it would appear to be equally central to cognition.

6. Unity of CONSCIOUSNESS: The unity of consciousness is our ability to be aware of a great many things at the same time, or better, as parts of a single global representation.

Finally, Kant articulated some striking ideas about:

7. Consciousness and the SELF. The awareness that we have of ourselves is one form of unified consciousness; we are aware of ourselves as the “single common subject” of our representations. But Kant had insights into it that go well beyond that. Ideas he articulated about the peculiar barrenness of one form of consciousness of self and about the referential apparatus that we use to attain it did not reappear until Castañeda (1966) and Shoemaker (1968).

In sum, Kant articulated the view of the mind behind most of cognitive science (see Brook 1994 for further discussion).

See also BINDING PROBLEM; FREUD, SIGMUND

—Andrew Brook

References

Brook, A. (1994). Kant and the Mind. New York: Cambridge University Press.
Castañeda, H.-N. (1966). “He”: A study in the logic of self-consciousness. Ratio 8: 130–157.
Fodor, J. (1983). Modularity of Mind. Cambridge: MIT Press.
Kant, I. (1781/1787). Critique of Pure Reason. London: Macmillan, 1963.
Sellars, W. (1968). Science and Metaphysics. New York: Humanities Press.
Shoemaker, S. (1968). Self-reference and self-awareness. Journal of Philosophy 65(2): 555–567.
Treisman, A., and G. Gelade. (1980). A feature-integration theory of attention. Cognitive Psychology 12: 97–136.

Further Readings

Ameriks, K. (1983). Kant’s Theory of the Mind. Oxford: Oxford University Press.
Brook, A. (1994). Kant and the Mind. New York: Cambridge University Press.
Falkenstein, L. (1996). Kant’s Intuitionism: A Commentary on the Transcendental Aesthetic. Toronto: University of Toronto Press.
Kant, I. (1798). Anthropology from a Pragmatic Point of View. Trans. Mary Gregor. The Hague: Martinus Nijhoff, 1974.
Kitcher, P. (1990). Kant’s Transcendental Psychology. New York: Oxford University Press.

Kinds

See NATURAL KINDS

Knowledge Acquisition

Knowledge acquisition is a phase in the building of knowledge-based or expert systems. Knowledge-based systems are a kind of computer program that applies technical knowledge, or expertise, to PROBLEM SOLVING. Knowledge acquisition involves identifying the relevant technical knowledge, recording it, and getting it into computable form so it can be applied by the problem-solving engine of the expert system. Knowledge acquisition is the most expensive part of building and maintaining expert systems.

The area of expertise represented in the expert system is called the “domain.” For example, a system that diagnoses malfunctioning VCRs and suggests remedies has VCR repair as its domain. An expert VCR repairman would be a “domain expert” for the system. The domain knowledge a developer, or “knowledge engineer,” might acquire includes the types of problems VCRs exhibit, the symptoms a domain expert would look for to figure out the underlying cause of the problem, and the types of repairs that could cure the problem.

One of the big challenges of knowledge acquisition is finding a source of EXPERTISE that can be harvested. Written manuals are typically incomplete and sometimes even misleading. Manuals may contain technical details not actually applied in solving the problem. At the same time, manuals often leave out crucial “tricks” that experts have discovered in the field. The best, most experienced domain experts are generally in high demand and do not have much time for system building. Furthermore, experts may perform very well but have difficulty describing what cues they are responding to and what factors contribute to decisions they make. Very often multiple experts equally skilled in the same field disagree on how to do their job.

Another challenge addressed by approaches to knowledge acquisition is the maintenance of the knowledge. Especially in technical fields dealing with fast-changing product lines, domain knowledge may be extensive and dynamic. A VCR repair system may need to know technical details of the construction of every brand and make of VCR as well as likely failures. New knowledge will need to be added as new VCRs come on the market and technical innovations are incorporated into the products.

Most approaches to knowledge acquisition make the task manageable by identifying the problem-solving method that will be applied by the knowledge-based system. The problem-solving method then identifies what kinds of knowledge the developer should go after and organizes the knowledge so that it is easy to access for maintenance. (For more detail on using structure in knowledge acquisition and building expert systems, see Clancey 1983, 1985; Swartout 1983; Chandrasekaran 1983; Gruber and Cohen 1987; McDermott 1988; Steels 1990; Wielinga, Schreiber, and Breuker 1992.)

For example, in our VCR repair domain, we may decide that, based on symptoms the user describes, the system will classify VCR problems into categories that represent the most probable cause of the problem. The system will then choose a repair based on that suspected problem. What is “probable” will just be based on our domain experts’ experience of what’s worked in the past. If we use this problem-solving method, then the knowledge we need to acquire includes symptom-to-probable-cause associations and probable-cause-to-repair recommendations.

Alternatively, we may decide the system will diagnose the VCR by simulating failures of components on a schematic model until the simulated system shows readings that match the readings on the sick VCR. The system will then suggest replacing the failed components. If we use this problem-solving method, then the knowledge we need to acquire includes an internal model of a VCR identifying components, their dynamic behavior both when healthy and when sick, and their effect on the simulation readings.

A method-based approach to knowledge acquisition extends traditional software engineering for building expert systems (see, for example, Scott, Clayton, and Gibson 1991). During requirements analysis, while developers are figuring out what the system needs to do, they are also figuring out how it is going to do it. Developers interview domain experts and check additional knowledge sources and manuals to see how the expert goes about problem-solving and what types of knowledge are available for application in a method. This information helps them select a method for the system. Establishing a method focuses further specifications and knowledge gathering; the method serves to organize the acquired knowledge into the roles the knowledge plays in the method. (See also Schreiber et al. 1994 for extensive organizational schemes.)

To assist in the knowledge acquisition process, researchers have developed automated tools (Davis 1982; Boose 1984; Musen et al. 1987; Gaines 1987; Marcus 1988). The earliest tools assumed a single problem-solving method and had a user interface dedicated to extracting the knowledge needed by that method. For example, a knowledge-acquisition tool for diagnostic tasks might query a user for symptoms, causes, and the strengths of association between symptoms and causes. Queries would be phrased in a general way so that they could be answered as easily with symptoms of malfunctioning VCRs as with symptoms of sick humans. Such tools might employ specialized interviewing techniques to help get at distinctions experts find difficult to verbalize. They might also analyze the knowledge for completeness and consistency, looking, for example, for symptoms that have no causes, causes without symptoms, or circular reasoning.

Once the domain knowledge is built up, such an interviewing tool is typically used with a “shell” for this kind of diagnosis. A shell is an empty problem-solving engine that knows, for example, what to do with symptoms, causes, and symptom-cause association weights to select a probable cause. Domain knowledge gathered by the interviewing tool is added to the problem-solving shell to produce a functioning expert system.

A further development has been the creation of layered tools that handle multiple problem-solving methods (Musen 1989; Eriksson et al. 1995; Klinker et al. 1990; Runkel and Birmingham 1993). These layered tools help the system-builder select a problem-solving method and then, based on that method, query for domain terms and knowledge, conduct completeness and consistency analyses, and integrate with an appropriate problem-solving shell.

The techniques described so far have mainly focused on interviewing a domain expert in order to encode their knowledge into the system. Knowledge acquisition for expert systems has also benefited from the related field of MACHINE LEARNING (Michalski and Chilausky 1980; Quinlan 1986; Bareiss, Porter, and Murray 1989; Preston, Edwards, and Compton 1994). Machine learning focuses on getting computer programs to learn autonomously from examples or from feedback to their own behavior. Machine learning techniques have been used most effectively to train expert systems that need to make distinctions such as the distinction between symptoms caused by immune-deficiency diseases versus cancers. The knowledge of the domain is acquired through the machine’s experience with training cases rather than by directly encoding instructions from an interview. In domains where the cases and training are available, the knowledge acquisition effort can be quite efficient and effective.

Finally, knowledge acquisition challenges have inspired efforts to get the most mileage out of the knowledge that is so expensive to acquire (Lenat and Guha 1990; Gruber 1991). Many of the method-based acquisition tools represent knowledge in a way that is specific to how it is used by the method. A use-specific representation simplifies the job of getting the knowledge into computable form to match the problem-solving shell. However, knowledge that was acquired to design car electrical systems might also be useful in diagnosing their failures. A use-neutral representation makes it easier to reuse the same knowledge with multiple problem-solving methods, provided there is a mechanism to transform or integrate the knowledge with particular problem-solvers. Large knowledge-base efforts focus on storing knowledge in a use-neutral way and making it available to many knowledge-based systems.

See also DOMAIN SPECIFICITY; KNOWLEDGE-BASED SYSTEMS; SCHEMATA

—Sandra L. Marcus

References

Bareiss, R., B. W. Porter, and K. S. Murray. (1989). Supporting start-to-finish development of knowledge bases. Machine Learning 4: 259–283.
Boose, J. (1984). Personal construct theory and the transfer of human expertise. In Proceedings of the Fourth National Conference on Artificial Intelligence. Austin, Texas.
Chandrasekaran, B. (1983). Towards a taxonomy of problem solving types. AI Magazine 4: 9–17.
Clancey, W. (1983). The advantages of abstract control knowledge in expert system design. In Proceedings of the Third National Conference on Artificial Intelligence. Washington, D.C.
Clancey, W. (1985). Heuristic classification. Artificial Intelligence 27: 289–350.
Davis, R. (1982). TEIRESIAS: Applications of meta-level knowledge. In R. Davis and D. Lenat, Eds., Knowledge-Based Systems in Artificial Intelligence. New York: McGraw-Hill.
Eriksson, H., Y. Shahar, S. W. Tu, A. R. Puerta, and M. A. Musen. (1995). Task modeling with reusable problem-solving methods. In International Journal of Expert Systems: Research and Applications 9.
Gaines, B. R. (1987). An overview of knowledge acquisition and transfer. International Journal of Man-Machine Studies 26: 453–472.
Gruber, T. R. (1991). The Role of Common Ontology in Achieving Sharable, Reusable Knowledge Bases. San Mateo, CA: Morgan Kaufmann.
Gruber, T. R., and P. Cohen. (1987). Design for acquisition: Principles of knowledge-system design to facilitate knowledge acquisition. International Journal of Man-Machine Studies 26: 143–159.
Klinker, G., C. Bhola, G. Dallemagne, D. Marques, and J. McDermott. (1990). Usable and Reusable Programming Constructs. Fifth Banff Knowledge Acquisition for Knowledge-Based Systems Workshop. Banff, Alberta, Canada.
Lenat, D. B., and R. V. Guha. (1990). Building Large Knowledge-Based Systems. Reading, MA: Addison-Wesley.
Marcus, S., Ed. (1988). Automating Knowledge Acquisition for Expert Systems. Boston: Kluwer Academic.
McDermott, J. (1988). Preliminary steps toward a taxonomy of problem-solving methods. In S. Marcus, Ed., Automating Knowledge Acquisition for Expert Systems. Boston: Kluwer Academic.
Michalski, R. S., and R. L. Chilausky. (1980). Learning by being told and learning from examples: An experimental comparison of the two methods of knowledge acquisition in the context of developing an expert system for soybean disease diagnosis. Policy Analysis and Information Systems 4: 125–160.
Musen, M. A. (1989). Knowledge acquisition at the metalevel: Creation of custom-tailored knowledge acquisition tools. In C. R. McGraw and K. L. McGraw, Eds., Special Issue of ACM SIGART Newsletter on Knowledge Acquisition, no. 108. April: 45–55.
Musen, M. A., L. M. Fagan, D. M. Combs, and E. H. Shortliffe. (1987). Using a domain model to drive an interactive knowledge editing tool. International Journal of Man-Machine Studies 26: 105–121.
Preston, P., G. Edwards, and P. Compton. (1994). A 2000 rule expert system without knowledge engineers. In Proceedings of the 8th Banff Knowledge Acquisition for Knowledge-Based Systems Workshop. Banff, Alberta, Canada.
Quinlan, J. R. (1986). Induction of decision trees. Machine Learning 1: 81–106.
Runkel, J. T., and W. P. Birmingham. (1993). Knowledge acquisition in the small: Building knowledge-acquisition tools from pieces. In Knowledge Acquisition 5(2), 221–243.
Schreiber, A. T., B. J. Wielinga, R. de Hoog, H. Akkermans, and W. van de Velde. (1994). CommonKADS: A comprehensive methodology for KBS development. IEEE Expert, December: 28–37.
Scott, A. C., J. E. Clayton, and E. L. Gibson. (1991). A Practical Guide to Knowledge Acquisition. Reading, MA: Addison-Wesley.
Steels, L. (1990). Components of expertise. AI Magazine 11: 28–49.
Swartout, W. (1983). XPLAIN: A system for creating and explaining expert consulting systems. Artificial Intelligence 21: 285–325.
Wielinga, B. J., A. T. Schreiber, and A. J. Breuker. (1992). KADS: A modeling approach to knowledge engineering. Knowledge Acquisition 4: 1–162.

Further Readings

Chandrasekaran, B. (1986). Generic tasks in knowledge-based reasoning: High-level building blocks for expert system design. IEEE Expert 1: 23–29.
Quinlan, J. R., Ed. (1989). Applications of Expert Systems. London: Addison Wesley.

Knowledge-Based Systems

Knowledge-based systems (KBS) is a subfield of artificial intelligence concerned with creating programs that embody the reasoning expertise of human experts. In simplest terms, the overall intent is a form of intellectual cloning: find persons with a reasoning skill that is important and rare (e.g., an expert medical diagnostician, chess player, chemist), talk to them to determine what specialized knowledge they have and how they reason, then embody that knowledge and reasoning in a program.

The undertaking is distinguished from AI in general in several ways. First, there is no claim of breadth or generality; these systems are narrowly focused on specific domains of knowledge and cannot venture beyond them. Second, the systems are often motivated by a combination of science and application on real-world tasks; success is defined at least in part by accomplishing a useful level of performance on that task.

Third, and most significant, the systems are knowledge based in a technical sense: they base their performance on the accumulation of a substantial body of task-specific knowledge. AI has examined other notions of how intelligence might arise. GPS (Ernst 1969), for example, was inspired by the observation that people can make some progress on almost any problem we give them, and it depended for its power on a small set of very general problem-solving methods. It was in that sense methods based.

Knowledge-based systems, by contrast, work because of what they know and in this respect are similar to human experts. If we ask why experts are good at what they do, the answer will contain a variety of factors, but to some significant degree it is that they simply know more. They do not think faster, nor in fundamentally different ways (though practice effects may produce shortcuts); their expertise arises because they have substantially more knowledge about the task.

Human EXPERTISE is also apparently sharply domain-specific. Becoming a chess grandmaster does not also make one an expert physician; the skill of the master chemist does not extend to automobile repair. So it is with these programs.

The systems are at times referred to as expert systems and the terms are informally used interchangeably. “Expert system” is however better thought of as referring to the level of aspiration for the system. If it can perform as well as an expert, then it can play that role; this has been done, but is relatively rare. More commonly the systems perform as intelligent assistants, making recommendations for a human to review. Speaking of them as expert systems thus sets too narrow a perspective and restricts the utility of the technology.

Although knowledge-based systems have been built with a variety of representation technologies, two common architectural characteristics are the distinction between inference engine and knowledge base, and the use of declarative style representations. The knowledge base is the system’s repository of task-specific information; the inference engine is a (typically simple) interpreter whose job is to retrieve relevant knowledge from the knowledge base and apply it to the problem at hand. A declarative representation is one that aims to express knowledge in a form relatively independent of the way in which it is going to be used; predicate calculus is perhaps the premier example.

The separation of inference engine and knowledge base, along with the declarative character of the knowledge, facilitates the construction and maintenance of the knowledge base. Developing a knowledge-based system becomes a task of debugging knowledge rather than code; the question is, what should the program know, rather than what should it do.

Three systems are frequently cited as foundational in this area: SIN (Moses 1967) and its descendant MACSYMA (Moses 1971), DENDRAL (Feigenbaum et al. 1971), and MYCIN (Davis 1977). SIN’s task domain was symbolic integration. Although its representation was more procedural than would later become the norm, the program was one of the first embodiments of the hypothesis that problem solving power could be based on knowledge.

all of the hallmarks that came to be associated with knowledge-based systems and as such came to be regarded as a prototypical example. Its knowledge was expressed as a set of some 450 relatively independent if/then rules; its inference engine used a simple backward-chaining control structure; and it was capable of explaining its recommendations (by showing the user a recap of the rules that had been used).

The late 1970s and early 1980s saw the construction of a wide variety of knowledge-based systems for tasks as diverse as diagnosis, configuration, design, and tutoring. The decade of the 1980s also saw an enormous growth in industrial interest in the technology. One of the most successful and widely known industrial systems was XCON (for expert configurer), a program used by Digital Equipment Corporation (DEC) to handle the wide variation in the ways a DEC VAX computer could be configured. The system’s task was to ensure that an order had all the required components and no superfluous ones. This was a knowledge-intensive task: factors to be considered included the physical layout of components in the computer cabinet, the electrical requirements of each component, the need to establish interrupt priorities on the bus, and others. Tests of XCON demonstrated that its error rate on the task was below 3 percent, which compared quite favorably to human error rates in the range of 15 percent (Barker and O’Connor 1989). Digital has claimed that over the decade of the
It stands in 1980s XCON and a variety of other knowledge-based sys- stark contrast to SAINT (Slagle 1963), the first of the sym- tems saved it over one billion dollars. Other commercial knowledge-based systems of pragmatic consequence were bolic integration programs, which was intended by its constructed by American Express, Manufacturer’s author to work by tree search, that is, methodically explor- Hanover, duPont, Schlumberger, and others. ing the space of possible problem transformations. SIN, on The mid-1980s also saw the development of BAYESIAN the other hand, claimed that its goal was the avoidance of search, and that search was to be avoided by knowing what NETWORKS (Pearl 1986), a form of KNOWLEDGE REPRESEN- to do. It sought to bring to bear all of the cleverness that TATION grounded in probability theory, that has recently seen good integrators used, and attempted to do so using its most wide use in developing a number of successful expert sys- powerful techniques first. Only if these failed would it even- tems (see, e.g., Heckerman, Wellman, and Mamdani 1995). tually fall back on search. Work in knowledge-based systems is rooted in observa- DENDRAL’s task was analytic chemistry: determining tions about the nature of human intelligence, viz., the obser- the structure of a chemical compound from a variety of vation that human expertise of the sort involved in explicit physical data about the compound, particularly its mass reasoning is typically domain specific and dependent on a spectrum (the way in which the compound fragments when large store of task specific knowledge. Use of simple if/then subjected to ionic bombardment). DENDRAL worked by rules—production rules—is drawn directly from the early generate and test, generating possible structures and testing work of Newell and Simon (1972) that used production them (in simulation) to see whether they would produce the rules to model human PROBLEM SOLVING. mass spectrum observed. 
The combinatorics of the problem The conception of knowledge-based systems as attempts quickly become unmanageable: even relatively modest to clone human reasoning also means that these systems sized compounds have tens of millions of isomers (different often produce detailed models of someone’s mental con- ways in which the same set of atoms can be assembled). ception of a problem and the knowledge needed to solve it. Hence in order to work at all, DENDRAL’s generator had to Where other AI technologies (e.g., predicate calculus) are be informed. By working with the expert chemists, DEN- more focused on finding a way to achieve intelligence DRAL’s authors were able to determine what clues chemists without necessarily modeling human reasoning patterns, found in the spectra that permitted them to focus their knowledge-based systems seek explicitly to capture what search on particular subclasses of molecules. Hence DEN- people know and how they use it. One consequence is that DRAL’s key knowledge was about the “fingerprints” that the effort of constructing a system often produces as a side different classes of molecules would leave in a mass spec- effect a more complete and explicit model of the expert’s trum; without this it would have floundered among the mil- conception of the task than had previously been available. lions of possible structures. The system’s knowledge base thus has independent value, MYCIN’s task was diagnosis and therapy selection for a apart from the program, as an expression of one expert’s variety of infectious diseases. It was the first system to have mental model of the task and the relevant knowledge. 432 Knowledge Compilation MYCIN and other programs also provided one of the O’Leary, D. E. (1994). Verification and validation of intelligent systems: Five years of AAAI workshops. International Journal early and clear illustrations of Newell’s concept of the of Intelligent Systems 9(8): 953–957. 
knowledge level (Newell 1982), that is, the level of abstrac- Schreiber, G., B. Wielinga, and J. Breuker, Eds. (1993). KADS: A tion of a system (whether human or machine) at which one Principled Approach to Knowledge–Based System Develop- can talk coherently about what it knows, quite apart from ment. New York: Academic Press. the details of how the knowledge is represented and used. Shrobe, H. E., Ed. (1988). Exploring Artificial Intelligence. San These systems also offered some of the earliest evidence Mateo, CA: Morgan Kaufmann. that knowledge could obviate the need for search, with DEN- DRAL in particular offering a compelling case study of the Knowledge Compilation power of domain specific knowledge to avoid search in a space that can quickly grow to hundreds of millions of choices. See also AI AND EDUCATION; DOMAIN SPECIFICITY; See EXPLANATION-BASED LEARNING; METAREASONING FRAME-BASED SYSTEMS; KNOWLEDGE ACQUISITION Knowledge Representation —Randall Davis Knowledge representation (KR) refers to the general topic References of how information can be appropriately encoded and uti- Barker, V., and D. O’Connor. (1989). Expert systems for configu- lized in computational models of cognition. It is a broad, ration at Digital: XCON and beyond. Communications of the rather catholic field with links to logic, computer science, ACM March: 298–318. cognitive and perceptual psychology, linguistics, and other Davis, R. (1977). Production rules as a representation for a parts of cognitive science. Some KR work aims for psycho- knowledge–based consultation program. Artificial Intelligence logical or linguistic plausibility, but much is motivated more 8:15–45. by engineering concerns, a tension that runs through the Ernst, G. W. (1969). GPS: A Case Study in Generality and Prob- entire field of AI. KR work typically ignores purely philo- lem Solving. New York: Academic Press. sophical issues, but related areas in philosophy include anal- Feigenbaum, E. A., B. G. 
Buchanan, and J. Lederberg. (1971). On yses of mental representation, deductive reasoning and the generality and problem solving: A case study using the DEN- “language of thought,” philosophy of language, and philo- DRAL program. In B. Meltzer and D. Michie, Eds., Machine Intelligence 6, pp. 165–189. sophical logic. Heckerman, D., M. Wellman, and A. Mamdani, Eds. (1995). Real Typically, work in knowledge representation focuses world applications of Bayesian networks. Communications of either on the representational formalism or on the informa- the ACM, vol. 38, no. 3. tion to be encoded in it, sometimes called knowledge Moses, J. (1967). Symbolic Integration. Ph.D. diss., Massachusetts engineering. Although many AI systems use ad-hoc repre- Institute of Technology. sentations tailored to a particular application, such as digital Moses, J. (1971). Symbolic integration: The stormy decade. Com- maps for robot navigation or graphlike story scripts for lan- munications of the ACM 14: 548–560. guage comprehension, much KR work is motivated by the Newell, A. (1982). The knowledge level. Artificial Intelligence perceived need for a uniform representation, and the intu- 18(1): 87–127. ition that because human intelligence can rapidly draw Newell, A., and H. A. Simon. (1972). Human Problem Solving. Englewood Cliffs, NJ: Prentice-Hall. appropriate conclusions, KR should seek conceptual frame- Pearl, J. (1986). Fusion, propagation, and structuring in belief net- works in which these conclusions have short derivations. works. Artificial Intelligence (29)3: 241–288. The philosophical integrity or elegance of this assumed Slagle, J. (1963). A heuristic program that solves symbolic integra- framework is less important than its practical effectiveness; tion problems in freshman calculus. In E. A. Feigenbaum and J. for example, Jerry Hobbs (1985) urges a principle of onto- Feldman, Eds., Computers and Thought, pp. 191–206. logical promiscuity in KR. 
The central topic in knowledge engineering is to iden- Further Readings tify an appropriate conceptual vocabulary; a related collec- tion of formalized concepts is often called an ontology. For Bachant, J., and J. McDermott. (1984). R1 revisited: Four years in example, temporal or dynamic knowledge is often repre- the trenches. Artificial Intelligence Magazine 21–32. sented by describing actions as functions on states of the Buchanan, B. G., and E. H. Shortliffe. (1984). Rule-Based Expert world, using axioms to give sufficient conditions for the Systems. Reading, MA: Addison-Wesley. Clancey, W., and E. Shortliffe. (1984). Readings in Medical Artifi- success of the action, and then using logical reasoning to cial Intelligence. Reading, MA: Addison-Wesley. prove constructively that a state exists that satisfies a goal. Davis, E. (1990). Representations of Commonsense Knowledge. The name of the final state then provides a “plan” of the San Mateo, CA: Morgan Kaufmann. actions necessary to achieve the goal, such as: Davis, R., and D. Lenat. (1982). Knowledge-Based Expert Sys- drink(move(mouth,pickup(cup,start-state))). Though useful, tems. New York: McGraw-Hill. this ontology has several stubborn difficulties, notably the Lenat, D. B., and R. V. Guha. (1990). Building Large Knowledge- FRAME PROBLEM, that is, how compactly to state what Based Systems. Reading, MA: Addison-Wesley. remains unchanged by an action. (For example, picking up Lockwood, S., and Z. Chen. (1995). Knowledge validation of engi- something from a table obviously leaves the table in the neering expert systems. Advances in Engineering Software same place and doesn’t change the color of anything, but 23(2): 97–104. Knowledge Representation 433 because the state has changed this needs to be made edge” is seen as a reservoir of useful information rather explicit; and some actions do have such side effects, so the than as supporting a model of cognitive activity. 
Here possibility cannot be ruled out on logical grounds.) Many action planning is often unnecessary, and the ontology solutions to the frame problem have been proposed, but fixed by the particular application—for example, medical none are fully satisfactory. Other approaches divide the diagnosis or case law—but issues of scale become impor- world into objects with spatial and temporal boundaries tant. The contents of such systems can often be thought of (Hayes 1985), or use transformations on state descriptions either as sentential knowledge or as program code; one to model actions more directly. Several areas of knowl- school of thought in KR regards this distinction as essen- edge engineering have received detailed attention, notably tially meaningless in any case. More recently, increased intuitive physical knowledge, often called qualitative phys- available memory size has made it feasible to use “com- ics (Davis 1990; Weld and de Kleer 1990). pute-intensive” representations that simply list all the par- KR formalisms need a precisely defined SYNTAX, a use- ticular facts rather than stating general rules. These allow ful SEMANTICS, and a computationally tractable inference the use of statistical techniques such as Markov simula- procedure. A wide variety have been studied. Typical fea- tion, but seem to abandon any claim to psychological plau- tures include a notation for describing concept hierarchies sibility. and mechanisms to maintain property inheritance; an ability Commercial use of knowledge bases and a proliferation to check for, and correct, propositional inconsistency in the of Krep systems with various ad-hoc syntactic restrictions light of new information (”truth-maintenance”; see Forbus has created a need for “standard” or “interchange” formal- and de Kleer 1993); and ways of expressing a “closed isms to allow intertranslation. 
These include the Knowledge world” assumption, that is, that a representation contains all Interchange Format, a blend of first-order set theory and facts of a certain kind (so if one is omitted it can be assumed LISP (Genesereth et al. 1992) and conceptual graphs, a to be false). graphical notation inspired by C. S. Peirce (Sowa 1997). Many notations are inspired by sentential logics, some by There is also considerable interest in compiling standard semantic networks, others, often called “frame-based,” re- ontologies for commonly used concepts such as temporal semble object-oriented programming languages. Most of relations or industrial process control (see the “ontology these can be regarded as syntactic variations on subsets of page” http://mnemosyne.itc.it:1024/ontology.html for the first-order relational logic, sometimes extended to allow current state of research in this area). Bayesian probabilistic inference, fuzzy reasoning, and other Although there has been a great deal of work on KR, much ways to express partial or uncertain information or degrees intuitive human knowledge still resists useful formalization. of confidence. Many also use some form of default or non- Even such apparently straightforward areas as temporal and monotonic reasoning, allowing temporary assumptions to be spatial knowledge are still subjects of active research, and the cancelled by later or more detailed information. For exam- knowledge involved in comprehending simple stories or ple, if told that something is an elephant one can infer that it understanding simple physical situations is still yet to be ade- is a mammal, but this inference would be withdrawn given quately formalized. Many ideas have been developed in NL the further information that it is a toy elephant. 
More research, such as “scripts” or idealized story-frameworks and recently there has been considerable interest in diagram- the use of a limited number of conceptual primitive catego- matic representations that are supposed to represent by ries, but none have achieved unqualified success. being directly similar to the subject being represented Early work in AI and cognitive science assumed that (Glasgow, Narayan, and Chandrasekharan 1995). These are suitably represented information must be central in a sometimes claimed to be plausible models of mental IMAG- proper account of cognition, but more recently this assump- tion has been questioned, notably by “connectionist” and ERY, but Levesque and Brachman (1985) point out that a set “situated” theories. Connectionism seeks to connect cogni- of ground propositions together with a closed-world tive behavior directly to neurally inspired mechanisms, assumption has many of the functional properties of a men- whereas situated theories focus on how behavior emerges tal image. from interaction with the environment (Brooks 1991). Both A central issue for KR formalisms is the tradeoff were initially seen as in direct opposition to knowledge- between expressive power and deductive complexity. At one representation ideas, but reconciliations are emerging. In extreme, propositional logic restricted to Horn clauses (dis- particular, phase-encoding seems to enable connectionist junctions of atomic propositions with at most one negation) networks to perform quite sophisticated logical reasoning. admits a very efficient decision procedure but cannot ex- The single most important lesson to emerge from these press any generalizations; at another, full second-order logic controversies is probably that the representation of knowl- is capable of expressing most of mathematics but has no edge cannot be completely isolated from its hypothesized complete inference procedure. Most KR formalisms adopt functions in cognition. 
various compromises. Network and frame-based formalisms often gain deductive efficiency by sacrificing the ability to See also COGNITIVE MODELING, CONNECTIONIST; COGNI- express arbitrary disjunctions. Description logics (Borgida TIVE MODELING, SYMBOLIC; COMPUTATIONAL THEORY OF et al. 1989) guarantee polynomial-time decision algorithms MIND; CONNECTIONISM, PHILOSOPHICAL ISSUES; KNOWL- by using operators on concept descriptions instead of quan- EDGE ACQUISITION; LOGICAL REASONING SYSTEMS; NEURAL tifiers. NETWORKS; PROBLEM SOLVING Commercial applications use knowledge representation —Patrick Hayes as an extension of database technology, where the “knowl- 434 Language Acquisition acquisition as a testbed for exploring and contrasting theo- References ries of learning, development, and representation. Neither Borgida, A., R. J. Brachman, D. L. McGuinness, and A. L. the natural communication systems of infrahumans nor the Resnick. (1989). CLASSIC: A structural data model for outcomes for apes specially tutored in aspects of spoken or objects. SIGMOD Record 18(2): 58–67. signed systems approach in content or formal complexity Brooks, R. A. (1991). Intelligence without representation. Artifi- the achievements of the most ordinary 3-year-old human cial Intelligence 47(1–3): 139–160. (see also PRIMATE LANGUAGE). Because children are the Davis, E. (1990). Representations of Common-Sense Knowledge. only things (living or nonliving) that are capable of this Stanford: Morgan Kaufmann. learning, computer scientists concerned with simulating the Forbus, K., and J. de Kleer. (1993). Building Problem Solvers. process study language acquisition for much the same rea- Cambridge, MA: MIT Press. Genesereth, M., and R. E. Fikes. (1992). Knowledge Interchange son that Leonardo Da Vinci, who was interested in building Format Reference Manual. Stanford University Logic Group, a flying machine, chose to study birds. report 92-1, Stanford University. 
Language acquisition begins at birth, if not earlier. Chil- Glasgow, J., N. H. Narayan, and B. Chandrasekharan, Eds. (1995). dren only a few days old can discriminate their own lan- Diagrammatic Reasoning: Cognitive and Computational Per- guage from another, presumably through sensitivity to spectives. Cambridge, MA: AAAI/MIT Press. language-specific properties of prosody and phonetic pat- Hayes, P. (1985). Naive Physics I: Ontology for liquids. In Hobbs terning. In the first several months of life, they discriminate and Moore (1985), pp. 71–107. among all known phonetic contrasts used in natural lan- Hobbs, J. R. (1985). Ontological promiscuity. Proc. 23rd guages, but this ability diminishes over time such that by Annual Meeting of the Association for Computational Lin- about 12 months, children distinguish only among the con- guistics 61–69. Levesque, H., and R. Brachman. (1985). A fundamental tradeoff in trasts made in the language they are exposed to. Between knowledge representation and reasoning. In Brachman and about the seventh and tenth month, infants begin reduplica- Levesque (1985). tive babbling, producing sounds such as “baba” and “gaga” Sowa, J. F. (1997). Knowledge Representation: Logical, Philo- (see also PHONOLOGY, ACQUISITION OF). At this age, deaf sophical and Computational Foundations. Boston: PWS. children too begin to babble—with their hands—producing Weld, D., and J. de Kleer, Eds. (1990). Readings in Qualitative repetitive sequences of sign-language formatives that Reasoning about Physical Systems. San Francisco: Morgan resemble the syllabic units of vocal babbling. In general, the Kaufmann. acquisition sequence does not seem to differ for spoken and signed languages, suggesting that the language-learning Further Readings capacity is geared to abstract linguistic structure, not simply Brachman, R., and H. Levesque. (1985). Readings in Knowledge to speech (see SIGN LANGUAGES). Representation. Stanford: Morgan Kaufmann. 
Comprehension of a few words has been demonstrated as Hobbs, J., and R. Moore, Eds. (1985). Formal Theories of the early as 9 months, and first spoken words typically appear Common-Sense World. Norwood, NJ: Ablex. between 12 and 14 months. The most common early words McCarthy, J., and V. Lifschitz, Eds. (1990). Formalizing Common are names for individuals (Mama), objects (car), and sub- Sense. Norwood, NJ: Ablex. stances (water); these nominal terms appear early in lan- Russell, S., and P. Norvig. (1995). Artificial Intelligence; A Mod- guage development for children in all known cultures. Other ern Approach. Englewood Cliffs, NJ: Prentice-Hall. common components of the earliest vocabulary are animal names, social terms such as bye-bye and—of course—no. Language Acquisition Verbs and adjectives as well as members of the functional categories (such as the, -ed, is, and with) are rare in the early Language acquisition refers to the process of attaining a vocabulary compared to their frequency in the input corpus. specific variant of human language, such as English, At this initial stage, new words appear in speech at the rate Navajo, American Sign Language, or Korean. The funda- of about two or three a week and are produced in isolation mental puzzle in understanding this process has to do with (that is, in “one word sentences”). The rate of vocabulary the open-ended nature of what is learned: children appropri- growth increases and so does the character of the vocabu- ately use words acquired in one context to make reference in lary, with verbs and adjectives being added and functional the next, and they construct novel sentences to make known morphemes beginning to appear. This change in growth rate their changing thoughts and desires. 
In light of the creative and lexical class typology coincides with the onset of syntax nature of this achievement, it is striking that close-to-adult (for further discussion and references, see WORD MEANING, proficiency is attained by the age of 4–5 years despite large ACQUISITION OF and SYNTAX, ACQUISITION OF). differences in children’s mentalities and motivations, the The most obvious sign of early syntactic knowledge is circumstances of their rearing, and the particular language canonical phrasal order: Children’s speech mirrors the to which they are exposed. Indeed, some linguists have canonical sequence of phrases (be it Subject-Verb-Object, argued that the theoretical goal of their discipline is to Verb-Subject-Object, etc.) of the exposure language as soon explain how children come to have knowledge of language as they begin to combine words at all (Pinker 1984). through only limited and impoverished experience of it in English-speaking children’s first word combinations were the speech of adults (i.e., “Plato’s problem”; Chomsky originally called “telegraphic” by investigators (Brown and 1986). For closely related reasons, philosophers, evolution- Bellugi 1964) because they are short and because they lack ary biologists, and psychologists have long used language most function words and morphemes, giving the speech the Language Acquisition 435 minimalist flavor of telegrams and classified ads. But more plenty of evidence that the abused and neglected children of recent investigation shows this characterization to be inade- the world, regardless of their other difficulties, adequately quate, in part because this pattern of early speech is not uni- acquire the language of their communities. versal. In languages where closed-class morphemes are Because speech to children is almost perfectly grammati- stressed and syllabic, they appear before age 2 (Slobin cal and meaningful, it seems to offer a pretty straightfor- 1985). ward model for acquisition. 
Yet in principle such a “good Another problem with calling children’s language “tele- sample” of, say, English, is a limited source of information graphic” is that this characterization underestimates the for of necessity it doesn’t explicitly expose the full range of extent of child knowledge. There is significant evidence that structure and content of the language. Notice, as an exam- infants’ knowledge of syntax, including the forms and ple, that although an adjective can appear in two structural semantic roles played by functional elements, is radically in positions in certain otherwise identical English sentences advance of their productive speech. One source of evidence (e.g., Paint the red barn and Paint the barn red), this is not is preferential looking behavior in tasks where a heard sen- always so: Woe to the learner who generalizes from See the tence matches only one of two visually displayed scenes. red barn to See the barn red. It has been proposed, there- Such methods have shown, for example, that appreciation of fore, that “negative evidence”—information about which word order and its semantic implications are in place well sentences are ill-formed in some way—might be crucial for before the appearance of two-word speech. For example, deducing the true nature of the input language. Such infor- infants will look primarily to the appropriate action video in mation might be available in parents’ characteristic reac- response to hearing Big Bird tickles Cookie Monster versus tions to the child’s ill-formed early speech, thus providing Cookie Monster tickles Big Bird (Hirsh-Pasek and Golinkoff statistical evidence as to right and wrong ways to speak the 1996) and show a rudimentary understanding of the seman- language. However, a series of studies beginning with tic implications of functional elements (Snedeker 1996). 
Another compelling source of evidence for underlying syntactic knowledge is that analyses of the relative positioning of subjects, negative words, and verbs, and their interaction with verb morphology, make clear that children have significant implicit knowledge of functional projections; for example, they properly place negative words with respect to finite versus infinitive forms of the verb. That is, the French toddler regularly says “mange pas” but “pas manger” (Deprez and Pierce 1993).

By the age of 3 years or before, the “telegraphic” stage of speech ends and children’s utterances increase in length and complexity. For instance, at age 3 and 4 we hear such constructions as inverted yes/no questions (Is there some food over there?), relative clauses (the cookie that I ate), and control structures (I asked him to go). Production errors at this stage are largely confined to morphological regularizations (e.g., goed for went) and syntactic overextensions (She giggled the baby for She made the baby giggle; see Bowerman 1982), though even these errors are rare (Marcus et al. 1992).

One important commitment that all acquisition theories make is to the sort of input that is posited to be required by the learner. After all, the richer and more informative the information received, the less preprogrammed capacity and inductive machinery the child must supply “from within.” Adults tend to communicate with their offspring using slow, high-pitched speech with exaggerated intonation contours, a relatively simple and restricted vocabulary, and short sentences. Although whimsically dubbed “Motherese” (Newport, Gleitman, and Gleitman 1977), this style of speech is characteristic of both males and females, and even of older children when talking to younger ones. DARWIN, who was interested in language learning and kept diaries of his children’s progress, called this style of speech “the sweet music of the species.” Infants resonate to it, strongly preferring Motherese to the adult-to-adult style of speech (Fernald and Kuhl 1987). Although it is possible that these apparent adult simplifications might facilitate aspects of learning, there is

Brown and Hanlon (1970) have demonstrated that there is little reliable correlation between the grammaticality of children’s utterances and the sorts of responses to these that their parents give, even for children raised in middle-class American environments. Moreover, children learn language in cultures in which nobody speaks to infants until the infants themselves begin to talk (Lieven 1994).

As mentioned above, neonates can detect and store at least some linguistic elements and their patterning from minimal distributional information (Saffran, Aslin, and Newport 1996). Recent computational simulations suggest that certain lexical category, selectional, and syntactic properties of the language can be gleaned from such patterns in text (e.g., Cartwright and Brent 1997). Most theories of language development, following Chomsky (1965) and Macnamara (1972), assume that children also have access to some nonlinguistic encoding of the context. That is, they are more likely to hear hippopotamus in the presence of hippos than of aardvarks, and The cat is on the mat is, other things being equal, a more likely sentence in the presence of cats and mats than in their absence. The effects of such contextual support are obvious for learning the meanings of words, but they are likely to be critical as well for the acquisition of syntax (Bloom 1994; Pinker 1984).

Whatever the detailed cause-and-effect relations between input properties and learning functions, it seems a truism that the variant of human language attained by young children is that modeled by the adult community of language users. Yet even in this regard, there are significant exceptions. For example, isolated deaf children not exposed to signed languages spontaneously generate gestural systems that share many formal and substantive features with the received languages (e.g., Feldman, Goldin-Meadow, and Gleitman 1978; see also SIGN LANGUAGES). Similarly, children whose linguistic exposure is to “pidgin” languages (simple contact systems with little grammatical structure) elaborate and regularize these in the course of learning them (Bickerton 1981; Newport 1990; see CREOLES). Much of language change can be explained as part of this process of regularization and elaboration of received systems by children. In a very real sense, then, children do not merely learn language; they create it.

The acquisition properties just described sketch the known facts about this process as it unfolds in young children, under all the myriad personal, cultural, and linguistic circumstances into which they are born. Acquisition is astonishingly “robust” to environmental circumstance. This finding is at the heart of theorizing that assigns the child’s acquisition in large part to internal, biological, predispositions that can correct for adventitious properties of the environment. One more stunning indication of the merit of such a view is that young children simultaneously exposed to two, or even three, languages in infancy and early childhood acquire them all as rapidly and systematically as the child in monolingual circumstances acquires one language. That is, if the child in a bilingual home sleeps as much as her monolingual cousin, she necessarily hears a smaller sample of each of the languages; yet attainment of each is at age-level under conditions where both languages are in regular use in the child’s immediate environment.

If it is true that biologically “prepared” factors in young children significantly support and constrain the learning process, then learning might look different when those biological factors vary. And indeed it does. The most obvious first case to inspect is that of learners who come into a language-learning situation at a later brain-maturational state. In contrast to the almost universal attainment of a high level of language knowledge by humans exposed to a model in infancy or early childhood stand the sharply different results for such late learners. In the usual case, these are second-language learners, and the level of their final attainment is inferior to that of infant learners as a direct negative function of maturational status at first exposure. However, the same generalizations apply to acquisition of a primary language late in life (Newport 1990). Apparently, these increasing decrements in language-acquisition skills over maturational time apply both to learning idiosyncratic details of the exposure language and to language universals, properties common across the languages of the world. Such “critical period” or “sensitive period” effects, though undeniable on the obtained evidence, cannot by themselves reveal just what is being lost or diminished in the late learner: This could be some aspects of learning specific to language itself, general capacities for structured cognitive learning, or some combination of the two (for discussion of the complex interweave of nature and nurture in accounting for critical period effects, see Marler 1991).

A large and informative literature considers tragedies of nature, populations of learners with conceptually or linguistically relevant deficits. The effect of these studies, taken as a whole, is to demonstrate, first, that linguistic and general-cognitive capacities are often dissociated in pathology. For example, in Williams syndrome (Bellugi et al. 1988), language-learning abilities are virtually unscathed but these children’s IQs are severely below normal; in SLI (Specific Language Impairment; van der Lely 1997), children with normal and even superior IQs exhibit defects in the timing and content of language acquisition. The second overall finding from these unusual populations is that language deficiencies are not across-the-board but may affect only a single aspect. For example, both in SLI (Rice and Wexler 1996) and in Down’s syndrome (Fowler 1990), word learning and some aspects of syntax (e.g., word order) are adequate, but there is a specific deficit for functional projections (i.e., knowledge of the interaction of verb morphology and syntactic structure).

To many theorists the bottom-line message from joint investigation of normal and unusual learning is this: Language acquisition is little affected by the ambiance in which learners find themselves, so long as they have intact young human brains; whereas any change in mentality (even: growing up!) massively, and negatively, impacts the learning process.

Beyond questions of nature-nurture in the acquisition process are more specific questions about acquisition functions for language at various levels in the linguistic hierarchy, including PHONOLOGY, SYNTAX, SEMANTICS, and word learning. At this moment in the progress of the field, interpretation of such results is limited to the extent that a number of viable theories of language design (that is, of the targets of acquisition) contend in the linguistic literature. Moreover, theoretical pressure to account for the emerging facts about language learning often motivates revision of the linguistic theory. One way to think about how language and its learning are jointly explored, then, is to contrast two different questions that investigators ask, depending on their orientations: “What is a child, such that it can learn language?” versus “What is language, such that every child can learn it?”

Particularly informative in adjudicating these issues are studies that contrast learning and language effects crosslinguistically (a classic collection of such articles appears in Slobin 1985). Another useful direction has been to consider acquisition across various levels of the linguistic hierarchy, to examine the extent of their interaction. For example, phonological and semantic knowledge underlie aspects of how children understand the syntax of a sentence; syntactic and morphological cues support the acquisition of word meaning; and so on. Recent technological developments are beginning to enable acquisition work in directions unforeseen and infeasible up until the last few years. One important example concerns child on-line language processing, an area that has remained largely closed to investigation (for an exception, see McKee 1996) until the advent of eyetracking equipment that can monitor the child’s linguistic representations as these are constructed, in milliseconds, dependent on the reference world (Trueswell, Sekerina, and Hill 1998). Another is neuropsychological investigation employing techniques such as POSITRON EMISSION TOMOGRAPHY and functional MAGNETIC RESONANCE IMAGING (fMRI) recording of neural activity during linguistic processing. Finally, the increasing use of computer models and simulations allows explicit and detailed theories to be tested on large bodies of computerized linguistic data.

In light of the growing armamentarium of investigative technique and increasingly sophisticated linguistic analysis, understanding of language acquisition can be expected to increase rapidly in the coming decade. All current theoretical positions, no matter how different in other ways, acknowledge that language acquisition is the result of an interaction between aspects of the human mind and aspects of the environment: children learn language but rocks and dogs do not; children in France learn French and children in Mawu learn Mawukakan. Substantive debates in the present literature largely center on the nature of the learning mechanism. One issue is whether language is learned through a specialized organ or module or is the product of more general learning capacities. A further issue, logically distinct from the question of specialization, is whether the mechanisms of language acquisition and representation exploit symbolic rules, associations, or some combination of the two.

In conclusion, we should stress that language acquisition is a diverse process. It is extremely likely, for instance, that the right explanation for how children learn to form wh-questions will involve different cognitive mechanisms from those required for learning proper names or learning to take turns in conversation. We suspect, in fact, that there is no single story to be told about how children acquire language. Rather, the formatives, the learning machinery, and the combinatorial schemata at each level of the final system can be expected to vary. At the same time, knowledge of one aspect of the system can be expected to piggyback on the next in a complex incremental learning scheme, some of whose dimensions and procedures cannot now even be guessed at.

See also INDUCTION; INNATENESS OF LANGUAGE; NATIVISM, HISTORY OF; POVERTY OF THE STIMULUS ARGUMENTS; THEMATIC ROLES

Lila Gleitman and Paul Bloom

References

Bellugi, U., S. Marks, A. Bihrle, and H. Sabo. (1988). Dissociation between language and social functions in Williams syndrome. In K. Mogford and D. Bishop, Eds., Language Development in Exceptional Circumstances. New York: Churchill-Livingstone Inc., pp. 177–189.
Bickerton, D. (1981). Roots of Language. Ann Arbor, MI: Karoma.
Bloom, P. (1994). Semantic competence as an explanation for some transitions in language development. In Y. Levy and I. Schlesinger, Eds., Other Children, Other Languages: Theoretical Issues in Language Development. Hillsdale, NJ: Erlbaum.
Bowerman, M. (1982). Starting to talk worse: Clues to language acquisition from children’s late speech errors. In S. Strauss, Ed., U-Shaped Behavioral Growth. New York: Academic Press.
Brown, R., and U. Bellugi. (1964). Three processes in the child’s acquisition of syntax. In E. H. Lenneberg, Ed., New Directions in the Study of Language. Cambridge, MA: MIT Press.
Brown, R., and C. Hanlon. (1970). Derivational complexity and order of acquisition in child speech. In J. R. Hayes, Ed., Cognition and the Development of Language. New York: Wiley.
Cartwright, T. A., and M. R. Brent. (1997). Syntactic categorization in early language acquisition: Formalizing the role of distributional analysis. Cognition 63(2): 121–170.
Chomsky, N. (1965). Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
Chomsky, N. (1986). Knowledge of Language: Its Nature, Origin, and Use. New York: Praeger.
Deprez, V., and A. Pierce. (1993). Negation and functional projections in early grammar. Linguistic Inquiry 24(1): 25–67.
Feldman, H., S. Goldin-Meadow, and L. R. Gleitman. (1978). Beyond Herodotus: The creation of language by linguistically deprived deaf children. In A. Lock, Ed., Action, Gesture, and Symbol. New York: Academic Press.
Fernald, A., and P. Kuhl. (1987). Acoustic determinants of infant preference for motherese speech. Infant Behavior and Development 10: 279–293.
Fowler, A. (1990). Language abilities in children with Down’s syndrome: Evidence for a specific syntactic delay. In D. Cicchetti and M. Beeghly, Eds., Children with Down’s Syndrome: A Developmental Perspective. Cambridge: Cambridge University Press, pp. 302–328.
Hirsh-Pasek, K., and R. Golinkoff. (1996). The Origins of Grammar: Evidence from Early Language Comprehension. Cambridge, MA: MIT Press.
Lieven, E. V. M. (1994). Crosslinguistic and crosscultural aspects of language addressed to children. In C. Gallaway and B. J. Richards, Eds., Input and Interaction in Language Acquisition. Cambridge: Cambridge University Press.
Macnamara, J. (1972). Cognitive basis of language learning in infants. Psychological Review 79: 1–13.
Marcus, G. F., S. Pinker, M. Ullman, M. Hollander, T. Rosen, and F. Xu. (1992). Overgeneralization in language acquisition. Monographs of the Society for Research in Child Development.
Marler, P. (1991). The instinct to learn. In S. Carey and R. Gelman, Eds., The Epigenesis of Mind: Essays on Biology and Cognition. Hillsdale, NJ: Erlbaum.
McKee, C. (1996). On-line methods. In D. McDaniel, C. McKee, and H. Smith-Cairns, Eds., Methods for Assessing Children’s Syntax. Cambridge, MA: MIT Press, pp. 190–208.
Newport, E. L. (1990). Maturational constraints on language learning. Cognitive Science 14: 11–28.
Newport, E. L., L. Gleitman, and H. Gleitman. (1977). Mother, I’d rather do it myself: Some effects and non-effects of maternal speech style. In C. E. Snow and C. A. Ferguson, Eds., Talking to Children: Language Input and Acquisition. Cambridge: Cambridge University Press.
Pinker, S. (1984). Language Learnability and Language Development. Cambridge, MA: Harvard University Press.
Rice, M., and K. Wexler. (1996). Toward tense as a clinical marker of Specific Language Impairment in English-speaking children. Journal of Speech and Hearing Research 39: 1239–1257.
Saffran, J. R., R. N. Aslin, and E. L. Newport. (1996). Statistical learning by 8-month old infants. Science 274: 1926–1928.
Slobin, D. I., Ed. (1985). The Crosslinguistic Study of Language Acquisition. Vol. 1, The Data. Hillsdale, NJ: Erlbaum.
Snedeker, J. (1996). “With” or without: children’s use of a functional preposition in sentence comprehension. Unpublished manuscript, University of Pennsylvania.
Trueswell, J., I. Sekerina, and N. Hill. (1998). On-line sentence processing in children: Evidence from experiments during listening. Paper presented at the CUNY Sentence Processing Conference, Rutgers, NJ, March 19, 1998.
van der Lely, H. K. J. (1997). Language and cognitive development in a Grammatical SLI boy: Modularity and innateness. Journal of Neurolinguistics 10: 75–107.

Further Readings

Aksu-Koc, A. A., and D. I. Slobin. (1985). The acquisition of Turkish. In D. I. Slobin, Ed., The Crosslinguistic Study of Language Acquisition. Vol. 1, The Data. Hillsdale, NJ: Erlbaum.
Bickerton, D. (1984). The language bioprogram hypothesis. Behavioral and Brain Sciences 7: 173–187.
Bloom, L. (1970). Language Development: Form and Function in Emerging Grammars. Cambridge, MA: MIT Press.
Bloom, P. (1990). Syntactic distinctions in child language. Journal of Child Language 17: 343–355.
Bloom, P. (1994). Recent controversies in the study of language acquisition. In M. A. Gernsbacher, Ed., Handbook of Psycholinguistics. San Diego, CA: Academic Press.
Braine, M. D. S. (1976). Children’s first word combinations. Monographs of the Society for Research in Child Development 41.
Brown, R. (1973). A First Language: The Early Stages. Cambridge, MA: Harvard University Press.
Chomsky, N. (1965). Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
Darwin, C. H. (1877). A biographical sketch of a young child. Kosmos 1: 367–376.
Gleitman, L. R., and H. Gleitman. (1992). A picture is worth a thousand words but that’s the problem: The role of syntax in vocabulary acquisition. Current Directions in Psychological Science 1(1): 31–35.
Gleitman, L., H. Gleitman, B. Landau, and E. Wanner. (1988). Where learning begins: Initial representations for language learning. In F. Newmeyer, Ed., Language: Psychological and Biological Aspects. Vol. 3, Linguistics. The Cambridge Survey. New York: Cambridge University Press.
Gleitman, L., and E. L. Newport. (1995). The invention of language by children: Environmental and biological influences on the acquisition of language. In L. Gleitman and M. Liberman, Eds., An Invitation to Cognitive Science, 2nd ed. Vol. 1, Language. Cambridge, MA: MIT Press, pp. 1–24.
Johnson, J. S., and E. L. Newport. (1989). Critical period effects in second language learning: The influence of maturational state on the acquisition of English as a second language. Cognitive Psychology 21: 60–99.
Jusczyk, P. W. (1997). The Discovery of Spoken Language. Cambridge, MA: MIT Press/Bradford Books.
Kelly, M. H., and S. Martin. (1994). Domain general abilities applied to domain specific tasks: Sensitivity to probabilities in perception, cognition, and language. Lingua 92: 105–140.
Lenneberg, E. (1967). Biological Foundations of Language. New York: Wiley.
Marcus, G. F. (1993). Negative evidence in language acquisition. Cognition 46: 53–85.
Mintz, T. H., E. L. Newport, and T. G. Bever. (1995). Distributional regularities of grammatical categories in speech to infants. In J. Beckman, Ed., Proceedings of the North East Linguistic Society 25, vol. 2. Amherst, MA: GLSA.
Mehler, J., P. W. Jusczyk, N. Lambertz, J. Halsted, J. Bertoncini, and C. Amiel-Tison. (1988). A precursor of language acquisition in young infants. Cognition 29: 143–178.
Newport, E. L., and R. P. Meier. (1985). The acquisition of American Sign Language. In D. I. Slobin, Ed., The Cross-Linguistic Study of Language Acquisition. Hillsdale, NJ: Erlbaum.
Petitto, L. A., and P. F. Marentette. (1991). Babbling in the manual mode: Evidence for the ontogeny of language. Science 251: 1493–1496.
Piaget, J. (1926). The Language and Thought of the Child. New York: Routledge and Kegan Paul.
Pinker, S. (1994). The Language Instinct. New York: Morrow.
Pinker, S., and A. Prince. (1988). On language and connectionism: Analysis of a Parallel Distributed Processing model of language acquisition. Cognition 28: 73–193.
Plunkett, K., and V. Marchman. (1991). U-shaped learning and frequency effects in a multi-layered perceptron: Implications for child language acquisition. Cognition 38: 43–102.
Shatz, M., and R. Gelman. (1973). The development of communication skills: Modifications in the speech of young children as a function of listener. Monographs of the Society for Research in Child Development 38 (5, Serial No. 152).
Singleton, J. L., and E. L. Newport. (1994). When learners surpass their models: The acquisition of American Sign Language from impoverished input. Unpublished manuscript, University of Illinois.
Slobin, D. I., and T. G. Bever. (1982). Children’s use of canonical sentence schemas: A crosslinguistic study of word order and inflections. Cognition 12: 229–265.
Werker, J. F., and R. C. Tees. (1984). Cross-linguistic speech perception: Evidence for perceptual reorganization in the first year of life. Infant Behavior and Development 7: 49–63.
Wexler, K. (1982). A principle theory for language acquisition. In E. Wanner and L. Gleitman, Eds., Language Acquisition: The State of the Art. Cambridge: Cambridge University Press, pp. 288–318.

Language and Cognition

See INTRODUCTION: LINGUISTICS AND LANGUAGE; LANGUAGE AND THOUGHT; LANGUAGE OF THOUGHT

Language and Communication

Language and communication are often defined as the human ability to refer abstractly and with intent to influence the thinking and actions of other individuals. Language is thought of as the uniquely human part of a broader system of communication that shares features with other ANIMAL COMMUNICATION systems. In the twentieth century, language research has focused largely on those aspects of vocal communication (or their homologs in SIGN LANGUAGES) that are organized as categorial oppositions (de Saussure 1916/1959); for example, categories of sound, grammar, and meaning. The domain of language research has been largely speech. In 1960, the linguist Charles Hockett advocated restricting the term “[human] language” to just those dimensions of communication that are vocal, syntactic, arbitrary in relation to their referents, abstractly referential (that is, meaning is determinable independently of the immediate context of utterance), and learned. The host of other patterned dimensions of communicative acts (social, kinesic and affective-volitional: vocal and nonvocal) has often been labeled “paralanguage” and regarded as outside the proper domain of linguistic inquiry.

The dominant paradigm in linguistics since the 1950s (Piatelli-Palmarini 1980) has as its foundation some notion of language as a disembodied symbol manipulation device. This Cartesian rationalist approach (Dreyfus 1992) has informed much research in experimental PSYCHOLINGUISTICS. The approach meshes with a general theory of mind in cognitive science whose influence has spread with the increasing use of computer technology. Based on the metaphor of the mind as a computer, higher human mental functions (language among them) are modeled on analogy with the operating principles of formal information processing (IP) systems, including, “decomposition of complexity into simpler units; routines and subroutines that automatically and recursively operate on atomic units” (Gigerenzer and Goldstein 1996). In modular IP models, subroutines are “unintelligent” and encapsulated; that is, each functions in relative isolation from other modules in the system. Gigerenzer and Goldstein cite Levelt’s (1989) influential model of LANGUAGE PRODUCTION as representative of this general approach. As well, a great deal of basic research on SPEECH PERCEPTION, SENTENCE PROCESSING, and LANGUAGE ACQUISITION has been concerned with the nature of psycholinguistic functions when variables of meaning and context enter the experimental format only in a controlled manner, via symbol and syntax. One goal has been to understand how arbitrary symbol and syntax, thought of as the “purely linguistic” dimensions of human communicative acts, function as pointers to extralinguistic meaning and context.

Olson (1994) implicates the technology of writing as another source for this way of thinking about language, saying, “we introspect our language in terms of the categories laid down by our script.” In this view, spoken linguistic units are containers to carry abstract reference and these units have an existence independent from the context that gives rise to them. (That linguistic units can be made to function in this way in vitro is an issue crucially distinct from what their nature and functions may be in the organic whole of face-to-face communication.)

Research cued by formal linguistics links to the combined senses of what language is (speech and syntax) and what its “main function” is (abstract reference). Lieberman’s (1991) research on the evolution of the human vocal tract, for instance, pinpoints when in evolutionary time humans became capable of producing the modern range of spoken phonetic categorial distinctions, identifying this with the onset of human language as a whole. Research on PRIMATE LANGUAGE often imports frameworks and units of analysis from linguistic analyses of human language. For example, reports of chimpanzee and gorilla attempts to use artificial and signed languages give totals of the words in the animals’ vocabularies and comment on the extent to which the animals are able to negotiate task demands absent information from context on the basis of symbol manipulation alone. Cheney and Seyfarth’s (1990) finding that vervet monkeys have distinct calls to alert conspecifics to the presence of different types of predators gives rise to discussions about the role of something like “words” in these animals’ natural communication system.

The dominant paradigm puts a chasm between animal communication and human language, one that has been notably difficult for theories of the EVOLUTION OF LANGUAGE to bridge. Saltationist evolutionary accounts assume a significant structure-producing genetic mutation that makes human-language syntax possible. Some researchers hypothesize that humans are uniquely capable of developing a THEORY OF MIND necessary for intentional communication; some that a particular mimetic ability is the key. Other accounts emphasize the continuity between human cognitive, social, and communicative abilities and those of our primate relatives (see the collected papers in Hurford, Studdert-Kennedy, and Knight 1998). The crucial evolutionary join, however, remains underspecified.

Alternative research strategies in linguistics, sociolinguistics, anthropology, and psychology have assembled evidence of much interpenetration and interdependence of the many dimensions of human communicative acts that have been theorized to be functionally independent. Sociolinguistics (for example, Labov 1980) has shown that individuals’ attempts to position themselves relative to different groups or social strata can have effects even to the level of language phonology. The psycholinguist Locke (1994) has noted that, “linguists have neglected the role of language as a medium for social interaction and emotional expression.” He distinguishes “talk” (loosely: social speech) in human communication from a more restricted sense of speech and claims that the language acquisition process targets talk first, assembling an essential social-interactional framework within which speech is then acquired. The anthropologist Kendon (1994) assigns equal status to gesture and speech as communicative resources. Cognitive and functional linguistic theory (Lakoff 1987) attempts to ground language in embodied experience and rejects the analytic separation of grammar from meaning. In later years, Hockett himself (1987), in writing about “how language means,” acknowledged the difficulty of drawing a sharp boundary in situated language use between language and “paralanguage”; between the arbitrary and the iconic. He exhorted linguists to learn about language by studying the communicative package as a whole, “in vivo,” and to avoid focusing too exclusively on, thereby making too much of, the abstract referring and symbol manipulation properties of language.

The linguist Dwight Bolinger (Bolinger and Sears 1975) pursued linguistic research with a very different focus, stating, “language is speech embedded in gesture.” This statement puts paralanguage (“gesture,” both vocal and kinesic) before abstract, syntactic speech in studies of human language. The “gesture” with which Bolinger was primarily concerned is prosody, the patterns of stress and intonation in speech. He also reported, however, observations of the facial and bodily gestures that occur together with speech and noted their relation to prosody. He was interested, for example, in how a prosodic contour determines the meaning of an utterance as much as do the combined meanings of the words in the sentence; also in how gesture expresses affect or can reveal a speaker’s perspective on her own utterance at the moment of speaking.

McNeill and Duncan (1999) largely reject the language/paralanguage distinction. These authors theorize that gesture, broadly construed to include prosodic and rhythmic phenomena (cf. Tuite 1993), iconic gestures, nonrepresentational movements of the hands and body, semiotic valuation of gesture space, as well as analog (as opposed to discrete) patterning on other communicative dimensions, is intrinsic to language. On this view, language is an organized form of on-line, interactive, embodied, and contextualized human cognition. By hypothesis, the initial organizing impulse of a communicative production is a unit of thinking patterned simultaneously according to two distinct representational systems: one categorial, compositional, and analytic; the other imagistic, synthetic, and holistic. Patterning in either system may emerge in speech or gesture production, though it has often been convenient to think of speech as the embodiment of the categorial, and gesture as the embodiment of the noncategorial. According to McNeill and Duncan, reductionist accounts that isolate categorial speech from image, prosody, and gesture in separate processing modules (Levelt 1989; Krauss, Chen, and Gottesman 1999) will fail to account for the way language production and comprehension evolve on-line in real-time communication. Speech and gesture together provide an enhanced window on cognition during communication. Much is externalized, a fact that minimizes the burden on interlocutors’ theories of mind.

Gigerenzer and Goldstein (1996) identify the information encapsulation feature of modularist IP models such as Levelt’s as the Achilles’ heel of these models. Pushed by cross-disciplinary study of language in vivo, there is growing consensus that models must deal with the massive interpenetration of what have traditionally been analyzed as functionally distinct levels of linguistic analysis. Such consensus points to a paradigm shift underway, one that encompasses Saussure’s paradigm-shifting formulation but moves beyond it. It locates human language in the human body and postulates as its theoretic atom the conversational dyad, rather than a monad with a message to transmit or receive (Goodwin 1986; Schegloff 1984). The shift was foreshadowed by Lev Semenovich VYGOTSKY (1934/1986), who analyzed communicative events as developing simultaneously on an “inter-” as well as an “intra-psychic plane.” At the vanguard are socio- and psycholinguistic gesture research, research on sign languages liberated from earlier constraints to minimize iconic dimensions of patterning (Armstrong, Stokoe, and Wilcox 1995), cognitive and functional linguistic research, as well as philosophical (Heidegger: see Dreyfus 1991) and anthropological (Levinson, forthcoming) research that highlights the situated, culturally embedded character of language use.

See also MODULARITY AND LANGUAGE; MODULARITY OF MIND; PROSODY AND INTONATION; PROSODY AND INTONATION, PROCESSING ISSUES; SITUATEDNESS/EMBEDDEDNESS

Susan Duncan

References

Armstrong, D. F., W. C. Stokoe, and S. E. Wilcox. (1995). Gesture and the Nature of Language. Cambridge: Cambridge University Press.
Bolinger, D. L., and D. A. Sears. (1975). Aspects of Language. 2nd ed. New York: Harcourt, Brace, Jovanovich.
Cheney, D. L., and R. M. Seyfarth. (1990). How Monkeys See the World. Chicago: University of Chicago Press.
Dreyfus, H. L. (1991). Being-in-the-World: A Commentary on Heidegger’s Being and Time, Division I. Cambridge, MA: The MIT Press.
Dreyfus, H. L. (1992). What Computers Still Can’t Do: A Critique of Artificial Reason. Cambridge, MA: MIT Press.
Gigerenzer, G., and D. Goldstein. (1996). Mind as computer: The birth of a metaphor. Creativity Research Journal 9: 131–144.
Goodwin, C. (1986). Gesture as a resource for the organization of mutual orientation. Semiotica 62(1–2): 29–49.
Hockett, C. F. (1987). Refurbishing our Foundations. Amsterdam: John Benjamins.
Hockett, C. F. (1960). Logical considerations in the study of animal communication. In W. E. Lanyon and W. N. Tavolga, Eds., Animal Sounds and Communication. Washington: American Institute of Biological Science, pp. 392–430.
Hurford, J. R., M. Studdert-Kennedy, and C. Knight, Eds. (1998). Approaches to the Evolution of Language: Social and Cognitive Bases. Edinburgh: Edinburgh University Press.
Kendon, A. (1994). Do gestures communicate? Research on Language and Social Interaction 27(3): 3–28.
Krauss, R. M., Y. Chen, and R. F. Gottesman. (1999). Lexical gestures and lexical access: A process model. In D. McNeill, Ed., Language and Gesture: Window into Thought and Action. Cambridge: Cambridge University Press.
Labov, W. (1980). The social origins of sound change. In W. Labov, Ed., Locating Language in Time and Space. New York: Academic Press, pp. 251–266.
Lakoff, G. (1987). Women, Fire, and Dangerous Things: What Categories Reveal about the Mind. Chicago: University of Chicago Press.
Levelt, W. J. M. (1989). Speaking: From Intention to Articulation. Cambridge, MA: MIT Press.
Levinson, S. C. (Forthcoming). The body in space: Cultural differences in the use of body-schema for spatial thinking and gesture. In G. Lewis and F. Sigaud, Eds., Culture and Uses of the Body. Cambridge: Oxford University Press.
Lieberman, P. (1991). Uniquely Human: The Evolution of Speech, Thought, and Selfless Behavior. Cambridge, MA: Harvard University Press.
Locke, J. (1994). Phases in a child’s development of language. American Scientist 82: 436–445.
McNeill, D., and S. Duncan. (1999). Growth points in thinking for speaking. In D. McNeill, Ed., Language and Gesture: Window into Thought and Action. Cambridge: Cambridge University Press.
Olson, D. R. (1994). The World on Paper: Conceptual and Cognitive Implications of Writing and Reading. Cambridge: Cambridge University Press.
Piatelli-Palmarini, M., Ed. (1980). Language and Learning: The Debate Between Jean Piaget and Noam Chomsky. Cambridge, MA: Harvard University Press.
Saussure, F. de. [1916] (1959). Course in General Linguistics. Translated by W. Baskin. Reprint. New York: Philosophical Library.
Schegloff, E. A. (1984). On some gestures’ relation to talk. In J. M. Atkinson and J. Heritage, Eds., Structures of Social Action. Cambridge: Cambridge University Press, pp. 266–295.
Tuite, K. (1993). The production of gesture. Semiotica 93(1–2): 83–105.
Vygotsky, L. S. [1934] (1986). Thought and Language, A. Kozulin, ed. and trans. Cambridge, MA: MIT Press.

Further Readings

Armstrong, D. (1983). Iconicity, arbitrariness, and duality of patterning in signed and spoken language: Perspectives on language evolution. Sign Language Studies 38(Spring): 51–69.
Bavelas, J. B. (1994). Gestures as part of speech: Methodological implications. Research on Language and Social Interaction 27(3): 201–221.
Bernieri, F. J., and R. Rosenthal. (1991). Interpersonal coordination: Behavior matching and interactional synchrony. In R. Feldman and B. Rime, Eds., Fundamentals of Nonverbal Behavior. Cambridge: Cambridge University Press, pp. 401–432.
Bickerton, D. (1990). Language and Species. Chicago: University of Chicago Press.
Bolinger, D. L. (1946). Thoughts on “yep” and “nope.” American Speech 21: 90–95.
Crystal, D. (1976). Paralinguistic behavior as continuity between animal and human communication. In W. C. McCormack and S. A. Wurm, Eds., Language and Man: Anthropological Issues. The Hague: Mouton, pp. 13–27.
Givon, T. (1985). Iconicity, isomorphism, and non-arbitrary coding in syntax. In J. Haiman, Ed., Iconicity in Syntax. Amsterdam: John Benjamins, pp. 187–219.
Hockett, C. F. (1978). In search of Jove’s brow. American Speech 53: 243–313.
Kendon, A. (1980). Gesticulation and speech: Two aspects of the process of utterance. In M. R. Key, Ed., The Relationship Between Verbal and Nonverbal Communication. The Hague: Mouton, pp. 207–228.
Kendon, A. (1972). Some relationships between body motion and speech: An analysis of an example. In A. Siegman and B. Pope, Eds., Studies in Dyadic Communication. New York: Pergamon Press, pp. 177–210.
McNeill, D. (1985). So you think gestures are nonverbal? Psychological Review 92(3): 350–371.
McNeill, D. (1992). Hand and Mind: What Gestures Reveal About Thought. Chicago: Chicago University Press.
Streeck, J. (1994). Gesture as communication II: the audience as co-author. Research on Language and Social Interaction 27(3): 239–267.

Language and Culture

Current linguistic theory proceeds by positing universal hypotheses across all languages, and workers in other branches of cognitive science may therefore be led to think that large numbers of universals of language have been established. In actual fact these have proved very hard to formulate, and nearly all successful generalizations are either very abstract (and correspondingly difficult to test) or of the form “if a language is of a certain type T, then it tends to have property P” (see Greenberg 1978), usually with exceptions rapidly discovered. Most databases for extrapolation cover less than 10% of the world’s languages; the great majority of languages have never been described, let alone carefully analyzed.

Through language, and to a lesser extent other semiotic systems, individuals have access to the large accumulation of cultural ideas, practices, and technology that instantiate a distinct cultural tradition.
The question then arises as to Language and Culture what extent these ideas and practices are actually embodied in the language in lexical and grammatical distinctions. Languages differ in fundamental ways: their phoneme Humboldt, and later SAPIR and Whorf, are associated with inventories vary from 11 to 141 (Maddieson 1984), they the theory that a language encapsulates a cultural perspec- may have elaborate or no morphology, may or may not use tive and actually creates conceptual categories (see CON- word order or constituent structure or case to signify syntac- CEPTS; LINGUISTIC RELATIVITY HYPOTHESIS; Gumperz and tic relations, may or may not have word roots of fixed gram- Levinson 1996). In some respects this seems clearly true matical word class, may make use of quite different (consider notions like “tort” or “manslaughter” that reflect semantic parameters, and so on (see TYPOLOGY). There are and constitute part of the English legal tradition, not an an estimated 7000 or more distinct languages in the world, aspect of culture-independent reality); in other respects it each a cultural tradition of (generally) thousands of years in seems to be false (“black” appears to be a universal concept, the making, and there are at least 20 (how many is contro- reflecting aspects of PSYCHOPHYSICS). Yet many cognitive versial) language families across which relationships cannot scientists assume that basic semantic parameters are univer- be demonstrated. Each is adapted to a unique cultural and sal, culture-specific notions like “tort” being constructed social environment, with striking differences in usage pat- from such universal semantic primitives (an influential terns (Bauman and Sherzer 1974). This cultural adaptation exception is Fodor, who claims that all such notions are uni- constitutes the cultural capital of language, and language versal unanalyzed wholes in the LANGUAGE OF THOUGHT). 
differences are perhaps the most perduring of all aspects of Current work on semantics, however, makes it clear than culture. On the other hand, language is a biological capacity, even apparently fundamental notions may vary crosslinguis- instantiated in the anatomy of our vocal tract and the corre- tically, and children learning language do not invariably sponding acuity of our hearing and in dedicated areas of the appear to make the same initial assumptions about meaning brain (see LANGUAGE, NEURAL BASIS OF). In fact, language (Slobin 1985). Take for example spatial notions: readers are provides the best evidence for the thesis of coevolution, likely to think of the things on the desk before them in terms whereby cultural replication and genetic replication became of things “in front” of themselves, “to the left,” or “to the intertwined, each providing the context for the evolution of right.” But some languages do not lexicalize these notions at the other (see EVOLUTION OF LANGUAGE; also Durham all. Instead one must refer to things as, for example, “to the 1991). CULTURAL VARIATION also requires that the biologi- north,” “to the east” or “to the west,” and so forth, as appro- cal capacity for language be malleable (see NEURAL PLAS- priate. Consequently, speakers of these languages must keep TICITY)—for example, able to learn and parse speech of their bearings, and they can be shown to conceive of spatial quite different sound and structural type (see PSYCHOLIN- arrangements differently in nonverbal memory and infer- GUISTICS)—although this malleability is progressively lost ence (Levinson 1996). during maturation of the individual. There are many aspects of the cultural patterning of lan- Most models of human cognition abstract away from guage that may be fundamental to its role in cognition. One variation, whether cultural or individual. 
But in the case of is special elaborations of linguistic ability—for instance, language, the capacity to handle the cultural variation is a highly skilled performance as in simultaneous translation or central property of cognitive ability. Consider for example rapid sports commentary that can be delivered at twice the that language ability is modality independent; according to speed of the fastest conversation. Perhaps the majority of cultural tradition it can be not only spoken or signed (see the world’s population are multingual, and multilingualism SIGN LANGUAGES) but also represented visually by reference is a capacity largely beyond current psycholinguistic under- to sounds, meanings, or both (according to the writing sys- standing. Another is the elaboration of technologies of lan- tem) or signed with the hands as in auxilliary hand-sign sys- guage, of which writing is the most fundamental (Goody tems. In this modality independence it is very unlike any 1977) and NATURAL LANGUAGE PROCESSING the most other sensory input or motor output system. advanced. Natural languages are learned in and through 442 Language and Gender social interaction and constitute probably the most complex References cognitive task that humans routinely undertake (and quite Bates, E., and B. Wulfek. (1989). Crosslinguistic studies of apha- plausibly the major pressure for brain evolution in our spe- sia. In B. Macwhinney and E. Bates, Eds., The Crosslinguistic cies; see Byrne and Whiten 1988; COOPERATION AND COM- Study of Sentence Processing. New York: Cambridge Univer- PETITION). Many aspects of natural language can only be sity Press, pp. 328–374. understood in relation to this interactional context, includ- Bauman, R., and J. Sherzer, Eds. (1974). Explorations in the Eth- ing INDEXICALS AND DEMONSTRATIVES, speech acts, and nography of Speaking. Cambridge: Cambridge University conveyed politeness (Brown and Levinson 1987). Press. 
Cognitive scientists are not interested in all aspects of Brown, P., and S. Levinson. (1987) Politeness. Cambridge, UK: language (most aspects of their history for example, but see Cambridge University Press. Byrne, R. W., and A. Whiten. (1988). Machiavellian Intelligence. LANGUAGE VARIATION AND CHANGE). Four aspects though Oxford: Clarendon Press. are of particular importance. One is how language is learned Durham, W. (1991). Coevolution: Genes, Cultures, and Human (see LEARNING). A second is how language is processed Diversity. Palo Alto, CA: Stanford. (viewing the mind or the brain as an information-processing Goody, J. (1977). The Domestication of the Savage Mind. Cam- device), both in comprehension (Tyler 1992) and production bridge: Cambridge University Press. (Levelt 1989). A third is how language interfaces with other Greenberg, J. (1978). Universals of Language, vols. 1–4. Stanford, cognitive abilities and how semantic representations are CA: Stanford University Press. related to other conceptual representations (Nuyts and Ped- Gumperz, J., and S. Levinson. (1996). Rethinking Linguistic Rela- erson 1997). A fourth concerns how linguistic ability is tivity. Cambridge: Cambridge University Press. instantiated in neurophysiology. Levelt, W. (1989). Speech Production. Cambridge, MA: MIT Press. In all four aspects, the complex interplay between culture Levinson, S. C. (1996). Frames of reference and Molyneux’s ques- and biology in language is crucial to our understanding of tion: Crosslinguistic evidence. In I. P. Bloom, M. Peterson, L. the phenomena. In LANGUAGE ACQUISITION, the cultural Nadel, and M. Garnett, Eds., Language and Space. Cambridge, variability makes learning a fundamental puzzle; even if MA: MIT Press. there are significant universals, the child must still pair Maddieson, I. (1984). Patterns of Sounds. Cambridge: Cambridge sounds and meanings, where the analysis of neither is given University Press. by first principles. 
For language processing, again language Nuyts, J., and E. Pederson, Eds. (1997). Linguistic and Conceptual variation is highly problematic: it is hard to see how the Representation. Cambridge: Cambridge University Press. same mechanisms can be involved in radically different lan- Slobin, D. (1985). The Crosslinguistic Study of Language Acquisi- guages. For example, languages with verbs in medial or final tion. Vols. 1 and 2. Hillsdale, NJ: Erlbaum. Tyler, L. (1992). Spoken Language Comprehension. Cambridge, position in the sentence allow one to start speech production MA: MIT Press. before the sentence is fully worked out; but languages with verbs in initial position, fully marked for agreement with subject and object, would seem to require a different produc- Language and Gender tion strategy. Similarly, parsing strategies for comprehension would seem necessarily divergent in languages with fixed word order or no fixed word order, with rich morphology or Exploring the interaction of language and gender raises none. Thirdly, fundamental variation in semantic parameters many fundamental questions of cognitive science. What are makes the interface between language and general cognition the connections of LANGUAGE AND THOUGHT, of LANGUAGE look much more problematic than is commonly assumed. AND CULTURE, of language and action? What role do LAN- Not all concepts are directly expressible in a language. Fur- GUAGE ACQUISITION and language processing play in forg- ther, the semantic distinctions obligatorily required by the ing these connections? Even research on language and grammar are not necessarily of the kind that would univer- gender that does not address such questions explicitly can sally be noted and memorized for future possible linguistic usually be seen as implicitly relevant to them. 
Relatively expression (e.g., was the referent visible at event time, was few language and gender researchers have cast their work in the participant to be described of greater or lesser rank than primarily cognitive terms, more often emphasizing social or the speaker and the addressee, was the referent a singleton or economic or political phenomena, but of course such phe- not, etc.). This points to the likelihood that to speak a partic- nomena can themselves be fruitfully approached from the ular language, experience must be coded in the appropriate perspective of cognitive science—for example, in work on categories, and it also raises questions about the universality SOCIAL COGNITION. of the language of thought. Finally, with regards to brain and Two basic families of questions have dominated lan- language, there is evidence from selective brain damage (see guage and gender research. First, does gender influence lan- APHASIA) that linguistic abilities are localized partly in guage and, if so, how? That is, how might gender identities accordance with the structure of a particular language (Bates and relations of interlocutors be connected to the form and and Wulfreck 1989). content of what they say? There is considerable sociolin- guistic research on how women and men use language in See also HUMAN UNIVERSALS; INNATENESS OF LAN- different situations (see, e.g., Coates 1992; Coates and Cam- GUAGE; LANGUAGE AND THOUGHT; LANGUAGE PRODUC- eron 1988; Eckert 1990); there is some developmental TION; LINGUISTIC UNIVERSALS AND UNIVERSAL GRAMMAR research that looks at gender issues and a very little neurol- —Stephen C. Levinson inguistic work on sex differences in brain activity during Language and Gender 443 language processing (along with sociolinguistic work, Phil- many men lack (Tannen 1994 makes this kind of point, as ips, Steele, and Tanz 1987 include contributions addressing does Holmes 1995). both language development and neurolinguistics). 
There is As the discussion of these questions suggests, the dis- also a very little work on phonetic issues (e.g., Hanson tinction between these two families of questions is not as 1996; Henton 1992). With the exception of some of the best clearcut as it might at first seem. To talk in this way is to sociolinguistic work (e.g., Brown 1990; Goodwin 1991; and suppose that gender and language are somehow quite sepa- a number of the articles in Coates 1998), most research on rate phenomena and that our goal is to articulate their con- how gender affects language conflates gender with sex, nections. While useful for certain purposes, this supposition focusing on overall sex differences while ignoring the inter- can mislead us. In particular, it does not come to grips with twining of gender with other aspects of social identity and the rooting of both language use and gender in situated social relations (see Eckert and McConnell-Ginet 1992, social practices, which involve the jointly orchestrated 1996; Ochs 1992) and also ignoring other aspects of intra- actions of groups of cognitive agents. Gal (1991) and Eckert sex variation as well as complexities in sexual classification. and McConnell-Ginet (1992) have argued that the real inter- Some work has addressed ways in which gender arrange- action of gender and language, their coevolution, can be ments may affect the development of linguistic conventions best illumined by examining both language and gender as (e.g., McConnell-Ginet 1989 discusses how changes in lexi- they are manifest in social practice. cal meaning might be driven by certain features of gender Both language and gender are of course also partly bio- arrangements, including, for example, practices that give logical phenomena. In the view of most linguists and many greater voice in public contexts to men than to women). 
And other cognitive scientists, the possibilities for human lan- much sociolinguistic work has addressed the question of guage systems are very much constrained by the biologically how gender affects changes in patterns of pronunciation and given language capacity. There are, however, not only other aspects of language use that ultimately can change parameters along which language systems can vary; there are language structure (e.g., Milroy et al. 1995). also many different ways that communities can (and do) use Second, does language influence gender and, if so, how? language in their activities. And gender is linked to sexual That is, how might linguistic resources and conventions difference. Biological sex itself, however, is far less dichoto- affect the shape of a culture’s gender arrangements? There mous than many assume (English speakers show little toler- have been a number of psycholinguistic studies on such top- ance for gradations between “female” and “male,” insisting ics as masculine generics (see, e.g., Sniezek and Jazwinski on classifying intersexed people in one or the other category; 1986) and studies in PRAGMATICS and related fields on top- see Bing and Bergvall 1996), and there is considerable intra- sex variation on many dimensions (including cognitive func- ics like METAPHOR and linguistic discrimination (see, e.g., tioning as well as physical and behavioral attributes), even papers in Vetterling-Braggin 1981). There is also consider- among those whose classification seems biologically quite able work on these topics from anthropologists and cultural unproblematic. There is an extraordinary amount of socio- theorists, who look at a culture’s favored figures of speech cultural work done to elaborate femaleness and maleness, to (e.g., using the terminology of delectable edibles to talk construct gender identities (not limited to two in all cultures) about women) and other linguistic clues to cultural assump- and also gender relations. 
Most of this work is at the same tions in order to map the gender terrain. There is some evi- time also shaping other aspects of social identities and rela- dence that gender (and other) stereotypes help drive tions (e.g., ethnicity, race, class). Because language use is an inferencing even in those who don’t accept them, suggesting integral component of most social practices, language is nec- that linguistically triggered stereotyping may be consequen- essarily a major instrument of the sociocultural construction tial both in acculturating children and in helping to maintain of gender. Thus an emerging research goal is exploration of the gender status quo. What should be seen as the direction the linguistic practices through which sex and gender are of influence is certainly not clear, however, and many stud- elaborated in a wide range of different communities (see, ies of gender biases evident in linguistic resources are posit- e.g., Hall and Bucholtz 1996). Although rather little of this ing the influence of gender arrangements on language as research has a cognitive orientation and some interesting much as the other way around (McConnell-Ginet 1989 and work is not really empirical or scientific, it raises many others suggest that influences typically go in both direc- important questions for cognitive scientists to address. tions). Some of the same sociolinguistic studies that explore how women and men speak also look at the social and polit- See also CULTURAL EVOLUTION; LANGUAGE AND COM- ical effects of different speech styles and interactional MUNICATION; LANGUAGE AND CULTURE; METAPHOR AND dynamics. Lakoff (1975) made popular the idea that CULTURE women’s language use is part of what contributes to men’s —Sally McConnell-Ginet dominance over them. 
Although Lakoff herself and others would frame matters somewhat differently now, looking References also at men’s language use and at attitudes toward different speech styles, there continues to be considerable research Bing, J. M., and V. L. Bergvall. (1996). The question of questions: suggesting that ways of speaking often play a role in main- Beyond binary thinking. In V. L. Bergvall, J. M. Bing, and A. F. taining male dominance (see, e.g., Henley and Kramarae Freed, Eds., Rethinking Language and Gender Research: The- 1991). Much of this research, however, also emphasizes the ory and Practice. London: Longman. advantages that can accrue to women who have developed Brown, P. (1990). Gender, politeness, and confrontation in Tene- certain kinds of interactionally useful speech skills that japa. Discourse Processes 13: 123–141. 444 Language and Modularity Language and Modularity Coates, J. (1992). Women, Men, and Language. 2nd ed. London: Longman. Coates, J. (1993). Language and Gender: A Reader. Oxford: See MODULARITY AND LANGUAGE Blackwell. Coates, J., and D. Cameron, Eds. (1988). Women in their Speech Communities: New Perspectives on Language and Sex. Lon- Language and Thought don: Longman. Eckert, P. (1990). The whole woman: Sex and gender differences in variation. Language Variation and Change 1: 245–267. Perhaps because we typically think in words, language and Eckert, P., and S. McConnell-Ginet. (1992). Think practically and thought seem completely intertwined. Indeed, scholars in look locally: Language and gender as community-based prac- tice. Annual Review of Anthropology 21: 461–490. various fields—psychology, linguistics, anthropology—as Eckert, P., and S. McConnell-Ginet. (1996). Constructing meaning, well as laypeople have entertained these questions: Is constructing selves: Snapshots of language, gender, and class thought possible without language? Does the structure of from Belten High. In K. Hall and M. 
Bucholtz, Eds., Gender our language shape our thinking? Does our perception/cog- Articulated: Language and the Socially Constructed Self. Lon- nition shape the structure of language? Are our abilities to don: Routledge, pp. 459–507. learn and use language part of our general intelligence? Gal, S. (1991). Between speech and silence: The problematics of Is thought possible without language? Research on research on language and gender. In M. DiLeonardo, Ed., Gen- babies and children who have not yet acquired any language der at the Crossroads of Knowledge: Feminist Anthropology in suggests that little babies muse over rather important things. the Postmodern Era. Berkeley and Los Angeles: University of For instance, 3- to 4-month-old babies seem to think that California Press, pp. 175–203. Goodwin, M. H. (1991). He-Said-She-Said. Bloomington: Indiana each object occupies its own space and so one solid object University Press. cannot go through another solid object. Five-month-old Hall, K., and M. Bucholtz, Eds. (1996). Gender Articulated: Lan- babies can do simple arithmetic. If they see a hand carrying guage and the Socially Constructed Self. London: Routledge. two objects to the back of a screen and reappearing empty- Hanson, H. (1996). Synthesis of female speech using the Klatt handed, they seem to expect two more objects behind the synthesizer. Speech Communication Group Working Papers, screen than before this addition event. When developmental vol. 10. MIT Research Laboratory of Electronics, pp. 84– psychologists such as Renee Baillargeon and Karen Wynn 103. concocted “magic shows” that violated fundamental princi- Henley, N. M., and C. Kramarae. (1991). Gender, power, and mis- ples of physics or numbers by clever subterfuge, preverbal communication. In N. Coupland, H. Giles, and J. M. Wiemann, babies showed surprise by staring at the scenes longer than Eds., Miscommunication and Problematic Talk. Newbury Park, CA: Sage Publications. 
they would at physically plausible scenes. Other evidence Henton, C. (1992). The abnormality of male speech. In G. Wolf, for thought without language came from profoundly deaf Ed., New Departures in Linguistics. New York: Garland, pp. children who had not been exposed to any sign language. 27–59. Susan Goldin-Meadow, a psychologist, found several such Holmes, J. (1995). Women, Men, and Politeness. London: Long- children growing up in loving families. They invented their man. own signs and gestures to communicate their thoughts and Lakoff, R. (1975). Language and Woman’s Place. New York: needs (e.g., talking about shoveling snow, requesting some- Harper and Row. one to open a jar). McConnell-Ginet, S. (1989). The sexual (re-)production of mean- Still other evidence for thinking without language has to ing: A discourse-based theory. In F. W. Frank and P. A. Tre- do with mental images. Scientists and writers as well as ichler, Eds., Language, Gender, and Professional Writing: Theoretical Approaches and Guidelines for Nonsexist Usage. visual artists have claimed that some of their most creative New York: Modern Language Association. work was inspired by their mental images. One of the best Milroy, J., L. Milroy, S. Hartley, and D. Walshaw. (1995). Glottal known examples, perhaps, is James Watson and Francis stops and Tyneside glottalization: Competing patterns of varia- Crick’s discovery of the double helix structure of DNA. tion and change in British English. Language Variation and Albert Einstein was another self-described visual thinker. It Change 6: 327–357. seems, then, brilliant as well as mundane thought is emi- Ochs, E. (1992). Indexing gender. In A. Duranti and C. Goodwin, nently possible without language. Eds., Rethinking Context: Language as an Interactive Phe- Does language shape or even dictate thought? The lin- nomenon. Cambridge: Cambridge University Press, pp. 335– guistic anthropologist Edward SAPIR argued, “No two lan- 358. 
guages are ever sufficiently similar to be considered as Philips, S. U., S. Steele, and C. Tanz, Eds. (1987). Language, Gen- der, and Sex in Comparative Perspective. Cambridge: Cam- representing the same social reality.” His student, Benjamin bridge University Press. Lee Whorf, asserted, “The world is presented in a kaleido- Sniezek, J. A., and C. H. Jazwinski. (1986). Gender bias in scopic flux of impressions which has to be organized . . . English: In search of fair language. Journal of Applied Social largely by the linguistic systems in our minds.” The Sapir- Psychology 16: 642–662. Whorf hypothesis has two tenets: LINGUISTIC RELATIVITY Tannen, D. (1994). Talking from 9 to 5: How Women’s and Men’s HYPOTHESIS (i.e., structural differences between languages Conversational Styles Affect Who Gets Heard, Who Gets will generally be paralleled by nonlinguistic cognitive dif- Credit, and What Gets Done at Work. New York: William Mor- ferences) and linguistic determinism (i.e., the structure of a row. language strongly influences or fully determines the way its Vetterling-Braggin, M., Ed. (1981). Sexist Language: A Modern native speakers perceive and reason about the world). Philosophical Analysis. Totawa, NJ: Littlefield. Language and Thought 445 John Lucy, an anthropologist, has written about language Our cognition also seems to shape our language. For differences associated with perceptual differences. For instance, when asked in their native language “Paul amazes example, speakers of languages with different basic color Mary. Why?” and “Paul admires Mary. Why?”, the psychol- vocabularies might sort nonprimary colors (e.g., turquoise, ogist Roger Brown found that both Chinese and English chartreuse) in slightly different ways. But such subtle speakers tended to talk about something amazing about Paul effects were hardly what Sapir and Whorf had in mind when and something admirable about Mary. 
they wrote about how language might be related to, or might even shape, its speakers’ worldview (e.g., time, causality, ontological categories).

One notable exception is the psychologist Alfred Bloom’s intriguing claim that the lack of a distinct counterfactual marker in the Chinese language might make it difficult for Chinese speakers to think counterfactually, that is, to think hypothetically about what is not true (e.g., If Plato had been able to read Chinese, he could have . . .). Upon close scrutiny, however, Chinese speakers’ purported difficulty in understanding Bloom’s counterfactual stories disappeared when researchers such as Terry Au and Lisa Liu rewrote those stories in idiomatic Chinese with proper counterfactual markers. With 20/20 hindsight, perhaps we should have realized that Bloom’s initial finding had to be too fascinating to be true. Note that when we feel lucky and realize that things could have turned out badly but didn’t, or when we regret having done something and wish that we had acted differently, we have to think counterfactually. How can something so fundamental and pervasive in human thinking be difficult in any human language?

Despite early disappointing efforts to uncover evidence for language shaping thought, a “Whorfian Renaissance” seems to be in the making. For instance, the anthropologist Stephen Levinson has reported interesting variations in spatial language across cultures (see LANGUAGE AND CULTURE). However, how each language carves up space seems to be principled, influenced by the direction of gravity (e.g., up, down), human perception (e.g., near, far, front, back), and so forth, rather than random or arbitrary. Moreover, there is as yet no evidence for fundamental differences in spatial cognition associated with linguistic variations. For example, unlike some Papuan languages, English has no simple word meaning “that far away up there.” Are English speakers less capable than Papuan speakers of construing such a location? Probably not. While the jury is still out on the Sapir-Whorf hypothesis, it is probably safe to say that important aspects of our worldview are unlikely to be at the mercy of arbitrary aspects of our language.

How about the other way around? By virtue of being human, we tend to perceive, organize, and reason about the world in certain ways. Do languages build upon our perceptual categories and conceptual organization? Consider color perception. Four-month-old babies prefer looking at primary colors (red, blue, green, yellow) to colors near the boundaries of primary colors; toddlers can identify primary colors better than nonprimary ones. Interestingly, anthropologists Brent Berlin and Paul Kay found that, if a language has fewer than five basic color words, it would include red, blue, green, and yellow (in addition to black and white). Nonprimary colors such as brown, pink, orange, and purple are encoded only in languages that have encoded the four primary colors (see COLOR CATEGORIZATION). Perceptual salience, then, seems to shape the encoding of color words rather than the other way around.

Note that in English, “amazing” and “admirable,” rather than “amazable” and “admiring,” are entrenched adjectives for describing people’s disposition. These cognitive causal schemas, then, might be universal and may have influenced the derivation of dispositional adjectives in English, rather than the other way around.

One more central issue in the study of language and thought: Are our abilities to learn and use language part of our general intelligence? Or are they subserved by a special “language faculty”? Recent findings on language-specific impairments (i.e., language delay or disorder experienced by children who are not hearing or cognitively impaired) and Williams syndrome (i.e., extreme mental retardation with almost intact language abilities) suggest that cognition and language can be decoupled (see LANGUAGE IMPAIRMENT, DEVELOPMENTAL). In short, although language and thought might be quite modular (see also MODULARITY OF MIND), there is some evidence for our cognition and perception shaping the evolution of our language. Evidence for influence in the opposite direction, however, seems more elusive.

See also CATEGORIZATION; IMAGERY; MODULARITY AND LANGUAGE; NATIVISM; VYGOTSKY

—Terry Au

References

Au, T. K. (1983). Chinese and English counterfactuals: The Sapir-Whorf hypothesis revisited. Cognition 15: 155–187.
Au, T. K. (1986). A verb is worth a thousand words: The causes and consequences of interpersonal events implicit in language. Journal of Memory and Language 25: 104–122.
Baillargeon, R. (1993). The object concept revisited: New directions in the investigation of infants’ physical knowledge. In C. Granrud, Ed., Visual Perception and Cognition in Infancy. Hillsdale, NJ: Erlbaum, pp. 265–315.
Bellugi, U., A. Bihrle, H. Neville, and S. Doherty. (1992). Language, cognition, and brain organization in a neurodevelopmental disorder. In M. Gunnar and C. Nelson, Eds., Developmental Behavioral Neuroscience. Hillsdale, NJ: Erlbaum, pp. 201–232.
Berlin, B., and P. Kay. (1969). Basic Color Terms: Their Universality and Evolution. Berkeley and Los Angeles: University of California Press.
Bloom, A. H. (1981). The Linguistic Shaping of Thought: A Study in the Impact of Language on Thinking in China and the West. Hillsdale, NJ: Erlbaum.
Bornstein, M. H. (1975). Qualities of color vision in infancy. Journal of Experimental Child Psychology 19: 410–419.
Brown, R., and D. Fish. (1983). The psychological causality implicit in language. Cognition 14: 237–273.
Chomsky, N. (1959). A review of B. F. Skinner’s monograph “Verbal Behavior.” Language 35: 26–58.
Goldin-Meadow, S., and C. Mylander. (1990). Beyond the input given: The child’s role in the acquisition of language. Language 66: 323–355.
Leonard, L. (1987). Is specific language impairment a useful construct? In S. Rosenberg, Ed., Advances in Applied Psycholinguistics. Vol. 1, Disorders of First-language Development. Cambridge: Cambridge University Press.
Levinson, S. C. (1996). Language and space. Annual Review of Anthropology 25: 353–382.
Lucy, J. A. (1992). Language Diversity and Thought: A Reformulation of the Linguistic Relativity Hypothesis. Cambridge: Cambridge University Press.
Pinker, S. (1994). The Language Instinct. New York: William Morrow.
Shepard, R. N. (1978). The mental image. American Psychologist 33: 125–137.
Wynn, K. (1992). Addition and subtraction in human infants. Nature 358: 749–750.

Further Readings

Au, T. K. (1988). Language and cognition. In L. Lloyd and R. Schiefelbusch, Eds., Language Perspectives II. Austin, TX: Pro-Ed, pp. 125–146.
Au, T. K. (1992). Counterfactual reasoning. In G. R. Semin and K. Fiedler, Eds., Language, Interaction, and Social Cognition. London: Sage, pp. 194–213.

Language Change

See CREOLES; LANGUAGE VARIATION AND CHANGE; PARAMETER-SETTING APPROACHES TO ACQUISITION, CREOLIZATION, AND DIACHRONY; TYPOLOGY

Language Development

See INNATENESS OF LANGUAGE; LANGUAGE ACQUISITION; PHONOLOGY, ACQUISITION OF; SEMANTICS, ACQUISITION OF

Language Evolution

See EVOLUTION OF LANGUAGE; LANGUAGE VARIATION AND CHANGE

Language Impairment, Developmental

It has been estimated that approximately thirteen percent of all children have some form of language impairment (Beitchman et al. 1986a). The most common known causes of developmental language impairments are hearing loss (including intermittent hearing loss resulting from chronic otitis media), general MENTAL RETARDATION, neurological disorders such as lesions or epilepsy affecting the auditory processing or language areas of the brain, and motor defects affecting the oral musculature. Many other developmental disorders, such as pervasive developmental disability (including AUTISM), attention deficit disorder, central auditory processing disorder, and Down’s syndrome, may include delay in language development. In addition to these known causes of developmental language impairment, a recent epidemiological study of monolingual English-speaking kindergarten children in the United States found that approximately 8 percent of boys and 6 percent of girls have a significant developmental language impairment of unknown origin, referred to as specific language impairment (SLI; Tomblin et al. 1997). This epidemiological study also showed that the clinical identification of language impairments remains low. Only 29 percent of the parents of children identified as having SLI had previously been informed that their child had a speech or language problem.

The differential diagnosis of developmental language impairments is based on behavioral evaluations that include audiological, neurological, psychological, and speech and language testing. Developmental language disorders are divided into two basic categories, expressive language disorder and mixed receptive-expressive language disorder, a disorder encompassing both language comprehension and production deficits (DSM-IV 1994). Comprehension or production problems may occur within one or more of the components of language, including PHONOLOGY, MORPHOLOGY, SEMANTICS, or SYNTAX. Problems with PRAGMATICS, that is, conversational skills, also occur frequently. A high proportion of children with developmental language disorders also have concomitant speech articulation defects, that is, they have difficulty clearly and correctly producing one or more of the speech sounds of their language. However, speech articulation defects and developmental language impairments can occur independently of each other.

Developmental language impairment has been shown to be a risk factor for other childhood disorders. For example, epidemiological studies showed that children referred to child guidance clinics for a variety of social and emotional conditions were found to have a higher-than-expected incidence of developmental language disorders (Beitchman et al. 1986b). Conversely, children diagnosed with developmental language disorders also have been found, upon examination, to have a preponderance of behavioral and emotional disorders (Cantwell, Baker, and Mattison 1979). Longitudinal research studies that have followed children with early developmental language impairments prospectively from the preschool through elementary school years have demonstrated a striking link between early developmental language impairments and subsequent learning disabilities, especially DYSLEXIA, a developmental reading disability (Aram et al. 1984; Bishop and Adams 1990; Rissman, Curtiss, and Tallal 1990; Catts 1993). Research that has compared children classified as SLI with those classified as dyslexic has shown that both groups are characterized by a variety of oral language deficits, specifically phonological analysis deficits (Liberman et al. 1974; Wagner and Torgesen 1987; Shankweiler et al. 1995). Whether the phonological deficit derives from speech-specific mechanisms, or from more basic acoustic processing deficits, has been the focus of considerable research and theoretical debate.

Phonological processing deficits are generally accompanied by central auditory processing disorders, particularly in the areas of AUDITORY ATTENTION and serial memory. These processing problems may result from a more basic impairment in the rate of neural information processing, specifically a severe backward masking deficit (Tallal, Miller, and Fitch 1993a; Tallal et al. 1993b; Wright et al. 1997). Tallal and colleagues have shown that children with phonologically based speech, language, and READING disorders need significantly more neural processing time (hundreds of milliseconds instead of tens of milliseconds) between brief, rapidly successive acoustic cues in order to process them correctly. This slowed processing rate has a particularly detrimental effect on phonological processing. Many acoustic changes occurring within syllables and words, necessary to distinguish the individual phonological elements (speech sounds), occur within the tens-of-milliseconds time window. Research has demonstrated that perception of those speech syllables that incorporate rapidly successive acoustic cues is most problematic for these children (see Tallal et al. 1993a for review).

These findings linking slow auditory processing rate and phonological processing deficits to developmental language and reading disorders have recently led to the development of novel remediation strategies for the treatment of developmental language-learning impairments. In a treatment-controlled study, using speech that was acoustically computer modified to extend and amplify the rapidly successive components, it was shown that intensive, individually adaptive daily training with computer-based exercises resulted in highly significant gains in auditory processing rate, speech discrimination, and language comprehension of syntax and grammar (Merzenich et al. 1996; Tallal et al. 1996).

There also have been considerable advances made in understanding the specific linguistic deficits of these children. Research has demonstrated particular difficulty with components of syntax and morphology (see Leonard 1998 for review). The results of longitudinal studies have shown, however, that children with developmental language impairments develop the linguistic structures of language along a similar linguistic trajectory to that observed in normal younger children. There is little evidence that these children make linguistic errors that are deviant or substantially different from younger children at the same stage of language development. Rather, children with language impairments take much longer to progress through the stages of normal language development (Curtiss, Katz, and Tallal 1992). A similar pattern of delay, rather than deviance, occurs across most populations of children with developmental language impairment. Looking cross-linguistically at children with developmental language impairments learning different languages in different countries, it also has been shown that whatever linguistic structures are unstressed or of weak phonetic substance in a particular language are the most difficult for children to learn and also the most delayed in children with language impairments (Leonard 1992). That the same order of development occurs across populations and language environments strongly supports the idea that there is a potent metric of sheer difficulty (whether representational or perceptual) that is imposed on the learning of linguistic structures and contents.

There is growing evidence from family and twin studies that developmental language impairments may aggregate in families and may be genetically transmitted. Twin studies have shown a high concordance rate (heritability) for measures of phonological analysis (Bishop, North, and Donlan 1995). As a group, infants born into families with a positive history of developmental language impairment demonstrate longer processing times than matched infants born into families with a negative family history for SLI. When followed prospectively, the processing rate established at six months of age has been shown to predict the rate of language development, with the infants showing impaired processing rate subsequently being most likely to be delayed in language development (Benasich and Tallal 1996). It also has been found that adults with dyslexia, or who have a family history of language-learning impairments, show significant psychoacoustic deficits, particularly a slower auditory processing rate. These adults also show poorer phonological analysis abilities when compared with matched adults who are family history–negative for developmental language learning disorders (Tomblin, Freese, and Records 1992).

Recently, brain imaging technologies also have been used to examine the neurobiological basis of developmental language impairment. Electrophysiological studies support the results of behavioral studies showing specific deficits in acoustic analysis and phonological processing (Kraus et al. 1995). Results from MAGNETIC RESONANCE IMAGING (MRI) show that the pars triangularis (Broca’s area) is significantly smaller in the left hemisphere of children with SLI and that these children are more likely to have rightward asymmetry of language structures. The opposite pattern is seen in children developing language normally (Jernigan et al. 1991; Gauger, Lombardino, and Leonard 1997).

See also APHASIA; GRAMMAR, NEURAL BASIS OF; LANGUAGE, NEURAL BASIS OF; PHONOLOGY, NEURAL BASIS OF

—Paula Tallal

References

Aram, D. M., B. L. Ekelman, and J. E. Nation. (1984). Preschoolers with language disorders: 10 years later. Journal of Speech and Hearing Research 27: 232–244.
Beitchman, J. H., R. Nair, M. Clegg, and P. G. Patel. (1986a). Prevalence of speech and language disorders in 5-year-old kindergarten children in the Ottawa-Carleton region. Journal of Speech and Hearing Disorders 51: 98–110.
Beitchman, J. H., R. Nair, M. Ferguson, and P. G. Patel. (1986b). Prevalence of psychiatric disorders in children with speech and language disorders. Journal of the American Academy of Child Psychiatry 24: 528–535.
Benasich, A. A., and P. Tallal. (1996). Auditory temporal processing thresholds, habituation, and recognition memory over the first year. Infant Behavior and Development 19: 339–357.
Bishop, D. V., and C. Adams. (1990). A prospective study of the relationship between specific language impairment, phonological disorders and reading retardation. Journal of Child Psychology and Psychiatry and Allied Disciplines 31: 1027–1050.
Bishop, D. V. M., T. North, and C. Donlan. (1995). Genetic basis of specific language impairment: Evidence from a twin study. Developmental Medicine and Child Neurology 37: 56–71.
Cantwell, D. P., L. Baker, and R. E. Mattison. (1979). The prevalence of psychiatric disorder in children with speech and language disorder: An epidemiological study. Journal of the American Academy of Child Psychiatry 18(3): 450–461.
Catts, H. W. (1993). The relationship between speech-language impairments and reading disabilities. Journal of Speech and Hearing Research 36: 948–958.
Curtiss, S., W. Katz, and P. Tallal. (1992). Deviance versus delay in language acquisition of language impaired children. Journal of Speech and Hearing Research 35: 373–383.
Diagnostic and Statistical Manual of Mental Disorders, 4th ed. (DSM-IV). (1994). Washington, DC: American Psychiatric Association.
Gauger, L. M., L. J. Lombardino, and C. M. Leonard. (1997). Brain morphology in children with specific language impairment. Journal of Speech, Language, and Hearing Research 40: 1272–1284.
Jernigan, T. L., J. R. Hesselink, E. Sowell, and P. Tallal. (1991). Cerebral structure on magnetic resonance imaging in language and learning impaired children. Archives of Neurology 48: 539–545.
Kraus, N., T. McGee, T. Carrell, C. King, K. Tremblay, and T. Nicol. (1995). Central auditory system plasticity with speech discrimination. Journal of Cognitive Neuroscience 7: 25–32.
Leonard, L. (1992). Specific language impairments in three languages: Some cross-linguistic evidence. In P. Fletcher and D. Hall, Eds., Specific Speech and Language Disorder in Children. London: Whurr, pp. 119–126.
Leonard, L. B. (1998). Children with Specific Language Impairment. Cambridge, MA: MIT Press.
Liberman, I., D. Shankweiler, F. W. Fischer, and B. Carter. (1974). Explicit syllable and phoneme segmentation in the young child. Journal of Experimental Child Psychology 18: 201–212.
Merzenich, M., W. Jenkins, P. S. Johnston, C. Schreiner, S. L. Miller, and P. Tallal. (1996). Temporal processing deficits of language-learning impaired children ameliorated by training. Science 271: 77–80.
Rissman, M., S. Curtiss, and P. Tallal. (1990). School placement outcomes of young language impaired children. Journal of Speech and Language Pathology and Audiology 14: 49–58.
Shankweiler, D., S. Crain, L. Katz, A. Fowler, A. Liberman, S. Brady, R. Thornton, E. Lundquist, L. Dreyer, J. Fletcher, K. Stuebing, S. Shaywitz, and B. Shaywitz. (1995). Cognitive profiles of reading disabled children: Comparisons of language skills in phonology, morphology, and syntax. Psychological Science 6(3): 149–155.
Tallal, P., S. Miller, and R. H. Fitch. (1993a). Neurobiological basis of speech: A case for the preeminence of temporal processing. Annals of the New York Academy of Sciences 682: 27–47.
Tallal, P., A. M. Galaburda, R. R. Llinas, and C. von Euler, Eds. (1993b). Temporal information processing in the nervous system: Special reference to dyslexia and dysphasia. Annals of the New York Academy of Sciences 682.
Tallal, P., S. L. Miller, G. Bedi, G. Byma, X. Wang, S. S. Nagarajan, C. Schreiner, W. M. Jenkins, and M. M. Merzenich. (1996). Language comprehension in language-learning impaired children improved with acoustically modified speech. Science 271: 81–84.
Tomblin, J. B., P. R. Freese, and N. L. Records. (1992). Diagnosing specific language impairment in adults for the purpose of pedigree analysis. Journal of Speech and Hearing Research 35: 332–343.
Tomblin, J. B., N. L. Records, P. Buckwalter, X. Zhang, E. Smith, and M. O’Brien. (1997). The prevalence of specific language impairment in kindergarten children. Journal of Speech, Language, and Hearing Research 40: 1245–1260.
Wagner, R. K., and J. K. Torgesen. (1987). The nature of phonological processing and its causal role in the acquisition of reading skills. Psychological Bulletin 102: 192–212.
Wright, B. A., L. J. Lombardino, W. M. King, C. S. Puranik, C. M. Leonard, and M. M. Merzenich. (1997). Deficits in auditory temporal and spectral processing in language-impaired children. Nature 387: 176–178.

Further Readings

Blachman, B., Ed. (1997). Foundations of Reading Acquisition and Dyslexia. Hillsdale, NJ: Erlbaum.
Farmer, M. E., and R. Klein. (1995). The evidence for a temporal processing deficit linked to dyslexia: A review. Psychonomic Bulletin and Review 2: 460–493.
Goswami, U., and P. Bryant. (1990). Phonological Skills and Learning to Read. Hillsdale, NJ: Erlbaum.
Mauer, D. M., and A. G. Kamhi. (1996). Factors that influence phoneme-grapheme correspondence learning. Journal of Learning Disabilities 29: 359–370.
Merzenich, M. M., and W. M. Jenkins. (1995). Cortical plasticity, learning and learning dysfunction. In B. Julesz and I. Kovacs, Eds., Maturational Windows and Adult Cortical Plasticity. Santa Fe, NM: Addison-Wesley, pp. 247–272.
Protopapas, A., M. Ahissar, and M. M. Merzenich. (1997). Auditory processing deficits in adults with a history of reading difficulty. Society for Neuroscience Abstracts 23: 491.
Reed, M. A. (1989). Speech perception and the discrimination of brief auditory cues in reading disabled children. Journal of Experimental Child Psychology 48: 270–292.
Tallal, P. (1980). Auditory temporal perception, phonics, and reading disabilities in children. Brain and Language 9: 182–198.
Tallal, P., D. Dukette, and S. Curtiss. (1989). Behavioral/emotional profiles of preschool language-impaired children. Development and Psychopathology 1: 51–67.
Tallal, P., J. Townsend, S. Curtiss, and B. Wulfeck. (1991). Phenotypic profiles of language-impaired children based on genetic/family history. Brain and Language 41: 81–95.

The following Web site contains information on developmental language disorders, as well as links to related sites: http://www.scientificlearning.com.

Language, Innateness of

See INNATENESS OF LANGUAGE; LANGUAGE ACQUISITION; PARAMETER-SETTING APPROACHES TO ACQUISITION, CREOLIZATION, AND DIACHRONY

Language, Neural Basis of

Investigations into the neural basis of language center around how the brain processes language. To do this, we must understand that language is a most complex function, one that encompasses numerous subprocesses, including the recognition and articulation of speech sounds, the comprehension and production of words and sentences, and the use of language in pragmatically appropriate ways. Underlying and interacting with these are also the functions of ATTENTION and MEMORY. All contribute in special ways to our ability to process language, and each may, in fact, be handled differently by the human brain. Classic neurolinguistic theories, developed over a hundred years ago, have suggested that certain brain areas play specific roles in the process of language. Since then, modern techniques have been offering us new data on what these areas might actually do and how they might contribute to a network of brain areas that collectively participate in language processing.

Most of our information about how the brain processes language has been gleaned from studies of brain-injured patients who have suffered strokes due to blockage of blood flow to the brain or from gunshot wounds to the head during wartime. In these cases, the language deficits resulting from these injuries have been compared to the areas of the brain which became lesioned.

In the past, such investigations had to rely on autopsy data obtained long after the behavioral data had been collected. These days, structural neuroimaging using computed tomography (CT) and MAGNETIC RESONANCE IMAGING (MRI) can help us view the location and extent of the brain lesion while behavioral data can be collected concurrently. Modern electrophysiology studies as well as functional neuroimaging with POSITRON EMISSION TOMOGRAPHY (PET) and functional magnetic resonance imaging (fMRI) are now being conducted with normal non–brain-injured subjects and are also beginning to assist in our understanding of the brain mechanisms involved in language.

Classic descriptions of the brain areas involved in language have largely implicated those in the left cerebral hemisphere known as Broca’s area, Wernicke’s area, and the connecting bundle of fibers between them, the arcuate fasciculus. These descriptions began in 1861 when Pierre Paul BROCA described his examination of a chronically ill patient with an unusual speech deficit that restricted his ability to communicate (Broca 1861). Whenever the patient attempted to speak, his utterance was reduced to the single recurring syllable “tan,” though he could intone it in different ways to change its meaning. When the patient died a few days after the examination, Broca discovered a lesion involving the posterior inferior frontal gyrus (i.e., the back part of the lowest section within the frontal lobe). Though Broca never cut the brain to examine the extent of this lesion, he suggested that this specific region was responsible for the articulation of speech. The area later became known as Broca’s area and the behavioral deficit as Broca’s APHASIA.

In 1874, Carl Wernicke reported on two patients with a language disturbance that was very different from the one Broca described (Wernicke 1874). These patients had difficulty with what Wernicke described as the “auditory memory for words.” In short, they had trouble understanding spoken language, even though their own speech was fluent and unencumbered. Wernicke examined the brain of one of these patients at autopsy and thought that the most significant area of damage in this patient was in the superior temporal gyrus (i.e., the top part of the temporal lobe). He concluded that this region was crucial to the purpose of language comprehension, and subsequently the disorder in language comprehension was referred to as Wernicke’s aphasia. Wernicke also developed an elaborate model of language processing that was revived by the neurologist Norman GESCHWIND in the 1960s, and which has formed the basis of current investigations (Geschwind 1965, 1972).

The Wernicke-Geschwind model holds that the comprehension and formulation of language is dependent on Wernicke’s area, after which the information is transmitted over the arcuate fasciculus to Broca’s area where it can be prepared for articulation. In general, data from patients with lesions in these areas support this model. Those patients with injuries that involve temporal lobe structures have easily articulated speech but do not always understand spoken or written language. Those with frontal lobe lesions generally have speech or articulation difficulty but tend to understand simple sentences fairly well.

Like most theories, this one has its problems. First, the correlation between lesions to Broca’s area and language deficits is far from perfect. Lesions to Broca’s area alone never result in a persisting Broca’s aphasia. The language deficits that are seen in these patients during the first few weeks after the injury always resolve into a mild or nonexistent problem (Mohr 1976). Several patients have also been reported who suffer from a persisting Broca’s aphasia with no involvement of Broca’s area whatsoever (Dronkers et al. 1992). Furthermore, Broca’s original patient, on whom a significant part of the model is based, had multiple events that led to his aphasia. Since Broca never cut the brain, he could not have seen that other brain areas were also affected by these injuries.

Similar discrepancies can be seen with regard to Wernicke’s area. Lesions to this area alone do not result in a persisting language comprehension problem, nor do all chronic Wernicke’s aphasia patients have lesions in Wernicke’s area (Dronkers, Redfern, and Ludy 1995). In fact, the data that originally predicted the participation of Wernicke’s area in language comprehension were based on Wernicke’s two problematic cases. One of these was noted to have resolved the comprehension problem after seven weeks and was never even brought to autopsy, while the second was a demented patient with numerous other pathological changes besides just those in the superior temporal gyrus. Current findings are showing that lesions must encompass far more than just the inferior frontal gyrus or superior temporal gyrus to produce the respective chronic Broca’s or Wernicke’s aphasia.

Still, those who study language and the brain have clung to traditional theory for several good reasons. First, there have been no good substitute theories that can explain as much of the data as the classic theories have been able to do. Most aphasic patients do show the pattern of behaviors described by Broca, Wernicke, and Geschwind. Furthermore, physicians and speech pathologists have found it easy to diagnose and treat their aphasic patients within this framework. While the original cases may have been faulty, the theories that resulted have served to answer most of the questions that surround these patients.

Modern techniques and technologies may gradually be changing the classic model. Most of these contributions concern the roles of traditional language areas, the possibility that other brain areas might also be important, and the likelihood that language processing involves a network of brain areas that contribute in individual but interactive ways. Being so new, these conclusions have not yet made it into neuroscience or linguistics textbooks. Still, it is clear that the classic model, despite its important contributions to our understanding of language mechanisms in the brain, will see some revision in the next decade.

Take the elusive role of Broca’s area as an example. Though Broca thought it was concerned only with the articulation of speech, it has since been associated with many functions. Work in the 1970s included the manipulation of grammatical rules as a function of Broca’s area, since those patients with a “Broca’s aphasia” and lesions encompassing Broca’s area had difficulty in using and comprehending grammatical information (Zurif, Caramazza, and Meyerson 1972). Recent functional imaging work with PET has suggested that it may play a role in short-term memory for linguistic information (Stromswold et al. 1996). Still other PET studies conclude that it is part of an articulatory loop (Paulesu, Frith, and Frackowiak 1993), while those that involve electrically stimulating the exposed brain during neurosurgery specify it as an end stage for motor speech (Ojemann 1994).

Lesion studies, coupled with high-resolution structural neuroimaging, continue to give us new information regarding other brain regions that might participate in speech and language. One new area that may participate in the articulation of speech is deep in the insula, the island of cortex that lies beneath the frontal, temporal, and parietal lobes. A recent study used computerized lesion reconstructions to find that a very specific area of the insula (high on its precentral gyrus) was lesioned in twenty-five stroke patients who all had a disorder in coordinating the movements for speech articulation (Dronkers 1996). Nineteen stroke patients without this particular disorder had lesions that spared the same area.

The insula is also lesioned in the majority of cases of Broca’s aphasia (Vanier and Caplan 1990). This is not surprising, since Broca’s aphasia patients have trouble in coordinating articulatory movements in addition to their other language deficits. Even Broca’s original case had a large lesion that included the insula, as confirmed by a CT scan of the preserved brain done a hundred years later (Signoret et al. 1984). The fact that Broca’s aphasia requires a large lesion that involves multiple brain areas supports the idea that many different regions must participate in the normal processing of language.

The functional imaging literature has given us new insight into areas of the brain that are actively engaged during a language task. Some of these areas would not have been detected from traditional lesion studies because the vascular supply to the brain is more susceptible to stroke in certain areas than others. For example, the supplementary motor area is consistently activated in functional imaging studies involving speech, but strokes to this area are relatively uncommon or do not come to the attention of those who study or treat language disorders. The same holds for posterior temporal areas that appear to be active in word form recognition. Also, functional imaging studies can signal the involvement of the right hemisphere in a given speech or language task, in addition to activation of traditional areas within the left hemisphere.

One area of the frontal lobe that has received a fair amount of attention in the functional imaging literature is the left inferior prefrontal cortex, the area of the brain in front of and below Broca’s area. Peterson and colleagues found it to be activated in semantic retrieval tasks where subjects generated verbs associated with nouns presented to them (Peterson et al. 1988). Others have also found it involved in tasks requiring word retrieval or semantic encoding (e.g., Demb et al. 1995; Warburton et al. 1996). Still, it is not clear whether the role played by this area is one that is truly related to language or whether it is related to attention or executive functioning and merely plays an assistive role in language processing. Patients with lesions to this area do show deficits on constrained verbal fluency tasks in which they must generate words that begin only with certain letters or belong only to certain semantic categories, yet these patients are not obviously aphasic. Thus, the true contribution of this area to language must still be determined.

The basal temporal area is another region that was not implicated in classic models. This region lies at the base of the temporal lobe and is not usually affected by stroke, though it can be the source of epileptic seizure activity. Some epileptic patients have had electrodes temporarily placed under the skull directly on the cortex to monitor seizure activity. These electrodes can also deliver small electrical charges to the cortex that interfere with normal functioning. When placed over the basal temporal area, stimulation disrupts patients’ ability to name objects, implying that this area is somehow involved in word retrieval (Luders et al. 1986). In a different kind of study, an epileptic patient with a seizure focus in the basal temporal area had an aphasia associated with the duration of the seizures that resolved once the seizures were stopped (Kirshner et al. 1995). All these data are derived from a different source than are most of the data from stroke patients, but still provide strong evidence that this area may also be important for normal language processing.

There are several other areas that may also contribute to language in their own way. The cingulate gyrus has been implicated in word retrieval, possibly because of its role in maintaining attention to the task. The anterior superior temporal gyrus, just in front of primary auditory cortex, may also play a role in sentence comprehension because of its rich connections to hippocampal structures important in memory. The fact that there are so many new brain regions emerging in modern lesion and functional imaging studies of language suggests that the classic Wernicke-Geschwind model, though useful for so many years, is now seen as oversimplified. Areas all over the brain are recruited for language processing; some are involved in lexical retrieval, some in grammatical processing, some in the production of speech, some in attention and memory. These new findings are still too fresh for any overarching theories to have developed that might explain how these areas interact. Future imaging and electrophysiological studies will undoubtedly show us not only the areas involved in language but also the recruitment of these areas at any stage of the process, the manner in which they interact, the time course of these activities, and the change in activation and allocation of resources relative to task complexity. The study of the neural mechanisms of language is evolving rapidly in conjunction with advances in the technologies that allow us to study it.

Other avenues of interest that are being pursued include the intriguing possibility that the brain may choose where to store lexical information depending on the semantic category to which the word belongs (Damasio et al. 1996; Martin et al. 1996). Another is that the brain might store and process language in different ways depending on the modality of acquisition (auditory vs. visual) (Neville, Mills, and Lawson 1992). Others question whether the brain mechanisms involved in language may differ for men and women, left-handers and right-handers, or monolinguals and bilinguals. These are all challenges for continued exploration whose findings are shaping contemporary models of language processing.

See also BILINGUALISM AND THE BRAIN; CORTICAL LOCALIZATION, HISTORY OF; GRAMMAR, NEURAL BASIS OF; HEMISPHERIC SPECIALIZATION; LEXICON, NEURAL BASIS; SIGN LANGUAGE AND THE BRAIN

—Nina F. Dronkers

References

Broca, P. (1861). Remarques sur le siège de la faculté du langage articulé, suivies d’une observation d’aphémie (perte de la parole). Bulletin de la Société Anatomique de Paris 6: 330–357.
Damasio, H., T. J. Grabowski, D. Tranel, R. D. Hichwa, and A. Damasio. (1996). A neural basis for lexical retrieval. Nature 499–505.
Stromswold, K., D. Caplan, N. Alpert, and S. Rauch. (1996). Localization of syntactic comprehension by positron emission tomography. Brain and Language 52: 452–473.
Vanier, M., and D. Caplan. (1990). CT-scan correlates of agrammatism. In L. Menn and L. Obler, Eds., Agrammatic Aphasia: A Cross-Linguistic Narrative Sourcebook. Amsterdam: John Benjamins, pp. 37–114.
Warburton, E., R. Wise, C. Price, C. Weiller, U. Hadar, S. Ramsey, and R. Frackowiak. (1996). Noun and verb retrieval by normal subjects. Studies with PET. Brain 119: 159–179.
Wernicke, C. (1874). Der aphasische Symptomencomplex. Breslau: Kohn und Weigert.
Zurif, E. B., A. Caramazza, and R. Meyerson. (1972). Grammatical judgements of agrammatic aphasics. Neuropsychologia 10: 405–417.

Further Readings

Benson, D. F., and A. Ardila. (1996). Aphasia: A Clinical Perspective.
New York: Oxford University Press. Demb, J., J. Desmond, A. Wagner, C. Valdya, G. Glover, and J. Caplan, D. (1987). Neurolinguistics and Linguistic Aphasiology. Gabrieli. (1995). Semantic encoding and retrieval in the left New York: Cambridge University Press. inferior prefrontal cortex: A functional MRI study of task diffi- Dronkers, N., and R. T. Knight. (Forthcoming). The neural archi- culty and process specificity. Journal of Neuroscience 15(9): tecture of language disorders. In M. Gazzaniga, Ed., The Cog- 5870–5878. nitive Neurosciences. Dronkers, N. F. (1996). A new brain region for coordinating Goodglass, H. (1993). Understanding Aphasia. San Diego: Aca- speech articulation. Nature 384: 159–161. demic Press. Dronkers, N. F., B. B. Redfern, and C. A. Ludy. (1995). Lesion Stemmer, B., and H. Whitaker, Eds. (1998). Handbook of Neurol- localization in chronic Wernicke’s aphasia. Brain and Lan- inguistics. New York: Academic Press. guage 51(1): 62–65. Dronkers, N. F., J. K. Shapiro, B. Redfern, and R. T. Knight. Language of Thought (1992). The role of Broca’s area in Broca’s aphasia. Journal of Clinical and Experimental Neuropsychology 14: 52–53. Geschwind, N. (1965). Disconnexion syndromes in animals and man. Brain 88: 237–294. The language of thought hypothesis is an idea, or family of Geschwind, N. (1972). Language and the brain. Scientific Ameri- ideas, about the way we represent our world, and hence an can 226: 76–83. idea about how our behavior is to be explained. Humans are Kirshner, H., T. Hughes, T. Fakhoury, and B. Abou-Khalil. (1995). marvelously flexible organisms. The commuter surviving Aphasia secondary to partial status epilepticus of the basal tem- her daily trip to New York, the subsistence agriculturalist, poral language area. Neurology 45(8): 1616–1618. the inhabitant of a chaotic African state all thread their dif- Luders, H., R. P. Lesser, J. Hahn, D. S. Dinner, H. Morris, S. ferent ways through the maze of their day. 
This ability to Resor, and M. Harrison. (1986). Basal temporal language area adapt to a complex and changing world is grounded in our demonstrated by electrical stimulation. Neurology 36: 505–510. mental capacities. We navigate our way through our social Martin, A., C. L. Wiggs, L. G. Ungerleider, and J. V. Haxby. and physical world by constructing an inner representation, (1996). Neural correlates of category-specific knowledge. Nature 379: 649–652. an inner map of that world, and we plot our course from that Mohr, J. P. (1976). Broca’s area and Broca’s aphasia. In H. Whi- inner map and from our representation of where we want to taker and H. Whitaker, Eds., Studies in Neurolinguistics, vol. 1. get to. Our capacity for negotiating our complex and vari- New York: Academic Press, pp. 201–233. able environment is based on a representation of the world Neville, H., D. Mills, and D. Lawson. (1992). Fractionating lan- as we take it to be, and a representation of the world as we guage: Different neural subsystems with different sensitive would like it to be. In the language of FOLK PSYCHOLOGY— periods. Cerebral Cortex 2(3): 244–258. our everyday set of concepts for thinking about ourselves Ojemann, G. (1994). Cortical stimulation and recording in lan- and others—these are an agent’s beliefs and desires. Their guage. In A. Kertesz, Ed., Localization and Neuroimaging in interaction explains action. Thus Truman ordered the bomb- Neuropsychology. San Diego: Academic Press, pp. 35–55. ing of Japan because he wanted to end World War II as Paulesu, E., C. D. Frith, and R. S. J. Frackowiak. (1993). The neu- ral correlates of the verbal component of working memory. quickly as possible, and he believed that bombing offered Nature 362: 342–345. his best chance of attaining that end. Peterson, S. E., P. T. Fox, M. I. Posner, M. Mintun, and M. E. We represent—think about—many features of our world. Raichle. (1988). 
Positron emission tomographic studies of the We have opinions on politics, football, food, the best way to cortical anatomy of single-word processing. Nature 331: 585– bring up children, and much more. Our potential range of 589. opinion is richer still. You may not have had views on the Signoret, J., P. Castaigne, F. Lehrmitte, R. Abelanet, and P. pleasures of eating opossum roadkills, but now that you are Lavorel. (1984). Rediscovery of Leborgne’s brain: Anatomi- prompted, you quickly will. This richness of our cognitive cal description with CT scan. Brain and Language 22: 303– range is important to the language of thought hypothesis. Its 319. 452 Language of Thought defenders take our powers of MENTAL REPRESENTATION to nothing like the semantic equivalent of words. Thus the con- be strikingly similar to our powers of linguistic representa- cept “tiger” might itself be a complex semantic structure. tion. Neither language nor thought are stimulus bound: we (3) The minimal version leaves open the relationship can both speak and think of the elsewhere and the elsewhen. between Mentalese and natural languages. Perhaps only Both language and thought are counterfactual: we can both learning a natural language powers the development of speak and think of how the world might be, not just how it Mentalese. Perhaps learning a natural language enhances is. We can misrepresent the world; we can both say and and transforms the more rudimentary language of thought think that it is infested by gods, ghosts, and dragons. More- with which one begins. Perhaps learning a natural language over, thoughts and sentences can be indefinitely complex. is just learning to produce linguistic representations that are Of course, if sentences are too long and complex, we cease equivalent to those that can already be formulated in Men- to understand them. But this limit does not seem intrinsic to talese. 
our system of linguistic representation but is instead a fea- Jerry Fodor goes beyond the minimal language of ture of our capacities to use it. Under favorable circum- thought hypothesis. Fodor argues for Mentalese not just stances, our capacity to handle linguistic complexity from intentional psychology but from cognitive psychology. extends upwards; in unfavorable circumstances, down- He argues that our best accounts, and often our only wards. Moreover, the boundary between the intelligible and accounts, of cognitive abilities presuppose the existence of a the unintelligible is fuzzy and not the result of hitting the rich, language-like internal code. So, for example, any system’s walls. The same seems true of mental representa- account of rational action presupposes that rational agents tion. These similarities are no surprise. Although there may have a rich enough representational system to represent a be thoughts we cannot express, surely there are no utter- range of possible actions and possible outcomes. They must ances we cannot think. have the capacity not just to represent actual states of the The power of linguistic representation comes from the world but possible ones as well. Most importantly, learning organization of language. Sentences are structures built out in general, and concept acquisition in particular, depends on of basic units, words or morphemes. The meaning of the hypothesis formation in the inner code. You cannot learn the sentence—what it represents—depends on the meaning of concept “leopard” or the word leopard unless you already those words together with its structure. So when we learn a have an inner code in which you can formulate an appropri- language, we learn the words together with recipes for ate hypothesis about leopards. So Fodor thinks of Mentalese building sentences out of them. 
We thus acquire a represen- as semantically rich, with a large, word-like stock of basic tational system of great power and flexibility, for indefi- units. For example, he expects the concepts “truck,” “ele- nitely many complex representations can be constructed out phant,” and even “reactor” to be semantically simple. More- of its basic elements. Since mental representation exhibits over, this large stock of basic units is innate. Experience is these same properties, we might infer that it is organized in causally relevant to an agent’s conceptual repertoire. But we the same way. Thoughts consist of CONCEPTS assembled do not learn our basic concepts from experience. Concept acquisition is more like the development of secondary sex- into more complex structures. A minimal language of ual characteristics than like learning the dress code at the thought hypothesis is the idea that our capacities to think local pub. So Mentalese is independent of any natural lan- depend on a representational system, in which complex rep- guage we speak. The expressive power of natural language resentations are built from a stock of basic elements; the depends on the expressive power of Mentalese, not vice meaning of complex representations depend on their struc- versa. ture and the representational properties of those basic ele- The language of thought hypothesis has been enor- ments; and the basic elements reappear with the same mously controversial. One response focuses on the infer- meaning in many structures. This representational system is ence from intentional psychology to the language of “Mentalese.” thought. For example, Daniel Dennett has long argued that This minimal version of the language of thought hypoth- the relationship between an agent’s intentional profile—the esis leaves many important questions open. (1) Just how beliefs and desires she has—and her internal states is likely “languagelike” is the language of thought? Linguists to be very indirect. 
In a favorite illustration, he asks us to emphasize the complexity and abstractness of natural- consider a chess-playing computer. These play good chess, language sentence structure. Our thoughts might be com- so we treat them as knowing a lot about the game, as know- plex structures built out of simple elements without thought ing, for example, that the passed-rook pawn is dangerous. structures being as complex as those of natural language. We are right to do so, even though there is no single causally Mentalese may have no equivalent of the elaborate MOR- salient inner state that corresponds to that belief. Dennett PHOLOGY and PHONOLOGY of natural languages. Natural thinks that the relationship between our beliefs and our languages probably have features that reflect their history as causally efficacious inner states is likely to be equally indi- spoken systems. If so, these are unlikely to be part of Men- rect. I think this argument is best seen as a response to talese. (2) The minimal hypothesis leaves open the nature of Fodor’s strong version of the language of thought hypothe- the basic units. Perhaps the stock of concepts is similar to sis. The same is true of many other critical responses, for the stock of simple words of a natural language. Just as these often focus on Fodor’s denial that learning increases there are words for tigers and trucks, there are concepts of the expressive capacity of our thoughts. The Churchlands, them amongst the basic stock out of which thoughts are for example, have taken this to be a reduction of the lan- built. But the minimal version is also compatible with the guage of thought hypothesis itself, but if anything, it is a idea that the basic units out of which complexes are built are Language Production 453 reduction only of Fodor’s strong version of it. Connectionist ongoing time. This requires an explanation of the action models of cognition, on the other hand, do seem to be a system that puts language knowledge to use. 
threat to any version of a language of thought hypothesis, The action system for language production has a COGNI- for in connectionist mental representation, meaning is not a TIVE ARCHITECTURE along the lines shown in figure 1. function of the structure plus the meaning of the atomic Imagine a speaker who wishes to draw a listener’s attention units (see CONNECTIONISM, PHILOSOPHICAL ISSUES). to a rabbit browsing in the garden. The process begins with a communicative intention, a message, that stands at the See also IMAGERY; INTENTIONALITY; LANGUAGE AND interface between thought and language. Little is known THOUGHT; MENTAL CAUSATION; PROPOSITIONAL ATTITUDES about the content or structure of messages, but they are —Kim Sterelny assumed to include at least conceptual categorizations (in figure 1, tacit rabbit-knowledge) and the information needed References for making distinctions such as tense, number, aspect, and speaker’s perspective. Less certain is whether messages Churchland, P. S. (1986). Neurophilosophy. Cambridge, MA: MIT habitually include different kinds of information as a func- Press. tion of the language being spoken, along the lines proposed Dennett, D. C. (1987). True believers. In D. C. Dennett, The Inten- in the Sapir-Whorf hypothesis (see LINGUISTIC RELATIVITY tional Stance. Cambridge, MA: MIT Press. Fodor, J. A. (1975). The Language of Thought. Sussex: Harvester HYPOTHESIS; Slobin 1996). Press. Of primary interest to contemporary theories of produc- Fodor, J. A. (1981). The present status of the innateness contro- tion are the processing components dubbed grammatical and versy. In J. A. Fodor, Representations. Cambridge, MA: MIT phonological in figure 1 (following Levelt 1989). These are Press. the processes immediately responsible for recruiting the lin- Fodor, J. A., and Z. Pylyshyn. (1988). Connectionism and cogni- guistic information to create the utterances that convey mes- tive architecture: a critical analysis. Cognition 28: 3–71. sages. 
Grammatical encoding refers to the cognitive mechanisms for retrieving, ordering, and adjusting words for Further Readings their grammatical environments, and phonological encoding refers to the mechanisms for retrieving, ordering, and adjust- Clark, A. (1989). Microcognition: Philosophy, Cognitive Science, and Parallel Distributed Processing. Cambridge, MA: MIT ing sounds for their phonological environments. Press. The motivation for separating these components comes Clark, A. (1993). Associative Engines: Connectionism, Concepts, from several lines of evidence for a division between and Representational Change. Cambridge, MA: MIT Press. word-combining and sound-combining processes. Speech Fodor, J. A. (1987). Psychosemantics: The Problem of Meaning in errors suggest that there are two basic sorts of elements the Philosophy of Mind. Cambridge, MA: MIT Press. that are manipulated by the processes of production, Fodor, J. A. (1990). A Theory of Content and Other Essays. Cam- roughly corresponding to words and sounds (Dell 1995). bridge, MA: MIT Press. So-called tip-of-the-tongue states (the familiar frustration Harman, G. (1975). Language, thought, and communication. In K. of being unable to retrieve a word that one is certain one Gunderson, Ed., Minnesota Studies in the Philosophy of Sci- knows) can carry word-specific grammatical information, ence. Vol. 7, Language, Mind, and Knowledge. Minneapolis: University of Minnesota Press. in the absence of sound information (Miozzo and Cara- Lower, B., and G. Rey, Eds. (1991). Jerry Fodor and His Critics. mazza 1997; Vigliocco, Antonini, and Garrett 1997). Elec- Oxford: Blackwell, ch.11–13. trophysiological evidence also suggests that grammatical Smolensky, P. (1988). On the proper treatment of connectionism. information about words is accessible about 40 ms before Behavioral and Brain Sciences 11: 1–84. information about sounds (van Turennout, Hagoort, and Brown 1998). 
Finally, the arbitrariness of the linguistic Language Processing mapping from meaning to sound creates a computational problem that can only be solved by a mediating mecha- nism (Dell et al. 1997). These and other observations argue See LANGUAGE PRODUCTION; PSYCHOLINGUISTICS; SEN- that there are distinct grammatical and phonological TENCE PROCESSING; SPEECH PERCEPTION encoding mechanisms. Grammatical encoding Adult speakers of English know Language Production between 30,000 and 80,000 words. The average for high- school graduates has been estimated at 45,000 words. These Language production means talking, but not merely that. words can be arranged in any of an infinite number of ways When people talk, it is usually because they have something that conform to the grammar of English. The ramifications of to say. Psycholinguistic research on language production this can begin to be appreciated in the number of English sentences with 20 or fewer words, which is about 1030. concerns itself with the cognitive processes that convert nonverbal communicative intentions into verbal actions. Using these resources, speakers must construct utterances to These processes must translate perceptions or thoughts into convey specific messages. They normally do so in under two sounds, using the patterns and elements of a code that con- seconds, although disruptions are common enough that aver- stitutes the grammar of a language. For theories of language age speakers spend about half of their speaking time in not production, the goal is to explain how the mind uses this speaking—hemming, hawing, and pausing between three code when converting messages into spontaneous speech in and twelve times per minute (Goldman-Eisler 1968). These 454 Language Production Figure 1. A cognitive architecture for language production. 
disfluencies reflect problems in retrieving a suitable set of singular subject will be inflected differently than the same words and arranging them into a suitable syntactic structure. verb accompanying a plural subject: A razor cuts, whereas What is suitable, of course, is not merely a matter of gram- Scissors cut). Thus, speakers must recover information maticality (though it is also that) but of adequacy for convey- about grammatical class and morphology. ing a particular message to particular listeners in particular Lexical selection involves locating a lexical entry (tech- places and times. nically, a lemma) that adequately conveys some portion of a Lexical selection and retrieval are integral to grammatical message, ensuring that there exists a word in one’s mental processing because of the kinds of information that words lexicon that will do the job. A rough analogy is looking for a carry about their structural and positional requirements. In word in a reverse dictionary, which is organized semanti- everyday language use, words are rarely produced in isola- cally rather than alphabetically. If the desired meaning is tion. Instead, they occupy places within strings of words, listed in the dictionary with a single word that expresses the with their places determined in part by their grammatical sense, there is an appropriate word to be had in the lan- categories (e.g., in most English declarative sentences, at guage; if not, the search fails. The mental lexicon is presum- least one noun will precede a verb) and their MORPHOLOGY ably accessible in a comparable fashion, permitting determined in part by their positions with respect to other speakers to determine whether they know a word that con- words (e.g., a present-tense English verb accompanying a veys the meaning they intend. 
Most English speakers, for Language Production 455 The discrete-stage view (Levelt, Roelofs, and Meyer, forth- example, will find at least one lemma for their concept of a coming) argues that each step of the retrieval process is member of the family Oryctolagus cuniculus. completed before the next is begun (discrete processing), Locating a lemma yields basic information about how a and without feedback to higher level processes from lower word combines with other words. This corresponds to infor- levels (strict feedforward processing). In contrast, interac- mation about grammatical class (noun, verb, adjective, etc.) tive views (Dell et al. 1997) embrace the possibilities of par- and other grammatical features that control a word’s combi- tial information from one stage affecting the next (cascaded natorial privileges and requirements (e.g., nouns must be processing) and of information from lower levels affecting specified as mass or count, and if count, as singular or plu- higher levels of processing (feedback). ral; verbs must be specified as intransitive or transitive, and What is at stake in this theoretical battle, in part, is the if transitive, as simple or ditransitive, etc.; cf. LEXICON). The role that an explanation for speech errors should play in the lemma for an instance of Oryctolagus cuniculus, for exam- account of normal speech. Speech errors are a traditional ple, is a noun, count, and singular. These features in turn foundation for the study of language production (Dell 1986; affect the determination of syntactic functions such as sub- Fromkin 1973; Garrett 1988), and the properties of errors ject phrases, predicate phrases, and their arrangement (cf. have long been viewed as informative about the production SYNTAX). of error-free speech. 
Consider the word exchange by a Once a lemma is found, the word’s morphology (techni- speaker who intended to say minds of the speakers and cally, its lexeme) may have to be adjusted to its syntactic instead uttered “speakers of the minds.” Or the sound environment. In connected speech, this will encompass exchange that yielded “accipital octivity” instead of occipi- inflectional processes (e.g., making a verb singular or plu- tal activity. Such errors point to the embodiment of rule-like ral). Lexical retrieval processes yield an abstract specifica- constraints in the arrangement process. When words tion of the morphological structure of the selected word. So, exchange, they exchange almost exclusively with other retrieving the lexeme for the count noun that denotes a words from the same syntactic class (noun, verb, adjective, member of the family Oryctolagus cuniculus should yield a and so on). When sounds exchange, they exchange almost word stem for rabbit. exclusively with other sounds from the same phonological The role of active morphological processes in language class (consonant or vowel). The net effect is that erroneous use is currently disputed in some quarters. The issue is utterances are almost always grammatical, albeit often non- whether regularly inflected words are stored and retrieved sensical. In the spirit of exceptions proving the rule, this has from memory in the same way as uninflected words been taken to mean that speech errors represent small, local, (Seidenberg 1997) or require a separable set of specifically and most importantly, principled departures from normal inflectional operations (Marslen-Wilson and Tyler 1997). retrieval or assembly operations. Although this debate has been confined primarily to Models of word production that incorporate discrete research on word recognition, logically comparable issues stages have been less successful in accounting for the dis- arise regarding language production. 
In production, how- tribution and the features of speech errors than interactive ever, it may be harder to account for the available data with views, in part because of a difference in explanatory passive retrieval mechanisms (see Bock, Nicol, and Cutting, emphasis. Levelt, Roelofs, and Meyer, forthcoming) elabo- forthcoming). rate a discrete-stage approach that is designed primarily to Phonological encoding Words can be comfortably artic- account for experimental data about the time course of ulated at a rate of four per second, calling on more muscle word production, and not for errors, for the simple reason fibers than may be required for any other mechanical perfor- that errors are departures from the norm. What the produc- mance of the human body (Fink 1986). Yet errors in the pro- tion system does best, and remarkably well under the cir- duction of speech sounds are rare, only occurring once in cumstances, is retrieve words and sounds appropriate for every 5,000 words or so (Deese 1984). Controlling the speakers’ communicative intentions. Within the approach activity requires some specification of phonological seg- endorsed by Levelt, Roelofs, and Meyer, errors are a prod- ments (/r/, /æ/, /b/, etc.), syllabification, and metrical struc- uct of aberrations from the basic operating principles of the ture. The broad outlines of phonological encoding are production system and are correspondingly rare events. By sketched similarly in current theories (Dell et al. 1997; Lev- this argument, errors demand a separate theory. elt, Roelofs, and Meyer, forthcoming). Counter to the intu- Despite these differences, the leading theories of lan- ition that words are stored as wholes, sound segments are guage production are in agreement that knowledge about actually assembled into word forms during the encoding words comes in pieces and that the pieces are not recovered process. Consonantal and vocalic segments must be selected all at once. 
and assigned to positions within syllabic frames. Additionally, the syllables and sounds must be integrated into the stream of speech: In a sequence of words such as rabbit in, the /t/ in rabbit will be produced differently than it is in rabbit by. One implication is that a description of the sound segments in an isolated word is insufficient to explain the word’s form in connected speech.

Where theories of word production diverge is in their view of the relationship between phonological encoding and the higher level processes of lexical selection and retrieval.

In normal speaking, the semantic, syntactic, and phonological properties of words are called upon in quick succession, not simultaneously. Thus, what normally feels like a simple, unitary act of finding-and-saying-a-word is actually a complicated but fast assembly of separate, interlocking features. More broadly, speaking cannot be likened to the recitation of lines, as E. B. Titchener once did in describing it as “reading from a memory manuscript” (1909). It involves active, ongoing construction of utterances from rudimentary linguistic parts.

Communicative processes

All the processes of language production serve communication, but certain activities tailor messages and utterances to the needs of particular listeners at particular places and times. The tailoring requirements are legion. They range from such patent demands as choosing language (English? Spanish?) and gauging loudness (whisper? shout?) to the need to infer what the listener is likely to be thinking or capable of readily recollecting. These are aspects of PRAGMATICS (cf. PSYCHOLINGUISTICS). A common shortcoming of speakers is presuming too much, failing to anticipate the myriad misconstruals to which any given utterance or expression is vulnerable. The source of this presumptuousness is transparent: Speakers know what they intend. For them, there is no ambiguity in the message.

The direct apprehension of the message sets speakers apart from their listeners, for whom ambiguity is rife. In this crucial respect, language production has little in common with language comprehension. In other respects, however, successful communication demands that production and comprehension share certain competencies. They must somehow draw on the same linguistic knowledge, because we speak as well as understand our native languages.

The cognitive processing systems responsible for comprehension and production may nonetheless be distinct. Research on language disorders suggests a degree of independence between them, because people with disorders of production can display near-normal comprehension abilities, and vice versa (Caramazza 1997). At a minimum, the flow of information must differ, leading from meaning to sound in production and from sound to meaning in comprehension.

This simple truth about information flow camouflages the deep questions that are at stake in current debates about the isolability and independence of the several cognitive and linguistic components of production. The questions are a piece of the overarching debate about modularity, to do with whether language and its parts are in essence the same as other forms of cognition and, more broadly, whether all types of knowledge are in essence the same in acquisition and use. Accordingly, answers to the important questions about language production bear on our understanding of fundamental relationships between LANGUAGE AND THOUGHT, between free will and free speech, and between natural and artificial intelligence.

See also CONCEPTS; CONNECTIONIST APPROACHES TO LANGUAGE; MEANING; NATURAL LANGUAGE GENERATION; SENTENCE PROCESSING

—Kathryn Bock

References

Bock, K. (1995). Sentence production: From mind to mouth. In J. L. Miller and P. D. Eimas, Eds., Handbook of Perception and Cognition. Vol. 11, Speech, Language, and Communication. Orlando, FL: Academic Press, pp. 181–216.
Bock, J. K., J. Nicol, and J. C. Cutting. (Forthcoming). The ties that bind: Creating number agreement in speech. Journal of Memory and Language.
Caramazza, A. (1997). How many levels of processing are there in lexical access? Cognitive Neuropsychology 14: 177–208.
Deese, J. (1984). Thought into Speech: The Psychology of a Language. Englewood Cliffs, NJ: Prentice-Hall.
Dell, G. S. (1986). A spreading-activation theory of retrieval in sentence production. Psychological Review 93: 283–321.
Dell, G. S. (1995). Speaking and misspeaking. In L. R. Gleitman and M. Liberman, Eds., An Invitation to Cognitive Science. Vol. 1, Language. Cambridge, MA: MIT Press, pp. 183–208.
Dell, G. S., M. F. Schwartz, N. Martin, E. M. Saffran, and D. A. Gagnon. (1997). Lexical access in aphasic and nonaphasic speakers. Psychological Review 104: 801–838.
Fink, B. R. (1986). Complexity. Science 231: 319.
Fromkin, V. A., Ed. (1973). Speech Errors as Linguistic Evidence. The Hague: Mouton.
Garrett, M. F. (1982). Production of speech: Observations from normal and pathological language use. In A. Ellis, Ed., Normality and Pathology in Cognitive Functions. London: Academic Press, pp. 19–76.
Garrett, M. F. (1988). Processes in language production. In F. J. Newmeyer, Ed., Linguistics: The Cambridge Survey. Vol. 3, Language: Psychological and Biological Aspects. Cambridge: Cambridge University Press, pp. 69–96.
Goldman-Eisler, F. (1968). Psycholinguistics: Experiments in Spontaneous Speech. London: Academic Press.
Levelt, W. J. M. (1989). Speaking: From Intention to Articulation. Cambridge, MA: MIT Press.
Levelt, W. J. M., A. Roelofs, and A. S. Meyer. (Forthcoming). A theory of lexical access in speech production. Behavioral and Brain Sciences.
Marslen-Wilson, W. D., and L. K. Tyler. (1997). Dissociating types of mental computation. Nature 387: 592–594.
Miozzo, M., and A. Caramazza. (1997). Retrieval of lexical-syntactic features in tip-of-the-tongue states. Journal of Experimental Psychology: Learning, Memory and Cognition 23: 1410–1423.
Seidenberg, M. S. (1997). Language acquisition and use: Learning and applying probabilistic constraints. Science 275: 1599–1603.
Slobin, D. I. (1996). From “thought and language” to “thinking for speaking.” In J. Gumperz and S. C. Levinson, Eds., Rethinking Linguistic Relativity. Cambridge: Cambridge University Press.
Titchener, E. B. (1909). Lectures on the Experimental Psychology of the Thought-Processes. New York: Macmillan.
van Turennout, M., P. Hagoort, and C. M. Brown. (1998). Brain activity during speaking: From syntax to phonology in 40 milliseconds. Science 280: 572–574.
Vigliocco, G., T. Antonini, and M. F. Garrett. (1997). Grammatical gender is on the tip of Italian tongues. Psychological Science 8: 314–317.

Language Universals

See LANGUAGE AND CULTURE; LINGUISTIC UNIVERSALS AND UNIVERSAL GRAMMAR; TYPOLOGY

Language Use

See INTRODUCTION: LINGUISTICS AND LANGUAGE; DISCOURSE; MEANING; PRAGMATICS

Language Variation and Change

The speech of no two people is identical, so it follows that if one takes manuscripts from two eras, one will be able to identify differences and so point to language “change.” In this sense, languages are constantly changing in piecemeal, gradual, chaotic, and relatively minor fashion. However, historians also know that languages sometimes change abruptly, several things changing at the same time, and then settle into relative stasis, in a kind of “punctuated equilibrium.” So, all of the long vowels in English were raised (and the highest vowels diphthongized) in the famous Great Vowel Shift, which took place in late Middle English. Similarly, the language lost several uses of the verb be simultaneously in the nineteenth century (I wish our opinions were the same, but in time they will; you will be to visit me in prison; their being going to be married) and developed the first progressive passives: everything is being done (Warner 1995).

We may adopt a cognitive view of grammars, that they are mental entities that arise in the mind/brain of individual children. Hermann Paul (1880) was the first person to study change with roughly this view of grammars. Then it is natural to try to interpret cascades of changes in terms of changes in grammars, a new setting for some parameter, sometimes having a wide variety of surface effects and perhaps setting off a chain reaction. Such “catastrophic” changes are recognizable by the distinctive features discussed in Lightfoot 1991 (chap. 7). So grammatical approaches to language change have focussed on these large-scale changes, assuming that the clusters of properties tell us about the harmonies that follow from particular parameters. By examining the clusters of simultaneous changes and by taking them to be related by properties of Universal Grammar, we discover something about the scope and nature of parameters and about how they are set. Work on language change from this perspective is fused with work on language variation and acquisition. Change illuminates the principles and parameters of Universal Grammar in the same way that, when we view a forest at some distance, we may not see the deer until it moves.

This grammatical approach to diachrony explains changes at two levels. First, the set of parameters postulated as part of UG explains the unity of the changes, why superficially unrelated properties cluster in the way that they do. Second, historical records, where they are rich, show not only when catastrophic change takes place but also what kinds of changes were taking place in the language prior to the parametric shift. This enables us to identify in what ways the trigger experiences of children who underwent the parametric shift differed from those of people with the older grammar. This, in turn, enables us to hypothesize what the crucial trigger experience is for setting a given parameter. Recent work has treated this topic in terms of cue-based learning (Dresher and Kaye 1990; Dresher 1997; Lightfoot 1997): under this view, parameters have a designated cue; children scan their linguistic environment for the relevant cues and set parameters accordingly. The distribution of those cues may change in such a way that a parameter is set differently.

This model has nothing to say about why the distribution of the cues should change. That may be explained by claims about language contact or socially defined speech fashions, and it is a function of the use of grammars and not a function of theories of grammar, acquisition, or change—except under one set of circumstances, where the new distribution of cues results from an earlier parametric shift; in that circumstance, one has a “chain” of grammatical changes.

This approach to change is not tied to any particular grammatical model. Warner (1995) offers a persuasive analysis of parametric shift using a lexicalist HEAD-DRIVEN PHRASE STRUCTURE GRAMMAR model. Interesting diachronic analyses have been offered for a wide range of phenomena, invoking different grammatical claims; see the Further Readings for examples.

This approach to abrupt change, where children acquire different systems from those of their parents, is echoed in work on creolization (see CREOLES) under the view of Bickerton (1984), and the acquisition of signing systems by children exposed largely to unnatural input (Goldin-Meadow and Mylander 1990; Newport 1999; Supalla 1990; see SIGN LANGUAGES). Bickerton argues that situations in which “the normal transmission of well-formed language data from one generation to the next is most drastically disrupted” will tell us something about the innate component and how it determines acquisition (Bickerton 1999).

The vast majority of deaf children are exposed initially to fragmentary signed systems that have not been internalized well by their primary models. Goldin-Meadow and Mylander (1990) take these to be artificial systems, and they show how deaf children go beyond their models in such circumstances and “naturalize” the system, altering the code and inventing new forms that are more consistent with what one finds in natural languages. The acquisition of signed languages under these circumstances casts light on abrupt language change, creolization, and on cue-based learning (Lightfoot 1998).

There has been interesting work on the replacement of one grammar by another—that is, the spread of change through a community. So, Kroch and his associates (Kroch 1989; Kroch and Taylor 1997; Pintzuk 1990; Santorini 1992, 1993; Taylor 1990) have argued for coexisting grammars. That work postulates that speakers may operate with more than one grammar in a kind of “internalized diglossia” and it enriches grammatical analyses by seeking to describe the variability of individual texts and the spread of a grammatical change through a population.

Niyogi and Berwick (1995) have offered a population genetics computer model for describing the spread of new grammars. Certain changes progress in an S-curve and now Niyogi and Berwick provide a model of the emergent, global population behavior, which derives the S-curve. They postulate a learning theory and a population of child learners, a small number of whom fail to converge on preexisting grammars, and they produce a plausible model of population changes for the loss of null subjects in French.

Taking grammars to be elements of cognition has been productive for work on language change, but it is not as common an approach as one that takes grammars to be social entities. The distinction and its implications for historical linguistics are discussed by Lightfoot (1995). The approach described here is analogous to the study of evolutionary change in order to learn about general biological principles and about particular species.

See also LANGUAGE ACQUISITION; LANGUAGE AND CULTURE; LINGUISTIC UNIVERSALS AND UNIVERSAL GRAMMAR; PARAMETER-SETTING APPROACHES TO ACQUISITION, CREOLIZATION, AND DIACHRONY

—David Lightfoot

References

Bickerton, D. (1984). The language bioprogram hypothesis. Behavioral and Brain Sciences 7(2): 173–222.
Bickerton, D. (1999). How to acquire language without positive evidence: What acquisitionists can learn from creoles. In DeGraff, Ed. (1999).
DeGraff, M., Ed. (1999). Language Creation and Language Change. Cambridge, MA: MIT Press.
Dresher, B. E. (1997). Charting the learning path: Cues to parameter setting. To appear in Linguistic Inquiry.
Dresher, B. E., and J. Kaye. (1990). A computational learning model for metrical phonology. Cognition 137–195.
Goldin-Meadow, S., and C. Mylander. (1990). Beyond the input given: The child’s role in the acquisition of language. Language 66: 323–355.
Kemenade, A. van, and N. Vincent, Eds. (1997). Parameters of Morphosyntactic Change. Cambridge: Cambridge University Press.
Kroch, A. (1989). Reflexes of grammar in patterns of language change. Journal of Language Variation and Change 1: 199–244.
Kroch, A., and A. Taylor. (1997). Verb movement in Old and Middle English: Dialect variation and language contact. In van Kemenade and Vincent (1997).
Lightfoot, D. W. (1991). How to Set Parameters: Arguments from Language Change. Cambridge, MA: MIT Press.
Lightfoot, D. W. (1995). Grammars for people. Journal of Linguistics 31: 393–399.
Lightfoot, D. W. (1997). Catastrophic change and learning theory. Lingua 100: 171–192.
Lightfoot, D. W. (1998). The Development of Language: Acquisition, Change, and Evolution. Oxford: Blackwell.
Newport, E. L. (1999). Reduced input in the acquisition of signed languages: Contributions to the study of creolization. In DeGraff, Ed. (1999).
Niyogi, P., and R. C. Berwick. (1995). The logical problem of language change. MIT A. I. Memo No. 1516.
Paul, H. (1880). Prinzipien der Sprachgeschichte. Tübingen: Niemeyer.
Pintzuk, S. (1990). Phrase Structures in Competition: Variation and Change in Old English Word Order. Ph.D. diss., University of Pennsylvania.
Santorini, B. (1992). Variation and change in Yiddish subordinate clause word order. Natural Language and Linguistic Theory 10: 595–640.
Santorini, B. (1993). The rate of phrase structure change in the history of Yiddish. Journal of Language Variation and Change 5: 257–283.
Supalla, S. (1990). Segmentation of Manually Coded English: Problems in the Mapping of English in the Visual/Gestural Mode. Ph.D. diss., University of Illinois.
Taylor, A. (1990). Clitics and Configurationality in Ancient Greek. Ph.D. diss., University of Pennsylvania.
Warner, A. R. (1995). Predicting the progressive passive: Parametric change within a lexicalist framework. Language 71(3): 533–557.

Further Readings

Battye, A., and I. Roberts, Eds. (1995). Clause Structure and Language Change. Oxford: Oxford University Press.
Clark, R., and I. Roberts. (1993). A computational approach to language learnability and language change. Linguistic Inquiry 24: 299–345.
Fontana, J. M. (1993). Phrase Structure and the Syntax of Clitics in the History of Spanish. Ph.D. diss., University of Pennsylvania.
Jespersen, O. (1922). Language, Its Nature, Development, and Origin. London: Allen and Unwin.
Kemenade, A. van. (1987). Syntactic Case and Morphological Case in the History of English. Dordrecht: Foris.
Kiparsky, P. (1995). Indo-European origins of Germanic syntax. In Battye and Roberts (1995).
Kiparsky, P. (1997). The rise of positional licensing in Germanic. In van Kemenade and Vincent (1997).
Lass, R. (1997). Historical Linguistics and Language Change. Cambridge: Cambridge University Press.
Lightfoot, D. W. (1979). Principles of Diachronic Syntax. Cambridge: Cambridge University Press.
Lightfoot, D. W. (1993). Why UG needs a learning theory: Triggering verb movement. In C. Jones, Ed., Historical Linguistics: Problems and Perspectives. London: Longman, pp. 190–214. Reprinted in Battye and Roberts (1995).
Lightfoot, D. W. (1997). Shifting triggers and diachronic reanalyses. In van Kemenade and Vincent (1997).
Lightfoot, D. W. (1999). Creoles and cues. In DeGraff, Ed. (1999).
Pearce, E. (1990). Parameters in Old French Syntax. Dordrecht: Kluwer.
Roberts, I. G. (1985). Agreement patterns and the development of the English modal auxiliaries. Natural Language and Linguistic Theory 3: 21–58.
Roberts, I. G. (1993a). Verbs and Diachronic Syntax. Dordrecht: Kluwer.
Roberts, I. G. (1993b). A formal account of grammaticalization in the history of Romance futures. Folia Linguistica Historica 13: 219–258.
Roberts, I. G. (1998). Verb movement and markedness. In DeGraff, Ed. (1999).
Sprouse, R., and B. Vance. (1999). An explanation for the loss of null subjects in certain Romance and Germanic languages. In DeGraff, Ed. (1999).
Vance, B. (1995). On the decline of verb movement to Comp in Old and Middle French. In A. Battye and I. Roberts, Eds., Clause Structure and Language Change. Oxford: Oxford University Press.
Warner, A. R. (1983). Review article on Lightfoot 1979. Journal of Linguistics 19: 187–209.
Warner, A. R. (1993). English Auxiliaries: Structure and History. Cambridge: Cambridge University Press.
Warner, A. R. (1997). The structure of parametric change, and V movement in the history of English. In van Kemenade and Vincent (1997).

Lashley, Karl Spencer (1890–1958)

Donald HEBB described Karl Lashley’s career as “perhaps the most brilliant in the psychology of this century” (Hebb 1959: 142). Lashley’s intellectual odyssey about brain and behavior extended from the earliest days of Watsonian BEHAVIORISM to astonishingly modern cognitive views.

Lashley attended the University of West Virginia, where he studied with John Black Johnston, a neurologist. When taking his first class from Johnston in zoology, Lashley “knew that I had found my life’s work.” Lashley plunged into the study of zoology, neuroanatomy, embryology, and animal behavior, graduating in 1910. He received an M.S. in bacteriology at the University of Pittsburgh in 1911, where he studied experimental psychology with K. M. Dallenbach. Following this he enrolled for the Ph.D. in zoology at Johns Hopkins with H. S. Jennings, with a minor in psychology with Adolf Meyer and John B. Watson. Watson’s developing theory of behaviorism, and Watson himself, had a profound influence on Lashley.

In a letter written much later to Ernest Hilgard at Stanford, Lashley described taking a seminar with Watson in 1914. Watson called attention in the seminar to the writings of Bechterev and Pavlov on conditioned reflexes, which they translated.

“In the spring I served as an unpaid assistant and we constructed apparatus and did experiments, repeating a number of their experiments. Our whole program was then disrupted by the move to the lab in Meyer’s Clinic. There were no adequate animal quarters there. Watson started work with infants as the next best material available. I tagged along for awhile but disliked the babies and found me a rat lab in another building.

“We accumulated a considerable amount of experimental material on the conditioned reflex that was never published. Watson sought the basis of a systemic psychology and was not greatly concerned with the reaction itself.”

The conditioned reflex thus came to form the basis of Watson’s behaviorism. Lashley, on the other hand, had become interested in the physiology of the reaction and the attempt to trace conditioned reflex paths through the central nervous system.

During the period at Hopkins, Lashley worked with Shepherd Ivory Franz at St. Elizabeth’s Hospital in Washington. Together, they developed a new approach to the study of brain mechanisms of learning and memory and published landmark papers on the effect of cortical lesions on learning in rats. From this time (1916) until 1929, Lashley systematically used the lesion method in an attempt to localize memory traces in the brain. Following Watson (and Pavlov), Lashley conceived of the brain as a massive reflex switchboard, with sequential chaining of input-output circuitries via the cerebral cortex as the basis of memory.

This work culminated in his classic 1929 monograph Brain Mechanisms and Intelligence. He was also president of the American Psychological Association that year. His presidential address, and his 1929 monograph, destroyed the switchboard reflexology theory of brain function and learning as it existed at that time. In complex mazes, rats were impaired in LEARNING and MEMORY in proportion to the degree of cerebral cortex destroyed, independent of locus. He employed the terms “mass action” and “equipotentiality” more to describe his results than as a major theory. Lashley did not deny localization of function in the neocortex. Rather, he argued that the neural substrate for higher-order memorial functions—he often used the term “intelligence”—as in complex maze learning in rats, was widely distributed in the CEREBRAL CORTEX.

Lashley was perhaps the most formidable critical thinker of his time, successfully demolishing all major theories of brain behavior, from Pavlov to the Gestalt psychologists to his own views. “He remarked that he had destroyed all theories of behavior, including his own” (Hebb 1959: 149). Near the end of his career, looking over his lifetime of research on memory, Lashley (1950) concluded that

“This series of experiments has yielded a good bit of information about what and where the memory trace is not. It has discovered nothing directly of the real nature of the memory trace. I sometimes feel, in reviewing the evidence of the localization of the memory trace, that the necessary conclusion is that learning is just not possible. It is difficult to conceive of a mechanism that can satisfy the conditions set for it. Nevertheless, in spite of such evidence against it, learning sometimes does occur.” (477–478)

Lashley’s positive contributions were extraordinary. Perhaps most striking was his brilliant analysis of the “problem of serial order in behavior” (Lashley 1951). Ranging from the properties of language to the performance of complex motor sequences, he showed that “associative chaining” cannot account for serial behavior. Rather, higher-order representations must exist in the brain in the form of patterns of action “where spatial and temporal order are . . . interchangeable” (Lashley 1951: 128).

Lashley made a number of other major contributions, including an insightful analysis of “instinctive” behaviors, analysis of sexual behaviors, characterization of the patterns of thalamocortical projections, functions of the visual cortex, and a rethinking of cortical cytoarchitectonics (Beach et al. 1960).

In 1936 James B. Conant, then the new president of Harvard, appointed an ad hoc committee to find “the best psychologist in the world” (Beach 1961). Lashley, then at the University of Chicago, was chosen and hired in 1937. He became director of the Yerkes Laboratory of Primate Biology in Florida but maintained his chair at Harvard, traveling to Cambridge once each year to give his two-week graduate seminar. The roster of eminent psychologists and neuroscientists who worked with Lashley is without parallel.

According to his student Roger SPERRY (personal communication), Lashley was interested in the problem of CONSCIOUSNESS but refused to write about it, considering it something of an epiphenomenon. He did, however, speculate about the mind, in itself heretical for a behaviorist:

“Mind is a complex organization, held together by interaction processes and by time scales of memory, centered about the body image. It has no distinguishing features other than its organization . . . there is no logical or empirical reason for denying the possibility that the correlation (between mental and neural activities) may eventually show a complete identity of the two organizations” (Lashley 1958: 542).

In his biographical memoir of Lashley, Frank Beach (1961) offered the following tribute (163):

Eminent psychologist with no earned degree in psychology
Famous theorist who specialized in disproving theories, including his own
Inspiring teacher who described all teaching as useless.

See also CORTICAL LOCALIZATION, HISTORY OF; GESTALT PSYCHOLOGY

—Richard F. Thompson

References

Beach, F. A. (1961). A biographical memoir [of Karl Lashley]. Biographical Memoirs of the National Academy of Sciences 35: 162–204.
Beach, F. A., D. O. Hebb, C. T. Morgan, and H. W. Nissen, Eds. (1960). The Neuropsychology of Lashley. New York: McGraw-Hill.
Hebb, D. O. (1959). Karl Spencer Lashley, 1890–1958. The American Journal of Psychology 72: 142–150.
Lashley, K. S. (1929). Brain Mechanisms and Intelligence. Chicago: University of Chicago Press.
Lashley, K. S. (1950). In search of the engram. Society of Experimental Biology, Symposium 4, pp. 454–482.
Lashley, K. S. (1951). The problem of serial order in behavior. In Cerebral Mechanisms in Behavior. New York: Wiley, pp. 112–136.
Lashley, K. S. (1958). Cerebral organization and behavior. Proceedings of the Association for Research in Nervous and Mental Diseases 36: 1–18.

Laws

See CAUSATION; PSYCHOLOGICAL LAWS

Learning

Although learning can be understood as a change in an organism’s capacities or behavior brought about by experience, this rough definition encompasses many cases usually not considered examples of learning (e.g., an increase in muscular strength brought about through exercise). More important, it fails to reflect the many forms of learning, which may be distinguished, for example, according to what is learned, may be governed by different principles, and may involve different processes.

One form of learning is associative learning, in which the learner is exposed to pairs of events or stimuli and has the opportunity to learn these pairings—which event or stimulus goes with which (see CONDITIONING). The study of associative learning has led to the discovery of numerous learning principles applicable to species as diverse as flatworms and humans, and behaviors as different as salivation and the onset of fear. Investigators have also learned a great deal about the biological bases for this form of learning (see CONDITIONING AND THE BRAIN).

Another crucial form of learning involves the acquisition of knowledge about the spatial layout of the organism’s surroundings—these COGNITIVE MAPS can include the locations of food sources or of dangers, the boundaries of one’s territory, and so on. The acquisition of this spatial knowledge often involves latent learning: The organism derives some knowledge from its experiences, but with no immediately visible change in the organism’s behavior. (The latent learning does become visible later on, however, when the organism finds occasion to use what it has earlier learned.) In many species, this learning about spatial layout can be extraordinarily sophisticated, allowing the organism to navigate across great distances, or to remember the locations of hundreds of food caches, or to navigate by dead reckoning, with little or no reliance on sensory landmarks (see ANIMAL NAVIGATION and ANIMAL NAVIGATION, NEURAL NETWORKS).

Spatial learning may be considered a special case within a broader category of learning, in which an organism gains (“memorizes”) information about its environment (see MEMORY). In humans, this information may be derived directly from firsthand experience, or indirectly, from what one reads or hears from others. This information may then be used later on for some memory-based report (e.g., a response to the question, “What happened yesterday?”), or as a basis for modifying future action. In any of these cases, one encodes the information into memory during the initial exposure, and then retrieves this stored information later on. The initial encoding may be intentional (if one is seeking to memorize the target information) or incidental (if the learning is a by-product of one’s ordinary commerce with the world, with no intention of learning). Similarly, the subsequent use of the information may involve explicit memory (if one wittingly and deliberately seeks to use the stored information later on) or implicit memory (if the stored information has an unwitting and automatic influence on one’s subsequent behavior; see IMPLICIT VS. EXPLICIT MEMORY).

Still another form of learning is skill learning, in which one learns how to perform some action or procedure, often without any ability to describe the acquired skill (also see MOTOR LEARNING). In this case, one is said to have acquired “procedural knowledge” (knowing how to carry out some procedure), as opposed to “declarative knowledge” (knowing that some proposition is correct). It should be emphasized, however, that skill learning is not limited to the acquisition of motor skills (such as learning how to serve a tennis ball or how to ride a bicycle). In addition, much of our mental activity can be understood in terms of skill acquisition—we acquire skills for reading, solving problems within a particular domain, recognizing particular patterns, and so on. Thus, for example, chess masters have acquired the skill of recognizing specific configurations of chess pieces, a skill that helps them both in remembering the arrangement of the game pieces and (probably more important) allows them to think about the game in terms of strategy-defined, goal-oriented patterns of pieces, rather than needing to focus on individual pieces (see EXPERTISE and PROBLEM SOLVING).

Skill learning can also lead to AUTOMATICITY for the particular skill or procedure. Once automatized, a skill can be run off as a single, integrated action, even though the skill was initially composed of numerous constituent actions. The skilled tennis player, for example, need not focus on wrist position, the arch of the back, and the position of the shoulders, but instead launches the single (complex) behavior, “backhand swing.” This automatization promotes fluency among the constituents of a complex behavior, dramatically decreases the extent to which one must attend to the various elements of the behavior, and thus frees ATTENTION for other tasks. On the other hand, automatic behaviors are often inflexible and difficult to control, leading some to speak of them as “mental reflexes.”

A further form of learning is INDUCTION, in which the learner is exposed to a series of stimuli or events and has the opportunity to discover a general rule or pattern that summarizes these experiences. In some cases, induction is produced by the simple forgetting of an episode’s details and the consequent blurring together in memory of that episode with other similar episodes. This blurring together is, for example, the source of our knowledge of, say, what a kitchen is likely to contain. Investigators refer to knowledge acquired in this fashion as “generic” or “schematic knowledge” (see EPISODIC VS. SEMANTIC MEMORY and SCHEMATA).

In other cases, induction results from a more deliberate judgment process in which one actively seeks to generalize from one’s previous experiences, a process that seems to rely on a relatively small number of strategies or JUDGMENT HEURISTICS. For example, subjects in many studies seem to rely on the assumption that the categories they encounter are relatively homogeneous, and this encourages them to extrapolate freely from the sample of observations made so far, even if that sample is relatively small, and even (in some cases) if warnings are in place that the sample is not representative of the larger category (see CATEGORIZATION).

Some aspects of induction seem to be governed by highly specialized domain-specific skills. One clear example is provided by LANGUAGE ACQUISITION in the small child. The human infant appears to be well prepared to induce the regularities of language, so that language acquisition is relatively swift and successfully achieved by virtually all children, independent (within certain boundary conditions) of the child’s individual abilities or circumstances. The same learning skills, however, seem irrelevant to the acquisition of information in other domains (also see COGNITIVE DEVELOPMENT and DOMAIN SPECIFICITY).

Finally, let us note still other forms of learning: Many species are capable of learning through IMITATION, in which an action is first observed and then copied. A number of species display imprinting, in which a young organism learns to recognize its parents or its conspecifics. Human learning often also involves DEDUCTIVE REASONING, in which one is able to discover (or generate) new knowledge, based on beliefs one already holds. In some cases of deduction, one’s reasoning is guided by relatively abstract rules or principles. In others, one’s reasoning is guided by a specific remembered experience; one then draws an analogy, based on that experience, and the analogy indicates how one should act, or what one should conclude, for the current problem (see CASE-BASED REASONING AND ANALOGY).

Thus the term learning plainly covers a diversity of phenomena. But having now emphasized this diversity, we should ask what these many forms of learning have in common. At a general level, some principles may apply across domains—for example, the importance of acknowledging task-specific learning skills, or the possibility of latent learning, not immediately manifest in behavioral change. At a much finer-grained level, it is likely that similar processes in the nervous system provide the substrate for diverse forms of learning, including, for example, the process of LONG-TERM POTENTIATION, in which the pattern of interaction among neurons is modified through experience. Similarly, it is plausible that connectionist models may provide powerful accounts of many of these forms of learning (see COGNITIVE MODELING, CONNECTIONIST). In between these extremes, however, we may be unable to formulate general “laws of learning,” applicable to all learning types.

See also BAYESIAN LEARNING; BEHAVIORISM; COMPUTATIONAL LEARNING THEORY; EXPLANATION-BASED LEARNING; STATISTICAL LEARNING THEORY; VISION AND LEARNING

—Daniel Reisberg

Further Readings

Reisberg, D. (1997). Cognition: Exploring the Science of the Mind. New York: Norton.
Schwartz, B., and S. Robbins. (1995). Psychology of Learning and Behavior. New York: Norton.
Tarpy, R. M. (1997). Contemporary Learning Theory and Research. New York: McGraw-Hill.

Learning and Vision

See VISION AND LEARNING

Learning Systems

The ability to formulate coherent and predictive theories of the environment is a salient characteristic of our species. We perform this feat at diverse stages of development and with respect to sundry features of our experience. For example, almost all infants construct grammatical theories of their caretakers’ language; most children master the moral and aesthetic codes of their household and community; and selected adults discover scientific principles that govern fundamental aspects of the physical world. In each case, our theories are underdetermined by the data that trigger them in the sense that there exist alternative hypotheses (with different predictive consequences) that are equally compatible with the evidence in hand. In some cases, the underdetermination reaches dramatic proportions, revealed by comparing the fragmentary nature of available data to the scope and apparent accuracy of the theories they engender. Such appears to be the case in the physical sciences. For example, Dodelson, Gates, and Turner (1996) describe a theory of the origin of astrophysical structure, from stars to great walls of galaxies, presenting evidence that such structure arose from quantum mechanical fluctuations during the first 10⁻³⁴ seconds in the life of the universe. If the theory is true, surely one of its most curious features is that it could be known by a human being. Similarly, radical underdetermination has also been suggested for the grammatical theories constructed by infants learning their first language (for elaboration of this view, see Chomsky 1988 and POVERTY OF THE STIMULUS ARGUMENTS; a critical rejoinder is provided by Pullum 1996; see also INDUCTION and LANGUAGE ACQUISITION).

The psychological processes mediating discovery no doubt vary with the specific problem to which they must apply. There may be little in common, for example, between the neural substrate of grammatical hypotheses and that underlying the conjectures of professional geologists. Such matters are controversial, so it will be prudent to limit the remainder of our discussion to discovery of a patently scientific kind.

The psychological study of discovery has focused on how people choose tests of specific hypotheses, and how they modify hypotheses in the face of confirming or disconfirming data. Many of the experiments are inspired by “Mill’s methods” of causal inquiry, referring to the nineteenth-century logician John Stuart Mill. The results suggest that both children and adults are apt to test hypotheses by seeking data that cannot be disconfirmatory, and to retain hypotheses whose predictions are observed to be falsified (see, for example, Kuhn 1996). In contrast to this bleak picture of intuitive science, other researchers believe that Mill’s methods are too crude for the framing of pertinent questions about the psychology of empirical inquiry. A different assessment of lay intuition is thought to arise from a subtler account of normative science (see, for example, Koslowski 1996 and SCIENTIFIC THINKING AND ITS DEVELOPMENT).

More generally, investigation of the psychology of theory discovery can benefit from a convincing model of rational inquiry, if only to help define the task facing the reasoner. Two formal perspectives on discovery have been developed in recent years, both quite primitive but in different ways. One view focuses on the credibility that scientists

nam (1975), Solomonoff (1964), and Gold (1967). Analysis proceeds by distinguishing five components of empirical inquiry, namely: (1) potential realities or “worlds”; (2) a scientific problem; (3) a set of potential data streams or “environments” for each world, which provide information about the world; (4) scientists; and (5) a criterion of success that stipulates the conditions under which a scientist is credited with solving a given problem. Any precise formalization of the preceding items is called a “model” or “paradigm” of inquiry, and may be analyzed mathematically using the techniques developed within the general theory (the five components are adapted from Wexler and Culicover 1980). Particular attention is devoted to characterizing the kinds of problems that can be solved, distinguishing them from problems that resist solution by any scientist.

One of the simplest paradigms to be studied in depth has a numerical character, and may be described as follows (for fuller treatment see Jain et al. forthcoming). Let N be the set {0, 1, 2, . . .} of natural numbers.

1. A world is any infinite subset of N, for example: N – {0} or N – {1}. The numbers making up a world are conceived as codes for individual facts that call for prediction and explanation.
2. A scientific problem is any collection of worlds, for example, the collection P = {N – {x} | x ∈ N} of all subsets of N with just one number missing.
A problem thus attach to alternative theories, and on the evolution of these specifies a range of theoretically possible worlds, the credibilities under the impact of data. Interpreting credibil- “real” member of which must be recognized by the sci- ity as probability leads to the Bayesian analysis of inquiry, entist. which has greatly illuminated diverse aspects of scientific 3. An environment for a world is any listing of all of its practice (see BAYESIAN LEARNING). For example, it is members. For example, one environment for N – {3} widely acknowledged that a theory T is better confirmed by starts off: 0,1,2,4,5,6 . . . . We emphasize that an environ- data Ds, which verify a surprising prediction, than by data ment for a world S may list S in any order. Do, which verify an obvious one. The Bayesian analysis of 4. A scientist is any mapping from initial segments of envi- this fact starts by interpreting surprise probabilistically: 0 < ronments into worlds. To illustrate, consider the scientist P (Ds) < P (Do) ≤ 1. Because both predictions are assumed S that responds to each initial segment of its environment with the set N – {x}, where x is the least number not yet to follow deductively from T, the probability calculus encountered. Then faced with the environment implies P (Ds | T) = P (Do | T) = 1, and Bayes’s theorem 0,1,2,4,5,6 . . . shown above, S would first conjecture N – yields {1}, then N – {2}, then N – {3}, then again N – {3}, and P( Ds T ) × P( T ) so on. P(T) P( T ) P ( T D s ) = -------------------------------------- = -------------- > --------------- 5. A scientist is said to “solve” a given problem just in - - - P ( D s ) P ( Do ) P ( Ds ) case the following is true. No matter what world W is P(Do T ) × P( T ) drawn from the problem, and no matter how the mem- = --------------------------------------- = P ( T Do ). bers of W are listed to form an environment e, the sci- - P ( Do ) entist’s conjectures on e are wrong only finitely often. 
That is, starting at some point in e, the scientist begins to (correctly) hypothesize W, and never deviates there- The greater support of T offered by Ds compared to Do is after. thus explained in terms of the posterior probabilities P(T | It is not difficult to see that the scientist S solves the Ds) and P (T | Do). This example and many others are dis- cussed in Earman (1992), Horwich (1982), Howson and problem P described above. In contrast, it can be demon- strated that no scientist whatsoever solves the problem that Urbach (1993), and Rosenkrantz (1977). results from adding the additional world N to P. This new A second perspective on inquiry is embodied in the “the- ory of scientific discovery” (see, for example, Kelly 1996; problem is unsolvable. The foregoing model of inquiry can be progressively Martin and Osherson 1998; for a computational perspective, enriched to provide more faithful portraits of science. In one see Langley et al. 1987; COMPUTATIONAL LEARNING THE- version, worlds are relational structures for a first-order lan- ORY; and MACHINE LEARNING). Scientific success here con- guage, scientists implement belief revision operators in the sists not in gradually increasing one’s confidence in the true sense of Gärdenfors (1988), and success consists in fixing theory, but rather in ultimately accepting it and holding on upon an adequate theory of the structure giving rise to the to it in the face of new data. This way of viewing inquiry is atomic facts of the environment (see Martin and Osherson consonant with the philosophy of Karl Popper (1959), and 1998). was first studied from an algorithmic point of view in Put- Lévi-Strauss, Claude 463 Lévi-Strauss develops this approach by taking up the See also INNATENESS OF LANGUAGE; LEARNING; PROB- concept of structure and by proposing a new use of this con- LEM SOLVING cept in anthropology. He abandons the notion of social —Daniel Osherson structure (promoted by A. R. Radcliffe-Brown and G. P. 
Murdock) as the totality of directly observable relations in a References society (a notion that still refers to the traditional under- standing of “structure” as architectural frame or organic sys- Chomsky, N. (1988). Language and Problems of Knowledge: The tem). Lévi-Strauss’s conception of structure as a model Managua Lectures. Cambridge, MA: MIT Press. stems directly from linguistics (particularly from Trou- Dodelson, S., E. Gates, and M. Turner. (1996). Cold dark matter. betskoi and JAKOBSON) where structure refers to a recurring Science 274: 69–75. relation between terms (such as phonemes) considered as Earman, J. (1992). Bayes or Bust? Cambridge, MA: MIT Press. Gärdenfors, P. (1988). Knowledge in Flux: Modeling the Dynamics minimal units. The second source is mathematics where it of Epistemic States. Cambridge, MA: MIT Press. refers to constant relations between elements regardless of Gold, E. M. (1967). Language identification in the limit. Informa- what the set in question is. Lévi-Strauss admits that this tion and Control 10: 447–474. kind of stable relations only appears in certain objects and Horwich, P. (1982). Probability and Evidence. Cambridge: Cam- under certain conditions. In other words, in the field of bridge University Press. social sciences, a structural analysis is productive and legiti- Howson, C., and P. Urbach. (1993). Scientific Reasoning: The mate only in such cases as phonology, kinship, taxonomies, Bayesian Approach. 2nd ed. La Salle, IL: Open Court. “totemic” phenomena, rituals, mythical narratives, and cer- Jain, S., E. Martin, D. Osherson, J. Royer, and A. Sharma. (Forth- tain artifacts. It is better avoided in domains where probabi- coming). Systems That Learn. 2nd ed. Cambridge, MA: MIT listic factors prevail over the mechanical order. Press. Kelly, K. T. (1996). The Logic of Reliable Inquiry. New York: From a cognitivist point of view, Lévi-Strauss’s most Oxford University Press. 
interesting contribution is linked to his conviction that there Koslowski, B. (1996). Theory and Evidence: The Development of is a continuity between forms of organization of external Scientific Reasoning. Cambridge, MA: MIT Press. reality (matter, living organisms, social groups, artifacts) Kuhn, D. (1996). Children and adults as intuitive scientists. Psy- and the human mind. To understand a specific field of chological Review 96(4). objects is to show how that field produces its own rational- Langley, P., H. A. Simon, G. L. Bradshaw, and Z. M. Zytkow. ity, that is, how it is regulated by a spontaneous intelligible (1987). Scientific Discovery. Cambridge, MA: MIT Press. order. This is the epistemological presupposition behind Martin, E., and D. Osherson. (1998). Elements of Scientific Lévi-Strauss’s analyses of kinship systems. This aim domi- Inquiry. Cambridge, MA: MIT Press. nates the following inquiries he has conducted on the tradi- Popper, K. (1959). The Logic of Scientific Discovery. London: Hutchinson. tional forms of classification of objects in the natural world. Pullum, G. (1996). Learnability, hyperlearning, and the poverty of He began his research by going back to an old controversial the stimulus. In J. Johnson, M. L. Juge, and J. L. Moxley, Eds., and seemingly insoluble problem, that of so-called Proceedings of the Twenty-second Annual Meeting: General totemism, and by showing that it was a nonissue, first Session and Parasession on the Role of Learnability in Gram- because it was badly stated. In fact, totemism is not a one- matical Theory. Berkeley, CA: Berkeley Linguistics Society, to-one correspondence between the human and the natural pp. 498–513. world but a way of establishing and expressing a system of Putnam, H. (1975). Probability and confirmation. In Mathemat- differences between humans (individuals and groups), with ics, Matter and Method. Cambridge: Cambridge University the help of a system of differences between things (animals, Press. 
plants, or artifacts). What resembles each other are not Rosenkrantz, R. (1977). Inference, Method and Decision. Dor- drect: Reidel. humans and things but their differential relations. Solomonoff, R. J. (1964). A formal theory of inductive inference. Of course, this presupposes that the human mind has the Information and Control 7: 1–22, 224–254. capacity and a disposition to recognize differences and to Wexler, K., and P. Culicover. (1980). Formal Principles of Lan- classify things themselves. In fact, traditional forms of guage Acquisition. Cambridge, MA: MIT Press. knowledge show that this spontaneous work of classifica- tion is very sophisticated. This means that it is not primarily guided by vital need (such as food or survival) but indeed by Lévi-Strauss, Claude the desire to understand and interpret the world; in short, as Lévi-Strauss reminds us, things are not just “good for eat- The most remarkable aspect of the French anthropologist ing,” but also “good for thinking.” From this perspective it Lévi-Strauss’s (1908–) undertaking is his ambition to take is therefore important not to underestimate the power of seriously the very idea of anthropology. His aim has been to “untamed thinking” (literally: la pensée sauvage). There, develop anthropology not just as an inventory of human cul- the flourishing symbolic systems are based on the differen- tures or of types of institutions (kinship, myths, rituals, arts, tial values themselves that have come out of the operations technologies, knowledge systems), but also as an investiga- of classification of the observed world. tion of the mental equipment common to all humans. This What is finally the difference between “untamed think- has not always been understood. It has been seen as an over- ing” and “domesticated thinking,” between traditional ambitious philosophical project when in fact it is better forms of knowledge and modern reason? According to understood as a cognitivist project. 
Lévi-Strauss, it stems from the progressive branching out of 464 Lexical Access two types of society at the end of the neolithic period: some Lévi-Strauss, C. (1974). Tristes Tropiques. Translated by John and Doreen Weightman. New York: Athenaeum. evolved toward the pursuit and preservation of a stable equi- Lévi-Strauss, C. (1963). Structural Anthropology, vol. 1. Trans- librium between the human and the natural world, while lated by C. Jacobson and B. Graundfest Schoepf. New York: others turned toward change by developing technologies for Basic Books. mastery over nature, which involved the explicit recognition Lévi-Strauss, C. (1963). Totemism. Translated by Rodney and formalization of abstract representations, as was the Needham. Boston: Beacon. case of the civilizations of writing, particularly those where Lévi-Strauss, C. (1966). The Savage Mind. Chicago: University of alphabetical writing developed. Chicago Press. This exercise of traditional knowledge appears particu- Lévi-Strauss, C. (1969). Mythologiques I: The Raw and the larly in the production of mythical narratives. According to Cooked. Translated by John and Doreen Weightman. New Lévi-Strauss, myths are a complex expression of forms of York: Harper and Row. Lévi-Strauss, C. (1973). Mythologiques II: From Honey to Ashes. thought inherent in a culture or a group of cultures (which Translated by John and Doreen Weightman. London: Cape. refers back to an empirical corpus); at the same time they Lévi-Strauss, C. (1978). Mythologiques III: The Origin of Table reveal mental processes that are verifiable everywhere (this Manners. Translated by John and Doreen Weightman. London: concerns operations that are part of the basic equipment of Cape. every human mind). In the interpretation of myths it is Lévi-Strauss, C. (1981). Mythologiques IV: The Naked Man. therefore impossible to maintain a purely functionalist Translated by John and Doreen Weightman. 
New York: Harper approach (which seeks an explanation of narrative based on and Row. need alone), or a symbolist approach (which seeks for keys Lévi-Strauss, C. (1976). Structural Anthropology, vol. 2. Trans- of universal interpretation), or a psychological approach lated by M. Layton. New York: Basic Books. (which seeks archetypes). To be sure, myths do refer back to Lévi-Strauss, C. (1985). The View from Afar. Translated by J. Neu- groschel and P. Hoss. New York: Basic Books. an empirical environment (geographical, technical, social) Lévi-Strauss, C. (1988). The Jealous Potter. Translated by Bene- and express it directly or not, but above all they construct dicte Chorier. Chicago: University of Chicago Press. representations where through the categorial use of sensory Lévi-Strauss, C. (1995). The Story of Lynx. Chicago: University of elements (such as diversity of species, places, forms, colors, Chicago Press. materials, directions, sounds, temperatures) emerges a sym- bolic order of things and humans (cosmogony, sociogony) Further Readings and where, above all, the logical faculties of the mind are at work—for example, opposition, symmetry, contradiction, Badcock, C. R. (1975). Lévi-Stauss, Structuralism and Sociologi- disjunction, negation, inclusion, exclusion, complementar- cal Theory. London: Hutchinson. Hénaff, M. (1998). Lévi-Strauss and the Making of Structural ity. Hence the surprising character of certain myths that do Anthropology. University of Minnesota Press. not correspond to any etiology, that is, to any specific refer- Leach, E. (1970). Claude Lévi-Strauss. New York: Viking Press. ential situation, but seem to arise and develop just for the Sperber, D. (1973). Le Structuralisme en Anthropologie. Paris: sake of pure speculative play. Therefore a narrative cannot Seuil. be interesting in itself. Some elements reappear from one Sperber, D. (1985). On Anthropological Knowledge. 
Cambridge: myth to another (mythemes or segments); there are clusters Cambridge University Press. of narratives related in various ways (symmetrical, opposi- tional, etc.); and finally there are whole cycles with groups Lexical Access of myths that are linked in networks and constitute complete systems of representation. Lévi-Strauss’s most original and ambitious theoretical contribution has been to demonstrate See COMPUTATIONAL LEXICONS; LEXICON; SPOKEN WORD that those networks consist of transformation groups (in the RECOGNITION; VISUAL WORD RECOGNITION mathematical sense of the term). The recourse to the structural model stemming from lin- Lexical Functional Grammar guistics and mathematics allowed Lévi-Strauss to bring to the fore invariants hitherto only seen as simple empirical recurrences, if not residues of lost history—invariants that Lexical Functional Grammar (LFG) is a theory of the he attributes to the propensitites of the human mind. This is structure of natural language and how different aspects of unquestionably pioneering work for a cognitivist approach linguistic structure are related. The name of the theory to traditional societies and for the elaboration of a global expresses two ways in which it differs from other theories theory of the human mind. of linguistic structure and organization. LFG is a lexical theory: relations between linguistic forms, such as the See also CATEGORIZATION; CULTURAL SYMBOLISM; relation between an active and passive form of a verb, are HUMAN UNIVERSALS; MAGIC AND SUPERSTITION generalizations about the structure of the lexicon, not —Marcel Hénaff transformational operations that derive one form on the basis of another one. And LFG is a functional theory: Works by Lévi-Strauss GRAMMATICAL RELATIONS such as subject and object are basic, primitive constructs, not defined in terms of phrase- Lévi-Strauss, C. (1969). The Elementary Structures of Kinship. 
structure configurations or of semantic notions such as Translated by J. Bell, J. von Sturmer, and R. Needham. Boston: agent or patient. Beacon Press. Lexical Functional Grammar 465 Distinguishing the outer, crosslinguistically variable con- stituent structure from the functional structure, which encodes relations that hold at a more abstract level in every language, allows for the incorporation of a comparative assessment of grammatical structures as proposed in OPTI- MALITY THEORY (Bresnan 1997; Choi 1996). The formal generative properties of LFG grammars are fairly well described. Like context-free languages, LFG lan- Figure 1. Simplified constituent structure and functional structure guages are closed under union, concatenation, and Kleene for the prepositional phrase with a book. closure (Kelly Roach, unpublished work). The recognition problem (whether a given string is in the language generated Two aspects of syntactic structure are copresent in the by a given LFG grammar) was shown to be decidable by LFG analysis of a sentence or phrase. The concrete, percep- Kaplan and Bresnan (1982). The emptiness problem tible relations of dominance, precedence, and phrasal group- (whether a given LFG grammar generates a nonempty set of ing are represented by a phrase structure tree, the strings) was shown to be undecidable in further unpublished constituent structure or c-structure. More abstract functional work by Roach. A synopsis of Roach’s results is given by syntactic information such as the relation of subjecthood or Dalrymple et al. (1995). objecthood is represented by an attribute-value matrix, the The efficient processing of LFG grammars, both from a functional structure or f-structure. Each node of the constit- psycholinguistic and a computational standpoint, is a central uent structure is related to its corresponding functional concern of the theory. 
It has been shown that LFG recogni- structure by a functional correspondence Φ, illustrated in tion is NP-complete (Berwick 1982): for some string (not Figure 1 by dotted lines. This functional correspondence necessarily a string in any natural language) and some LFG induces equivalence classes of structural positions that can grammar, no known algorithm exists to determine in polyno- be related to a particular grammatical function. There may mial time whether that string is in the language generated by be functional structures that are not related to any constitu- that grammar. Kaplan (1982) proposes that the NP-complete ent structure node (that is, the Φ correspondence function is class is actually a psycholinguistically plausible one for a not onto); for example, in so-called pro-drop languages, a linguistic model: the exponentially many candidate analyses verb may appear with no overtly expressed arguments. In of a string can be heuristially winnowed, and subsequent ver- such a case, the verb’s arguments are represented at func- ification of the correct analysis can be accomplished very tional structure but not at constituent structure. quickly. Strategies can also be devised to optimize the distri- Information about the constituent structure category of bution of processing work between the two syntactic struc- each word as well as its functional structure is contained in the tures (Maxwell and Kaplan 1993). In recent unpublished LEXICON, and the constituent structure is annotated with infor- work, Ronald M. Kaplan and Rens Bod explore the Data- mation specifying how the functional structures of the daugh- Oriented Parsing approach within the LFG framework; this ter nodes are related to the functional structure of the mother approach assumes that LANGUAGE ACQUISITION proceeds by node. 
The functional structure must also obey the well- forming generalizations over fragments of constituent struc- formedness conditions of completeness and coherence: all tures and functional structures of previously encountered grammatical functions required by a predicate must be utterances and by inducing the most likely structure of newly present, and no other grammatical functions may be present. encountered utterances based on these generalizations. For example, a transitive verb requires the presence of a sub- Besides constituent structure and functional structure, ject and an object (completeness) and no other grammatical LFG assumes other structures for other aspects of linguistic functions (coherence). Thus, the universally applicable princi- form. These structures are generally assumed to be related ples of completeness and coherence together with a language- by functional correspondence to the constituent structure, specific lexicon and principles for phrase-structure annotation the functional structure, and/or one another (Kaplan 1987). provide the criteria for determining the constituent-structure Argument structure encodes the THEMATIC ROLES of the tree and the functional structure that corresponds to it for a arguments of predicates and plays an important role in sentence or phrase of a language. LFG’s linking theory, principles for how the thematic role LFG adheres to the lexical integrity principle, which of an argument affects its grammatical function and realiza- states that morphological composition does not interact with tion in the functional structure (Bresnan and Kanerva 1989; syntactic composition: the minimal unit analyzed by the Alsina 1994). 
The meaning of a phrase or sentence is repre- constituent structure is the word, and the internal morpho- sented at semantic structure (Halvorsen 1983), related logical structure of the word is not accessible to syntactic directly to functional structure and indirectly to other lin- processes (Bresnan and Mchombo 1995). However, it is guistic structures. The relation between semantic structures possible for words and phrases to have similarly articulated and their corresponding functional structures is exploited in functional structure: a verb with a morphologically incorpo- recent deductive accounts of the SYNTAX-SEMANTICS INTER- rated pronominal object may have the same functional FACE; syntactic relations established at functional structure structure as a verb phrase with a verb and an independent interact with lexically specified information about the pronoun. A number of apparent paradoxes are explained by meaning of individual words in a logical deduction of the distinguishing between syntactic phenomena at the two dif- meaning of larger phrases and sentences (Dalrymple, Lamp- ferent syntactic levels. ing, and Saraswat 1993). 466 Lexical Functional Grammar Further information about LFG, including a continually Further Readings updated bibliography, is available at http://clwww.essex.ac. Alsina, A. (1992). On the argument structure of causatives. Lin- uk/LFG/. guistic Inquiry 23(4): 517–555. See also GENERATIVE GRAMMAR; HEAD-DRIVEN PHRASE Alsina, A. (1996). The Role of Argument Structure in Grammar: STRUCTURE GRAMMAR; MINIMALISM; RELATIONAL GRAMMAR Evidence from Romance. Stanford, CA: CSLI Publications. Andrews, A. D. (1990). Unification and morphological blocking. —Mary Dalrymple Natural Language and Linguistic Theory 8(4): 507–557. Andrews, A., and C. Manning. (1993). Information spreading and References levels of representation in LFG. Technical Report CSLI–93– 176 Stanford, CA: CSLI Publications. Alsina, A. (1994). 
Predicate Composition: A Theory of Syntactic Bresnan, J., Ed. (1982). The Mental Representation of Grammati- Function Alternations. Ph.D. diss., Stanford University. cal Relations. Cambridge, MA: MIT Press. Berwick, R. (1982). Computational complexity and Lexical Func- Bresnan, J. (1994). Locative inversion and the architecture of uni- tional Grammar. American Journal of Computational Linguis- versal grammar. Language 70(1): 2–31. tics 8: 97–109. Bresnan, J., and L. Moshi. (1990). Object asymmetries in compar- Bresnan, J. (1997). The emergence of the unmarked pronoun: ative Bantu syntax. Linguistic Inquiry 21(2): 147–185. Chichewa pronominals in Optimality Theory. Paper presented Reprinted in S. A. Mchombo, Ed., Theoretical Aspects of Bantu at the BLS 23 Special Session on Syntax and Semantics in Grammar 1. Stanford: CSLI Publications, pp. 47–91. Africa. http://www-csli.stanford.edu/bresnan/jb-bls-roa.ps. Bresnan, J., and S. A. Mchombo. (1987). Topic, pronoun, and Bresnan, J., and J. Kanerva. (1989). Locative inversion in Chich- agreement in Chichewa. Language 63(4): 741–782. Reprinted ewa: A case study of factorization in grammar. Linguistic in M. Iida, S. Wechsler, and D. Zec, Eds., Working Papers in Inquiry 20(1): 1–50. Reprinted in T. Stowell and E. Wehrli, Grammatical Theory and Discourse Structure: Interactions of Eds., Syntax and Semantics No. 26: Syntax and the Lexicon. Morphology, Syntax, and Discourse. Stanford, CA: CSLI Pub- New York: Academic Press, pp. 53–101. lications, pp. 1–59. Bresnan, J., and S. A. Mchombo. (1995). The lexical integrity prin- Bresnan, J., and A. Zaenen. (1990). Deep unaccusativity in LFG. In ciple: Evidence from Bantu. Natural Language and Linguistic K. Dziwirek, P. Farrell, and E. M. Bikandi, Eds., Grammatical Theory 13(2): 181–254. Relations: A Cross-Theoretical Perspective. Stanford CA: CSLI Choi, H.-W. (1996). Optimizing Structure in Context: Scrambling Publications/Stanford Linguistics Association, pp. 45–57. 
and Information Structure. Ph.D. diss., Stanford University. Butt, M. (1995). The Structure of Complex Predicates in Urdu. Dalrymple, M., R. M. Kaplan, J. T. Maxwell, and A. Zaenen. Stanford, CA: CSLI Publications. (1995). Mathematical and computational issues. In M. Dalrym- Dalrymple, M. (1993). The Syntax of Anaphoric Binding. Stanford, ple, R. M. Kaplan, J. T. Maxwell, and A. Zaenen, Eds., Formal CA: CSLI Publications. CSLI Lecture Notes, no. 36. Issues in Lexical Functional Grammar. Stanford, CA: CSLI Dalrymple, M., R. M. Kaplan, J. T. Maxwell, and A. Zaenen, Eds. Publications, pp. 331–338. (1995). Formal Issues in Lexical Functional Grammar. Stan- Dalrymple, M., J. Lamping, and V. Saraswat. (1993). LFG seman- ford, CA: CSLI Publications. tics via constraints. Proceedings of the 6th Meeting of the Fenstad, J.-E., P.-K. Halvorsen, T. Langholm, and J. van Ben- European Association for Computational Linguistics, Univer- them. (1987). Situations, Language, and Logic. Dordrecht: sity of Utrecht, April. ftp://ftp.parc.xerox.com/pub/nl/eacl93- Reidel. lfg-sem.ps. Johnson, M. (1988). Attribute-Value Logic and the Theory of Halvorsen, P.-K. (1983). Semantics for Lexical Functional Gram- Grammar. Stanford, CA: CSLI Publications. CSLI Lecture mar. Linguistic Inquiry 14(4): 567–615. Notes, no. 16. Kaplan, R. M. (1982). Determinism and nondeterminism in model- King, T. H. (1995). Configuring Topic and Focus in Russian. Stan- ling psycholinguistic processes. Paper presented to the Confer- ford, CA: CSLI Publications. ence on Linguistic Theory and Psychological Reality Revisited, Kroeger, P. (1993). Phrase Structure and Grammatical Relations Princeton University. in Tagalog. Stanford, CA: CSLI Publications. Kaplan, R. M. (1987). Three seductions of computational psycho- Laczko, T. (1995). The Syntax of Hungarian Noun Phrases: A Lex- linguistics. In P. Whitelock, M. McGee Wood, H. L. Somers, ical Functional Approach. Frankfurt am Main: Peter Lang R. Johnson, and P. 
Bennett, Eds., Linguistic Theory and Com- GmbH. puter Applications. London: Academic Press, pp. 149–181. Levin, L. (1986). Operations on Lexical Forms: Unaccusative Reprinted in M. Dalrymple, R. M. Kaplan, J. Maxwell, and A. Rules in Germanic Languages. Ph.D. diss., Massachusetts Zaenen, Eds., Formal Issues in Lexical Functional Grammar. Institute of Technology. Stanford, CA: CSLI Publications, pp. 337–367. Levin, L. S., M. Rappaport, and A. Zaenen, Eds. (1983). Papers in Kaplan, R. M., and J. Bresnan. (1982). Lexical Functional Gram- Lexical Functional Grammar. Bloomington, IN: Indiana Uni- mar: A formal system for grammatical representation. In J. versity Linguistics Club. Bresnan, Ed., The Mental Representation of Grammatical Rela- Matsumoto, Y. (1996). Complex Predicates in Japanese: A Syntac- tions. Cambridge, MA: MIT Press, pp. 173–281. Reprinted in tic and Semantic Study of the Notion “Word”. Stanford and M. Dalrymple, R. M. Kaplan, J. Maxwell, and A. Zaenen, Eds., Tokyo: CSLI Publications and Kuroiso Publishers. Formal Issues in Lexical Functional Grammar. Stanford, CA: Mohanan, K. P. (1983). Functional and anaphoric control. Linguis- CSLI Publications, pp. 29–130. tic Inquiry 14(4): 641–674. Maxwell, J. T., III, and R. M. Kaplan. (1993). The interface Neidle, C. (1988). The Role of Case in Russian Syntax. Dordrecht: between phrasal and functional constraints. Computational Lin- Kluwer. guistics 19(4): 571–590. Reprinted in M. Dalrymple, R. M. Pinker, S. (1984). Language Learnability and Language Develop- Kaplan, J. Maxwell, and A. Zaenen, Eds., Formal Issues in Lex- ment. Cambridge, MA: Harvard University Press. ical Functional Grammar. Stanford, CA: CSLI Publications, Simpson, J. (1991). Warlpiri Morphology and Syntax: A Lexicalist pp. 403–429. Approach. Dordrecht: Reidel. Lexicon 467 ment organization, such as active/passive and causative/ Zaenen, A. (1994). Unaccusativity in Dutch: Integrating syntax and lexical semantics. In J. 
Pustejovsky, Ed., Semantics and the Lexicon. Dordrecht: Kluwer, pp. 129–161.

Lexicon

The lexicon is a list of the morphemes in a language, containing information that characterizes the SYNTAX (hence distribution), MORPHOLOGY (hence PHONOLOGY), and meaning for every morpheme. Thus, for a word such as forget, the lexical entry states its meaning, that it is a verb and takes a nominal or clause complement (I forgot the picnic, I forgot that the picnic was at 3:00), and that its phonological representation is /forgεt/.

However, the study of the lexicon in linguistics and in cognitive science saw a major shift in perspective in the 1980s and early 1990s stemming from the realization that lexicons are much more than this: they are studded with principled phenomena, which any lexical theory must explicate (Wasow 1977 is a classic study). In particular, the lexicon is the domain of a set of linguistic operations or regularities that govern the formation of complex words from lexical components. A much-studied example is passivization (e.g., arrest > arrested as in The thief was arrested by the police). Why does the passive form of arrest, unlike the active, require that its “logical subject” (see GRAMMATICAL RELATIONS) the police appear in a prepositional phrase beginning with by, instead of in object position, or indeed subject position? Why does a verb like suffocate have both a causative and a change-of-state meaning (The pillow suffocated Mary, Mary suffocated), and why is the object of one verb the subject of the other? Work on the lexicon has aimed to uncover the principles underlying regular lexical phenomena and to examine their implications for learning and processing.

Most lexical research has centered on the words of obvious semantic importance in a sentence, primarily substantives such as nouns and verbs, treating words like determiners and complementizers (e.g., that in I think that it is raining) as peripheral to linguistic structure. Recent work argues, however, that these words determine properties of entire phrases, hence their lexical properties are now of considerable interest.

In sum, the theory of the lexicon must encompass two subtheories: one (the primary focus of this article) governing words that are “meaningful” in the obvious if pretheoretic sense, and the other governing words that are functional, or grammatical in character.

The Lexicon for Substantives

The lexicon determines the ability of nouns, verbs and adjectives to combine with arguments. For example give takes three arguments (She gave the box to Peter), eat takes two (She ate the sandwich) and rise takes one (The balloon rose rapidly). The representation that includes this information is often called an “argument structure” (Williams 1981; Grimshaw 1990; Levin and Rappaport Hovav 1995).

The most important source of evidence concerning the representation of argument structure is alternations in argument organization, such as active/passive and causative/change-of-state mentioned above. Others include the “dative alternation”: They gave a present to the teachers/They gave the teachers a present; morphological causativization: large > enlarge; and nominalization: legislate > legislator, legislation.

Certain properties of arguments are important in explaining their behavior. Arguments can be usefully classified according to their semantic role as agent, theme, or goal (Jackendoff 1972; see THEMATIC ROLES). Williams (1981) argued that the “external argument” is singled out for special grammatical and lexical treatment. Thus the argument structure of give might be (1), where the first argument is marked as external, the others as internal.

(1) give (ext, int, int)

Further work on morphology and syntax suggests that some verbs have no external argument, including unaccusatives and perhaps some psychological predicates, and that this has important morphological and syntactic consequences (Burzio 1986; Belletti and Rizzi 1988; see also RELATIONAL GRAMMAR).

The existence of strong semantic regularities within the alternation system sheds light on the semantic representation of substantives, most prominently verbs. For example, the alternation of argument realization seen in the suffocate examples above, where the object of the transitive verb corresponds to the subject of the intransitive, is a completely regular phenomenon but is found only with causative verbs. The pillow suffocated Mary can be paraphrased roughly as “The pillow caused Mary to suffocate”. Verbs that lack this interpretation do not show the alternation, thus The thief robbed the bank does not have a counterpart *The bank robbed.

This constellation of facts, which is crosslinguistically remarkably stable, can be explained by lexical theory, if the verb meanings can be broken down or decomposed into smaller meaning components, one of which (represented here by CAUSE) is part of the meaning of causative verbs. A verb like suffocate has two semantic representations, often called “lexical conceptual structures,” one with the CAUSE component and one without.

(2) suffocate
a. (x CAUSE (y suffocate));
b. (y suffocate)

Because the CAUSE component has its own argument, the causative version has one more argument than the noncausative: this holds mutatis mutandis for all causative verbs in all languages. Because the argument of CAUSE is higher in the argument structure than the other, it occupies the syntactic subject position when it is present; when it is absent, the other argument occupies this position in English, which requires that the subject position be filled in all sentences. Hence we find the alternation in argument realization with these verbs. Verbs like rob do not have an intransitive counterpart because their semantic analysis does not have the required properties.

A related line of reasoning allows us to explain why verbs like apologize occur only intransitively (*He apologized the impolite clerk). Argument structures are subject to a very general constraint: only one argument of a given semantic type is allowed in each one. Analysis shows that apologize and CAUSE have arguments of the same type, with the thematic role agent. Hence the two cannot be combined into a single argument structure, hence the causative cannot exist. Thus generalizations and gaps in the system can be explained, not merely described within this system of representation. Further quite remarkable evidence comes from the discovery by Talmy (1985) that there are crosslinguistic differences in what meaning components can be combined. He showed that French systematically lacks verbs like English float, which encode motion and manner in a single morpheme.
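The argument-structure notation in (1) and the decomposition in (2) lend themselves to a small data-structure sketch. The Python below is illustrative only and is not drawn from the works cited: the names `Argument`, `LexicalEntry`, and `causativize` are hypothetical, and the role label "theme" for the single argument of suffocate is an assumption, while the agent role for apologize follows the analysis reported above. It encodes just two ideas from the text: the CAUSE component contributes its own, highest argument, and an argument structure may contain only one argument of a given semantic type.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Argument:
    variable: str  # placeholder such as "x" or "y"
    role: str      # thematic role label, e.g. "agent" or "theme"


@dataclass(frozen=True)
class LexicalEntry:
    form: str
    arguments: Tuple[Argument, ...]  # ordered from highest to lowest
    has_cause: bool = False          # True if the meaning includes CAUSE

    def subject(self) -> Argument:
        # The highest argument is realized as the syntactic subject.
        return self.arguments[0]


def causativize(entry: LexicalEntry) -> Optional[LexicalEntry]:
    """Return a causative variant of `entry`, or None if it is ruled out."""
    roles = {argument.role for argument in entry.arguments}
    # CAUSE brings its own agent, and only one argument of a given
    # semantic type is allowed in a single argument structure.
    if entry.has_cause or "agent" in roles:
        return None
    causer = Argument("x", "agent")
    return LexicalEntry(entry.form, (causer,) + entry.arguments, has_cause=True)


# (2b) (y suffocate); the "theme" label is an assumption of this sketch.
suffocate_b = LexicalEntry("suffocate", (Argument("y", "theme"),))
# (2a) (x CAUSE (y suffocate))
suffocate_a = causativize(suffocate_b)
# apologize already has an agent, so no causative can be built.
apologize = LexicalEntry("apologize", (Argument("y", "agent"),))
blocked = causativize(apologize)
```

Under these assumptions, `causativize` yields a two-argument entry for suffocate whose subject is the causer, and returns `None` for apologize. The check models a necessary condition only; it is not meant to predict every gap in the alternation system.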
On the one hand, many verbs participate in them and HEAD MOVEMENT and X-BAR THEORY). Since this discov- many languages show similar patterns. On the other hand, ery, the field has faced a completely new question: what not all verbs participate in them, and some fail to show the properties does the lexicon of functional morphemes have? alternation, apparently for no very good reason. For exam- Clearly, notions like thematic role are entirely irrelevant for ple, donate appears in only the first of the two configura- words like that and the. These morphemes code properties tions that give occurs in above, and similarly, rise does not like “type” (e.g., interrogative versus propositional) or defi- occur in the causative (The balloon rose into the sky, *They niteness. How much crosslinguistic variation can be shown rose the balloon into the sky). Yet rise does not have an to follow from differences in the functional lexicon? argument that is plausibly an agent, unlike apologize. Gaps Related work on learning attempts to establish whether such as these pose an important problem for learning: we these morphemes and associated phrases are present in can explain the existence of lexical generalizations only if early child language (Déprez and Pierce 1993; Clahsen learners generalize heavily, but then how can we explain the 1990/91; Radford 1990; see also PARAMETER-SETTING existence of exceptions? (See WORD MEANING, ACQUISITION APPROACHES TO ACQUISITION, CREOLIZATION, AND DIACH- OF; Pinker 1989, Gleitman and Landau 1994.) RONY; and MINIMALISM). A very appealing answer holds that there are no entirely The content-function word distinction has a long history arbitrary exceptions once the generalizations themselves are in cognitive science (see Halle, Bresnan, and Miller 1978). properly understood. 
This is probably a fair way to charac- The lexical representation of both types of morpheme is terize the general direction of the field, but it is not an easy now an issue of central importance within linguistic theory. commitment to make good on. One idea is that the meaning See also COMPUTATIONAL LEXICONS; LANGUAGE ACQUI- of the verb in one configuration is systematically different SITION; PSYCHOLINGUISTICS; SEMANTICS; SENTENCE PRO- from the meaning of the verb in others, as the result of a lex- CESSING ical rule changing the verb’s semantic representation (see —Jane Grimshaw Levin and Rappaport Hovav 1995). Each such lexical rule carries out a specified meaning change on any verb with the References appropriate semantics, with syntactic changes emerging as a consequence. Belletti, A., and L. Rizzi. (1988). Psych verbs and theta theory. However, consider a marginal but interpretable example Natural Language and Linguistic Theory 6: 291–352. Burzio, L. (1986). Italian Syntax: A Government-Binding like He sang his mother out of the house, where sang, under Approach. Dordrecht: Reidel. this line of analysis, would mean “to cause Y to move by Clahsen, H. (1990/91). Constraints on parameter setting: a gram- singing.” Why couldn’t there be a verb which meant only matical analysis of some acquisition stages in German child this and did not also mean what sang usually does? The best language. Language Acquisition 1: 361–391. answer is that this is not a possible word meaning, but then Déprez, V., and A. Pierce. (1993). Negation and functional projec- it cannot be the meaning of sang as used here. Some current tions in early grammar. Linguistic Inquiry 24: 25–67. work suggests that effects such as this extended use of sang Gleitman, L., and B. Landau, Eds. (1994). Lexical acquisition. Lin- are due to an interaction between the verb meaning and the gua 92: 1. syntactic and semantic context the verb appears in, which Goldberg, A. E. (1994). 
Constructions: A Construction Grammar makes it possible to construe the verb in a certain way. From Approach to Argument Structure. Chicago: University of Chi- cago Press. this perspective, related to that developed by Pustejovsky Grimshaw, J. (1990). Argument Structure. Linguistic Inquiry (1995) and in Construction Grammar (Goldberg 1994), Monograph 18. Cambridge, MA: MIT Press. verbs that do not alternate have some special property pro- Halle, M., J. Bresnan, and G. Miller, Eds. (1978). Linguistic The- hibiting alternation; massive alternation is the norm rather ory and Psychological Reality. Cambridge, MA: MIT Press. than the exception. Jackendoff, R. (1972). Semantic Interpretation in Generative Grammar. Cambridge, MA: MIT Press. The Functional Lexicon Levin, B., and M. Rappaport Hovav (1995). Unaccusativity: at the Syntax-Semantics Interface. Cambridge, MA: MIT Press. Although the argument structure of the predicate of a Pinker, S. (1989). Learnability and Cognition: The Acquisition of clause determines the number and kind of arguments that Argument Structure. Cambridge, MA: MIT Press. Lexicon, Neural Basis of 469 make semantic errors (e.g., they might produce “table” in Pustejovsky, J. (1995). The Generative Lexicon. Cambridge, MA: MIT Press. naming a chair or “tastes good, a fruit” in naming a pear) in Radford, A. (1990). Syntactic Theory and the Acquisition of all lexical processing tasks. Patients with selective damage English Syntax. Oxford: Blackwell. to the semantic component of the lexicon typically have Talmy, L. (1985). Lexicalization patterns: semantic structure in extensive left hemisphere damage involving the temporal, lexical forms. In T. Shopen, Ed., Language Typology and Syn- parietal, and frontal lobes. Converging evidence in support tactic Description 3: Grammatical Categories and the Lexicon. of the view that semantic information is distributed widely Cambridge: Cambridge University Press, pp. 57–149. 
Wasow, T. (1977). Transformations and the lexicon. In P. Culicover, T. Wasow, and A. Akmajian, Eds., Formal Syntax. New York: Academic Press, pp. 327–360.
Williams, E. (1981). Argument structure and morphology. Linguistic Review 1: 81–114.

Further Readings

Baker, M. (1988). Incorporation: A Theory of Grammatical Function Changing. Chicago: University of Chicago Press.
Dowty, D. (1991). Thematic proto-roles and argument selection. Language 67: 547–619.
Grimshaw, J. (1979). Complement selection and the lexicon. Linguistic Inquiry 10: 279–326.
Hale, K., and J. Keyser. (1993). On argument structure and the lexical expression of syntactic relations. In K. Hale and J. Keyser, Eds., The View from Building 20. Cambridge, MA: MIT Press.
Jackendoff, R. (1990). Semantic Structures. Cambridge, MA: MIT Press.

Lexicon, Neural Basis of

The lexical processing system is the collection of mechanisms that are used to store and retrieve our knowledge of the words of the language. Knowing a word means knowing its meaning, its phonological and orthographic forms, and its grammatical properties. How is this knowledge organized and represented in the brain? Two types of evidence have been used to answer this question. The major source of evidence has been the patterns of lexical deficits associated with brain damage in aphasic patients. More recently, functional neuroimaging methods—POSITRON EMISSION TOMOGRAPHY (PET) and functional MAGNETIC RESONANCE IMAGING (fMRI)—have played an increasingly important role. Evidence from neuropsychological and neuroimaging studies has converged on one widely shared conclusion: the mental LEXICON is organized into relatively autonomous neural subsystems in the left hemisphere, each dedicated to processing a different aspect of lexical knowledge.

One of the classic syndromes of APHASIA, anomia—a deficit in retrieving words for production—provides prima facie evidence for distinct representation of meaning and of lexical forms in the brain. Studies of anomic patients have shown that they are unable to produce the names of objects despite normal ability to recognize and define them, indicating a selective deficit in processing lexical forms. These patients tend to have more narrowly circumscribed damage, involving most often the left temporal lobe, but sometimes the parietal or frontal lobe or both. There is also evidence that the semantic system can be damaged independently of knowledge of lexical forms. The latter evidence has been obtained both with patients who have sustained focal brain damage due to strokes and patients with degenerative disorders such as Alzheimer’s disease. Both types of patients make semantic errors (e.g., they might produce “table” in naming a chair or “tastes good, a fruit” in naming a pear) in all lexical processing tasks. Patients with selective damage to the semantic component of the lexicon typically have extensive left hemisphere damage involving the temporal, parietal, and frontal lobes. Converging evidence in support of the view that semantic information is distributed widely in the left hemisphere has been obtained in functional neuroimaging studies with PET.

Some brain-damaged patients are selectively impaired in retrieving only the orthographic form (e.g., the spelling of the word chair) or only the phonological form of words (e.g., the sound of the word chair). Patients of this type can be entirely normal in their ability to understand and define words, but fail to retrieve the correct word form in one, but not the other, modality of output. These patterns of performance attest to the autonomy of phonological and orthographic lexical forms from each other and from meaning. Converging evidence for this conclusion comes from functional neuroimaging studies which have shown that distinct brain regions are activated when neurologically intact participants are engaged in processing the phonological (frontal-temporal) versus the orthographic (parietal-occipital) forms of words.

Damage to the semantic system can lead to disproportionate difficulties with specific semantic categories. The most frequently observed category-specific deficits have concerned the contrast between living and nonliving things. However, the deficits can be quite selective, affecting (or sparing) only animals or only plant life. The lesion sites typically associated with these deficits include the left temporal lobe, the posterior frontal lobe, and the inferior junction of the parietal and occipital lobes. The existence of semantic category-specific deficits was originally interpreted as reflecting a modality-based organization of conceptual knowledge in the brain. It was proposed that visual and functional/associative properties are represented in distinct areas of the brain and that these two sets of properties are differentially important in distinguishing between living and nonliving things, respectively. On this view, selective damage to one of the modality-specific knowledge subsystems would result in a semantic category-specific deficit. However, recent results have shown that category-specific deficits are not the result of damage to modality-specific but rather to modality-independent knowledge systems. These results, and the fact that the reliable categories of category-specific deficits are those of animals, plant life, artifacts, and conspecifics, have led to the proposal that conceptual knowledge is organized into broad, evolutionarily determined domains of knowledge. Functional neuroimaging results with neurologically intact participants have confirmed that the inferior temporal lobe and parts of the occipital lobe are activated in response to animal pictures and words, whereas more dorsal areas of the temporal lobe and parts of the frontal lobe are activated in response to artifacts.
One of the classic features of the speech of some aphasic patients is agrammatic production—a form of speech characterized by a relative paucity of function or closed-class words (articles, prepositions, auxiliaries, etc.). The disproportionate difficulty in producing closed-class words in some patients is in contrast to patients who show the reverse pattern of dissociation—selective difficulty with open-class words (nouns, verbs, and adjectives). But the dissociations of lexical processing deficits can be even more fine-grained than that: some patients are disproportionately impaired in producing verbs while others are disproportionately impaired in producing nouns, and some patients can be disproportionately impaired in comprehending one or the other class of words. Grammatical class effects can even be restricted to one modality of output or input. For example, there are patients who are impaired in producing verbs only in speaking (they can write verbs and can produce nouns both in speaking and in writing) and patients who are impaired in producing nouns only in speaking; and there are patients who fail to understand written but not spoken verbs. The fact that grammatical class effects can also be modality-specific implies a close link between word form and grammatical information. These results challenge the view that there exists a modality-neutral lexical node mediating between modality-specific lexical representations and word meaning. Damage to the left frontal lobe is typically associated with disproportionate difficulty in processing verbs and closed-class words, while damage to the left temporal lobe is associated with disproportionate difficulty in producing and comprehending nouns. Recent investigations with PET and event-related potentials (ERPs) have confirmed this general characterization of the roles of the frontal and temporal lobes in processing words of different grammatical classes.

Brain damage can also selectively affect different parts of words, revealing their internal structure. It is now well established that some aphasic patients have no difficulty in processing the stem of words (e.g., walk in walked) but fail to retrieve their correct inflectional suffixes (e.g., the -ed in walked), and that some patients can process normally the morphological affixes of words but not their stems. This double dissociation in processing different types of morphemes implies that the units of lexical representation in the brain are stems and inflectional affixes, and not whole words. Detailed single-case studies of aphasic patients have confirmed this conclusion, and have shown that difficulties in processing inflectional morphology tend to be associated with damage to more frontal areas of the left hemisphere, while difficulties in processing the stems of words are more likely to be associated with temporal lobe damage.

Although we still do not have a detailed understanding of the neural substrates of the lexicon, its general outlines are beginning to emerge, and it looks to be as follows: (1) the lexical processing system is distributed over a large area of the left hemisphere, involving the temporal, frontal, and parietal lobes; (2) different parts of the left hemisphere are dedicated to the storage and computation of different aspects of lexical knowledge—meaning, form, and grammatical information are represented autonomously; and (3) within each of the major components of the lexicon, the semantic and lexical form components, there are further fine-grained functional and neural distinctions.

See also BILINGUALISM AND THE BRAIN; GRAMMAR, NEURAL BASIS OF; LANGUAGE, NEURAL BASIS OF; SEMANTICS

—Alfonso Caramazza

Further Readings

Badecker, W., and A. Caramazza. (1991). Morphological composition in the lexical output system. Cognitive Neuropsychology 8(5): 335–367.
Buckingham, H. W., and A. Kertesz. (1976). Neologistic Jargon Aphasia. Amsterdam: Swets and Zeitlinger.
Butterworth, B., and D. Howard. (1987). Paragrammatism. Cognition 26: 1–37.
Caplan, D., L. Keller, and S. Locke. (1972). Inflection of neologisms in aphasia. Brain 95: 169–172.
Caramazza, A. (1997). How many levels of processing are there in lexical access? Cognitive Neuropsychology 14: 177–208.
Caramazza, A., and A. Hillis. (1990). Where do semantic errors come from? Cortex 16: 95–122.
Caramazza, A., and A. E. Hillis. (1991). Lexical organization of nouns and verbs in the brain. Nature 249: 788–790.
Caramazza, A., and J. Shelton. (1998). Domain-specific knowledge systems in the brain: The animate/inanimate distinction. Journal of Cognitive Neuroscience 10: 1–34.
Chertkow, H., D. Bub, and D. Caplan. (1992). Constraining theories of semantic memory processing: Evidence from dementia. Cognitive Neuropsychology 9: 327–365.
Damasio, A. R., and D. Tranel. (1993). Verbs and nouns are retrieved from separate neural systems. Proceedings of the National Academy of Sciences 90: 4957–4960.
Gainotti, G., and M. C. Silveri. (1996). Cognitive and anatomical locus of lesion in a patient with a category-specific semantic impairment for living beings. Cognitive Neuropsychology 13: 357–389.
Garrett, M. F. (1992). Disorders of lexical selection. Cognition 42: 143–180.
Goodglass, H. (1976). Agrammatism. In N. H. Whitaker and H. A. Whitaker, Eds., Studies in Neurolinguistics. New York: Academic Press.
Hart, J., R. S. Berndt, and A. Caramazza. (1985). Category-specific naming deficit following cerebral infarction. Nature 316: 439–440.
Hillis, A. E., and A. Caramazza. (1991). Category specific naming and comprehension impairment: A double dissociation. Brain 110: 613–629.
Kay, J., and A. W. Ellis. (1987). A cognitive neuropsychological case study of anomia: Implications for psychological models of word retrieval. Brain 110: 613–629.
McCarthy, R., and E. W. Warrington. (1985). Category specificity in an agrammatic patient: The relative impairment of verb retrieval and comprehension. Neuropsychologia 23: 709–727.
Martin, A., J. V. Haxby, F. M. Lalonde, C. L. Wiggs, and L. G. Ungerleider. (1995). Discrete cortical regions associated with knowledge of color and knowledge of action. Science 270: 868–889.
Martin, A., C. L. Wiggs, L. G. Ungerleider, and J. V. Haxby. (1996). Neural correlates of category-specific knowledge. Nature 379: 649–652.
Petersen, S. E., P. T. Fox, M. I. Posner, M. Mintem, and M. E. Raichle. (1989). Positron emission tomographic studies of the processing of single words. Journal of Cognitive Neuroscience 1: 153–170.
Rapp, B., and A. Caramazza. (1997). The modality-specific organization of grammatical categories: Evidence from impaired spoken and written sentence production. Brain and Language 56: 248–286.
Rumsey, J. M., B. Horwitz, B. C. Donohue, K. Nace, J. M. Maisog, and P. Andreason. (1997). Phonologic and orthographic components of word recognition: A PET-rCBF study. Brain 120: 729–760.
Shallice, T. (1988). From Neuropsychology to Mental Structure. Oxford: Oxford University Press.
Vanderberghe, R., C. Price, R. Wise, D. Josephs, and R. S. J. Frackowiak. (1996). Functional anatomy of a common semantic system for words and pictures. Nature 282: 254–256.
Warrington, E. K., and R. A. McCarthy. (1987). Categories of knowledge: Further fractionations and an attempted integration. Brain 110: 1269–1273.
Warrington, E. K., and T. Shallice. (1984). Category specific semantic impairments. Brain 107: 829–853.
Zingeser, L., and R. S. Berndt. (1990). Retrieval of nouns and verbs in agrammatism and anomia. Brain and Language 39: 14–32.

LFG

See LEXICAL FUNCTIONAL GRAMMAR

Life

See ARTIFICIAL LIFE; EVOLUTION; SELF-ORGANIZING SYSTEMS

Lightness Perception

The term lightness refers to the perceived whiteness or blackness of an opaque surface. The physical counterpart of lightness is reflectance, or the percentage of light a surface reflects. A good white surface reflects about ninety percent of the light that illuminates it, absorbing the rest; black reflects only about three percent. The eye, having no reflectance detectors, must compute lightness based only on the light reflected from the scene. But the intensity of light reflected from any given surface, called luminance, is a product of both its reflectance and the level of illumination. And although luminance thus varies with every change of illumination, lightness remains remarkably stable, an achievement referred to as lightness constancy.

There is a perceptual quality that corresponds to luminance called brightness. Brightness is to lightness as perception of visual angle is to perception of object size (see SPATIAL PERCEPTION). While brightness might be said to refer to our sensation of light intensity, lightness refers to a perceived property of the object itself and is essential to object recognition (see SURFACE PERCEPTION).

Despite the remarkable correlation between lightness and physical reflectance, neither the stimulus variable on which it is based nor the computation that finally produces lightness has been agreed upon. HELMHOLTZ (1866), recognizing that lightness cannot be based simply on luminance, argued that the level of illumination is unconsciously taken into account, but this approach has remained vague and unconvincing. Wallach (1948) avoided the whole issue of computing the illumination with the dramatically simple proposal that lightness depends on the ratio between the luminance of a surface and the luminance of its background. He demonstrated that such a local edge ratio predicts perceived lightness in very simple displays such as a disk surrounded by an annulus. And he observed that, even for complex images, luminance ratios, unlike absolute luminance values, tend to remain invariant when the illumination changes. Others (Hurvich and Jameson 1966; Cornsweet 1970), lured by the prospect of a physiological account of lightness, have sought to reduce Wallach’s ratio findings to the neural mechanism of lateral inhibition.

Subsequent research has suggested that what gets encoded are luminance ratios at edges, not the absolute luminances of points, and that lateral inhibition plays a key role in the neural encoding of these ratios. But both Wallach’s ratio principle and its physiological reduction are now viewed as much too simplistic an account of lightness. Recent work has dealt with three important limitations: (1) the computation is too local; (2) it produces large errors when applied to illuminance edges; and (3) only relative lightness values can be determined, unless an anchoring rule is also given.

1. The first of these can be illustrated in simultaneous lightness contrast, the familiar textbook illusion, shown in figure 1. On its face this contrast illusion appears to provide further evidence of the relational determination of lightness. But if lightness depended simply on local luminance ratios, the two targets should look as different as black and white. So, in fact, the very weakness of the illusion shows that lightness is not tied to background luminance as strongly as Wallach’s ratio rule implies. Quantitative work (Gilchrist 1988) has shown that lightness is just as independent of background luminance as it is of illumination level and this appears to require the ability to compute luminance ratios between remote regions of the image. In the early 1970s several writers (Land and McCann 1971; Arend, Buehler, and Lockhead 1971; Whittle and Challands 1969) proposed a process of edge integration by which all edge ratios along a path between two remote regions are mathematically combined.

Figure 1. Simultaneous lightness contrast. Although this illusion shows an influence of the target/background luminance ratio, the weakness of the illusion shows that lightness is no simple product of that ratio.

2. Gilchrist (1979), noting that many edges in the retinal image represent changes in the illumination (such as shadow boundaries), not changes in reflectance, argued that edge integration cannot work without some prior process of edge classification (see figure 2). Recent computational models (Bergström 1977; Adelson 1993; Gilchrist 1979) have relied on concepts like edge integration and edge classification to decompose the retinal image into component images—called intrinsic images—that represent the physical values of surface reflectance and illumination. This approach has the advantage of providing an account of our perception of the illumination, not just of surface lightness.

Figure 2. Lightness could not be based on local edge ratios unless the edges were first classified. (After Adelson 1993.)
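The relationships described so far can be made concrete with a little arithmetic. The following Python sketch is a toy illustration, not a model taken from the literature cited here: the function names, the arbitrary illumination units, and the four-region path are all assumptions, and each border is labeled from the known scene values, whereas the visual system would have to infer that classification. It shows (a) luminance as the product of reflectance and illumination, (b) that a local edge ratio is unchanged by a uniform change of illumination, and (c) that multiplying every edge ratio along a path gives the luminance ratio between remote regions, which matches their reflectance ratio only if illumination edges are set aside first.

```python
from math import isclose, prod

WHITE, BLACK = 0.90, 0.03  # reflectances quoted in the text (90% and 3%)


def luminance(reflectance, illumination):
    """(a) Light reaching the eye: reflectance times illumination."""
    return reflectance * illumination


def edge_ratio(near, far):
    """Luminance ratio across a border between two regions.

    Each region is a (reflectance, illumination) pair.
    """
    return luminance(*far) / luminance(*near)


# (b) The same white/black border under dim and under bright light.
dim_ratio = edge_ratio((WHITE, 10.0), (BLACK, 10.0))
bright_ratio = edge_ratio((WHITE, 1000.0), (BLACK, 1000.0))
ratio_is_invariant = isclose(dim_ratio, bright_ratio)

# (c) A hypothetical path: white paper, gray paper, the same gray in
# shadow, then black paper in shadow. Each border changes either the
# reflectance or the illumination, never both (an assumption).
path = [(WHITE, 100.0), (0.30, 100.0), (0.30, 10.0), (BLACK, 10.0)]


def classified_edges(regions):
    """Pair each edge ratio with a label derived from the true scene values."""
    edges = []
    for near, far in zip(regions, regions[1:]):
        kind = "illumination" if near[1] != far[1] else "reflectance"
        edges.append((edge_ratio(near, far), kind))
    return edges


def integrate(edges, kinds=("reflectance", "illumination")):
    """Multiply the ratios of all edges whose label is in `kinds`."""
    return prod(ratio for ratio, kind in edges if kind in kinds)


edges = classified_edges(path)
luminance_ratio = integrate(edges)                             # every edge
reflectance_ratio = integrate(edges, kinds=("reflectance",))  # classified
```

In this toy scene the unclassified product equals the luminance ratio of the two end regions and understates the black-to-white reflectance ratio by the size of the shadow step, while the product over reflectance edges alone recovers it. The result is still only a ratio: nothing here assigns a specific shade of gray to either surface.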
Computing absolute or specific shades of gray logical Optics. New York: Optical Society of America. requires an anchoring rule, a rule that ties some locus on the Helson, H. (1964). Adaptation-Level Theory. New York: Harper scale of perceived grays to some feature of the retinal and Row. image. One candidate rule, endorsed by Wallach (1948) and Hurvich, L., and D. Jameson. (1966). The Perception of Brightness by Land and McCann (1971), says that the highest lumi- and Darkness. Boston: Allyn and Bacon. nance is white, with lower luminances scaled relative to this Land, E. H., and J. J. McCann. (1971). Lightness and retinex the- standard. An alternative rule, implicit in concepts like the ory. Journal of the Optical Society of America 61: 1–11. Li, X., and A. Gilchrist. (Forthcoming). Relative area and relative gray world assumption and Helson’s adaptation-level theory luminance combine to anchor surface lightness values. Percep- (1964), says that the average luminance is middle gray, with tion and Psychophysics. higher and lower values scaled relative to the average. When Wallach, H. (1948). Brightness constancy and the nature of achro- these rules are tested by presenting a display consisting of a matic colors. Journal of Experimental Psychology 38: 310–324. very restricted range of grays, it is found that the highest Whittle, P., and P. D. C. Challands. (1969). The effect of back- luminance appears white, but the average does not appear ground luminance on the brightness of flashes. Vision Research middle gray (Cataliotti and Gilchrist 1995; Li and Gilchrist 9: 1095–1110. in press). But relative area also plays an important role. An increase in the area of the darker regions at the expense of Further Readings the lighter causes the darker regions to lighten in gray value and the lightest region to appear self-luminous. Gilchrist, A., Ed. (1994). Lightness, Brightness, and Transparency. These rules of anchoring by highest luminance and rela- Hillsdale, NJ: Erlbaum. 
Gilchrist, A., C. Kossyfidis, F. Bonato, T. Agostini, J. Cataliotti, X. tive area apply to both simple visual displays and to frame- Li, B. Spehar, V. Annan, and E. Economou. (Forthcoming). An works or groups embedded within complex images. In anchoring theory of lightness perception. Psychological complex images, however, perceived lightness can be pre- Review. dicted by a compromise between lightness values computed Hurlbert, A. (1986). Formal connections between lightness algo- within these relatively local frameworks and lightness val- rithms. Journal of the Optical Society of America A. Optics and ues computed across the entire visual field. The weighting Image Science 3: 1684–1693. in this compromise increasingly shifts to the local frame- Koffka, K. (1935). Principles of Gestalt Psychology. New York: work as that becomes larger and more highly articulated. If Harcourt, Brace, and World, pp. 240–264. the study of anchoring has undermined the portrait of a MacLeod, R. B. (1932). An experimental investigation of bright- highly ratiomorphic lightness computation recovering ver- ness constancy. Archives of Psychology 135: 5–102. Wallach, H. (1976). On Perception. New York: Quadrangle/The idical values of reflectance and illumination, it has neverthe- New York Times Book Co. less provided a remarkable account of perceptual errors. Emerging anchoring models portray a more rough-and- Limbic System ready system (see MID-LEVEL VISION) that, while subject to apparently unnecessary errors, is nevertheless quite robust in the face of a wide variety of challenges to perceptual ver- Much as other systems with a historic origin (e.g., the retic- idicality. ular system), the limbic system (LS) is difficult to define as See also COLOR VISION; DEPTH PERCEPTION; GESTALT it has gone through numerous modifications, adaptations, PERCEPTION; STEREO AND MOTION PERCEPTION; TEXTURE; refinements, and expansions during the more than 100 years TRANSPARENCY of its existence. 
Furthermore, problems with its description arise from the facts that it is frequently composed of only —Alan Gilchrist Limbic System 473 portions of larger units (e.g., only a minority of the thalamic structures of this system expand and increase in differenti- nuclei are included), and that it varies considerably among ation (e.g., Stephan 1975; Armstrong 1986). (This expan- species (e.g., the olfactory system, a portion of the LS, first sion is, however, less prominent than that of neocortical expands considerably in mammals as opposed to nonmam- areas.) mals such as birds, but then shrinks again in whales and pri- Based on comparative anatomy and evolutionary theory, mates). Basically the LS constitutes an agglomerate of brain MacLean (1970) divided the brain into three general com- structures with a cortical core around the corpus callosum partments: (1) a protoreptilian portion (spinal cord, parts of and within the medial temporal lobe and with a number of the midbrain and diencephalon, BASAL GANGLIA); (2) a subcortical structures extending from the hindbrain to the paleomammalian one—in principle the LS—, and (3), a forebrain (figure 1). Many of these structures are central to neomammalian one, largely the neocortical mantle. The LS the processing of EMOTIONS and MEMORY, including the therefore constitutes a link between the oldest and the new- evaluation of sensory functions such as PAIN. est attributes of the mammalian, in particular the human, The term le grand lobe limbique was coined by BROCA brain. It includes (1) the limbic cortex, a circumscribed (1878) as an anatomical structure. Broca and his contem- cortical region along the medial rim of the cerebrum; (2) poraries nevertheless thought that the limbic structures the limbic nuclei of the tel-, di-, and mesencephalon; and were largely olfactory and might therefore be subsumed (3) the fiber tracts interconnecting these structures. The under the term rhinencephalon (cf. 
Laurent 1997 for olfac- limbic cortex is further subdividable into an inner (“allo- tory processing). Later research shifted the dominant func- cortical,” i.e., constituted of the phylogenetically oldest, tional implications of the LS to the processing of emotions three-layered cortex) and an outer ring (“juxtallocortical,” and memory and modified the regions to be subsumed i.e., constituted of transitional, four- or five-layered cortex; under this term. This discussion has continued until today Isaacson 1982). (Papez 1937; MacLean 1952, 1970; Nauta 1979; LeDoux The core of the LS is included in the so-called Papez cir- 1996; Nieuwenhuys 1996). While the LS (the term was cuit (Papez 1937) or medial limbic circuit (see figure 1). introduced by MacLean 1952) is frequently regarded as an This circuit is primarily engaged in the transfer of informa- ancient brain system which regresses during phylogeny, tion from short-term to long-term memory. Another circuit numerous more recent studies have shown that, on the con- that is more closely related to emotional processing but still trary—with the exception of the olfactory regions—most relevant to mnemonic information processing as well is the Figure 1. Schematic section through the forebrain (i.e., without the thalamus to the cingulate gyrus and to portions of the hippocampal brain stem) showing the ringlike arrangement of the limbic region, and the cingulum fibers that run near the indusium griseum, structures around the corpus callosum and below it. The Papez an extension of the hippocampal formation (Irle and Markowitsch circuit is formed principally by the HIPPOCAMPUS, mammillary 1982). All other structures mentioned are usually regarded as bodies, anterior thalamus, cingulate gyrus, and is interconnected via belonging to the limbic system as well. Some brain stem nuclei the fornix, mammillothalamic tract, fibers from the anterior might be added. 
basolateral limbic circuit, or lateral limbic circuit (Sarter and Markowitsch 1985). It is constituted by the triangle formed by the amygdala, the mediodorsal thalamic nucleus, and the basal forebrain regions. Interconnecting fibers are the ventral amygdalofugal pathway, the anterior thalamic peduncle, and the bandeletta diagonalis (the circuit is depicted in figure 48.4 of Markowitsch 1995).

As mentioned above, there have been various attempts to expand the LS (Isaacson 1982; Nauta 1979; Nieuwenhuys 1996). All of these nevertheless agree in principle with MacLean’s (1970) proposal to see this system as the mediator between the neocortical mantle (dealing with sensory processing, memory storage, and the initiation and supervision of behavior) and the “lower,” largely motoric regions of the brain stem and the basal ganglia.

Though the structures of the LS are predominantly involved in emotional, motivational, and memory-related aspects of behavior, some subclustering should be noted: the septal region, the amygdala, and the cingulate cortex are largely engaged in the control of emotions ranging from heightened ATTENTION and arousal to rage and aggression. In evaluating emotions the septum may partly act in opposition to the amygdala and cingulate cortex. The amygdala is furthermore involved in motivational regulations and in evaluating information of biological or social significance (and therefore indirectly in memory processing). Damage to the amygdala may result in conditions of tameness, hypersexuality, amnesia, agnosia, aphagia, and hyperorality (Klüver-Bucy syndrome). The hippocampal formation and surrounding structures are principally engaged in transferring memories from short-term to long-term storage, but do have additional functions (e.g., in the spatial and possibly also in the time domains). Anterior and medial nuclei of the thalamus and the mammillary body of the hypothalamus control memory transfer as well (“bottleneck structures”; Markowitsch 1995). Also, these nuclear configurations (to which nonspecific thalamic nuclei belong as well) control further forms of behavior ranging from sleep to possibly consciousness. Between different species, functional shifts of limbic structures have been noted.

There is consequently both functional unity and diversity within the LS. As an example, it is still largely unknown whether the medial temporal lobe structures (with the hippocampus as core) and the medial diencephalic structures (medial and anterior thalamus, mammillary bodies) constitute one or two memory systems. One reason for this uncertainty can be sought in the multitude of fiber bundles interconnecting LS structures in an extensive network. High-resolution dynamic imaging research (e.g., POSITRON EMISSION TOMOGRAPHY) may provide answers in the near future.

See also EMOTION AND THE ANIMAL BRAIN; EMOTION AND THE HUMAN BRAIN; THALAMUS

—Hans J. Markowitsch

References
Armstrong, E. (1986). Enlarged limbic structures in the human brain: the anterior thalamus and medial mamillary body. Brain Research 362: 394–397.
Broca, P. (1878). Anatomie comparée des circonvolutions cérébrales. Le grand lobe limbique et la scissure limbique dans la série des mammifères. Revue Anthropologique 2: 385–498.
Irle, E., and H. J. Markowitsch. (1982). Connections of the hippocampal formation, mamillary bodies, anterior thalamus and cingulate cortex. A retrograde study using horseradish peroxidase in the cat. Experimental Brain Research 47: 79–94.
Isaacson, R. L. (1982). The Limbic System. 2nd ed. New York: Plenum Press.
Laurent, G. (1997). Olfactory processing: maps, time and codes. Current Opinion in Neurobiology 7: 547–553.
LeDoux, J. E. (1996). The Emotional Brain. New York: Simon and Schuster.
MacLean, P. D. (1952). Some psychiatric implications of physiological studies of frontotemporal portion of limbic system (visceral brain). Electroencephalography and Clinical Neurophysiology 4: 407–418.
MacLean, P. D. (1970). The triune brain, emotion and scientific bias. In F. O. Schmitt, Ed., The Neurosciences: Second Study Program. New York: Rockefeller University Press, pp. 336–348.
Markowitsch, H. J. (1995). Anatomical basis of memory disorders. In M. S. Gazzaniga, Ed., The Cognitive Neurosciences. Cambridge, MA: MIT Press, pp. 665–679.
Nauta, W. J. H. (1979). Expanding borders of the limbic system concept. In T. Rasmussen and R. Marino, Eds., Functional Neurosurgery. New York: Raven Press, pp. 7–23.
Nieuwenhuys, R. (1996). The greater limbic system, the emotional motor system and the brain. Progress in Brain Research 107: 551–580.
Papez, J. W. (1937). A proposed mechanism of emotion. Archives of Neurology and Psychiatry 38: 725–743.
Sarter, M., and H. J. Markowitsch. (1985). The amygdala’s role in human mnemonic processing. Cortex 21: 7–24.
Stephan, H. (1975). Allocortex. Handbuch der mikroskopischen Anatomie des Menschen, vol. 4, part 9. Berlin: Springer-Verlag.

Further Readings
Cahill, L., R. Babinsky, H. J. Markowitsch, and J. L. McGaugh. (1995). Involvement of the amygdaloid complex in emotional memory. Nature 377: 295–296.
Cramon, D. Y. von, H. J. Markowitsch, and U. Schuri. (1993). The possible contribution of the septal region to memory. Neuropsychologia 31: 1159–1180.
Groenewegen, H. J., C. I. Wright, and A. V. J. Beijer. (1996). The nucleus accumbens: gateway for limbic structures to reach the motor system? Progress in Brain Research 107: 485–511.
Lilly, R., J. L. Cummings, D. F. Benson, and M. Frankel. (1983). The human Klüver-Bucy syndrome. Neurology 33: 1141–1145.
Macchi, G. (1989). Anatomical substrate of emotional reactions. In L. R. Squire and G. Gainotti, Eds., Handbook of Neuropsychology, vol. 3. Amsterdam: Elsevier, pp. 283–304.
Markowitsch, H. J., P. Calabrese, M. Würker, H. F. Durwen, J. Kessler, R. Babinsky, D. Brechtelsbauer, L. Heuser, and W. Gehlen. (1994). The amygdala’s contribution to memory—a PET-study on two patients with Urbach-Wiethe disease. Neuroreport 5: 1349–1352.
Mesulam, M.-M. (1985). Patterns in behavioral neuroanatomy: association areas, the limbic system, and behavioral specialization. In M.-M. Mesulam, Ed., Principles of Behavioral Neurology. Philadelphia: F. A. Davis, pp. 1–70.
Reep, R. (1984). Relationship between prefrontal and limbic cortex: a comparative anatomical review. Brain, Behavior and Evolution 25: 5–80.
Schneider, F., R. E. Gur, L. H. Mozley, R. J. Smith, P. D. Mozley, D. M. Censits, A. Alavi, and R. C. Gur. (1995). Mood effects on limbic blood flow correlate with emotional self-rating: a PET study with oxygen-15 labeled water. Psychiatry Research: Neuroimaging 61: 265–283.
Scoville, W. B., and B. Milner. (1957). Loss of recent memory after bilateral hippocampal lesions. Journal of Neurology, Neurosurgery and Psychiatry 20: 11–21.
Tulving, E., and H. J. Markowitsch. (1997). Memory beyond the hippocampus. Current Opinion in Neurobiology 7: 209–216.

Linguistic Relativity Hypothesis

The linguistic relativity hypothesis is the proposal that the particular language one speaks influences the way one thinks about reality. The hypothesis joins two claims. First, languages differ significantly in their interpretations of experience—both what they select for representation and how they arrange it. Second, these interpretations of experience influence thought when they are used to guide or support it. Because the first claim is so central to the hypothesis, demonstrations of linguistic differences in the interpretation of experience are sometimes mistakenly regarded as demonstrations of linguistic relativity and demonstrations of some commonalities are taken as disproof, but the assessment of the hypothesis necessarily requires evaluating the cognitive influence of whatever language differences do exist. Accounts vary in the proposed mechanisms of influence and in the power attributed to them—the strongest version being a strict linguistic determinism (based, ultimately, on the identity of language and thought). Linguistic relativity proposals should be distinguished from more general concerns about how speaking any natural language whatsoever influences thinking (e.g., the general role of language in human intellectual functioning) and discourse-level concerns with how using language in a particular way influences thinking (e.g., schooled versus unschooled). Ultimately, however, all these levels interrelate in determining how language influences thought.

Interest in the intellectual significance of the diversity of language categories has deep roots in the European tradition (Aarsleff 1982). Formulations related to contemporary ones appeared in England (Locke), France (Condillac, Diderot), and Germany (Hamann, Herder) near the beginning of the eighteenth century. They were stimulated by opposition to the universal grammarians, by concerns about the reliability of language-based knowledge, and by practical efforts to consolidate national identities and cope with colonial expansion. Work in the nineteenth century, notably that of Humboldt in Germany and SAUSSURE in Switzerland and France, drew heavily on this earlier tradition and set the stage for contemporary approaches. The linguistic relativity proposal received new impetus and reformulation in America during the early twentieth century in the work of anthropological linguists SAPIR (1949) and Whorf (1956) (hence the common designation as “the Sapir-Whorf hypothesis”). They emphasized direct study of diverse languages and rejected the hierarchical rankings of languages and cultures characteristic of many European approaches.

Despite enduring philosophical interest in the question (e.g., Quine 1960), there has been little empirical research that both compares linguistic meaning structures and then independently assesses thought (Lucy 1992a). This stems partly from the interdisciplinary nature of the problem and partly from concern about the implications of relativism and determinism. Empirical efforts fall into three broad types.

Structure-centered approaches begin with an observed difference between languages, elaborate the interpretations of reality implicit in them, and then seek evidence for their influence on thought. The approach remains open to unexpected interpretations of reality but often has difficulty establishing a neutral basis for comparison. The classic example of a structure-centered approach is Whorf’s pioneering comparison of Hopi and English (1956) in which he argues for different conceptions of ‘time’ in the two languages as a function of whether cyclic experiences are classed as like ordinary objects (English) or as recurrent events (Hopi). The most extensive recent effort to extend and improve the comparative fundamentals in a structure-centered approach has sought to establish a relation between variations in grammatical number marking and attentiveness to number and form (Lucy 1992b).

Domain-centered approaches begin with a domain of experienced reality, typically characterized independently of language(s), and ask how various languages select from and organize it. The approach facilitates controlled comparison but often at the expense of regimenting the linguistic data rather narrowly. The classic example of this approach shows that some colors are more lexically encodable than others and that more codable colors are remembered better (Brown and Lenneberg 1954). This approach was later extended to argue that there are crosslinguistic universals in the encoding of the color domain such that a small number of “basic” color terms emerge in languages as a function of biological constraints (Berlin and Kay 1969). This research has been widely accepted as evidence against the linguistic relativity hypothesis, although it actually deals with constraints on linguistic diversity. Subsequent research has challenged the universal claim and shown that different color term systems do influence COLOR CATEGORIZATION and memory. The most successful effort to improve the quality of the linguistic comparison in a domain-centered approach has sought to show cognitive differences in the spatial domain between languages favoring the use of body coordinates to describe arrangements of objects (e.g., the man is left of the tree) and those favoring systems anchored in cardinal direction terms or topographic features (e.g., the man is east/uphill of the tree; Levinson 1996).

Behavior-centered approaches begin with a marked difference in behavior that the researcher comes to believe has its roots in a pattern of thinking arising from language practices. The behavior at issue typically has clear practical consequences (either for theory or for native speakers), but because the research does not begin intending to address the linguistic relativity question, the theoretical and empirical analyses of language and reality are often weak. An example of a behavior-centered approach is the effort to account for differences in Chinese and English speakers’ facility with counterfactual or hypothetical reasoning by reference to the marking of counterfactuals in the two languages (Bloom 1981).

The continued relevance of the linguistic relativity issue seems assured by the same impulses found historically: the patent relevance of language to human sociality and intellect, the reflexive concern with the role of language in intellectual method, and the practical encounter with diversity.

See also CONCEPTS; CULTURAL VARIATION; LANGUAGE AND CULTURE; LANGUAGE AND THOUGHT; LINGUISTIC UNIVERSALS AND UNIVERSAL GRAMMAR; LINGUISTICS, PHILOSOPHICAL ISSUES

—John A. Lucy

References
Aarsleff, H. (1982). From Locke to Saussure. Minneapolis, MN: University of Minnesota Press.
Berlin, B., and P. Kay. (1969). Basic Color Terms. Berkeley and Los Angeles: University of California Press.
Bloom, A. H. (1981). The Linguistic Shaping of Thought. Hillsdale, NJ: Erlbaum.
Brown, R. W., and E. H. Lenneberg. (1954). A study in language and cognition. Journal of Abnormal and Social Psychology 49: 454–462.
Levinson, S. C. (1996). Relativity in spatial conception and description. In J. J. Gumperz and S. C. Levinson, Eds., Rethinking Linguistic Relativity. Cambridge: Cambridge University Press.
Lucy, J. A. (1992a). Language Diversity and Thought. Cambridge: Cambridge University Press.
Lucy, J. A. (1992b). Grammatical Categories and Cognition. Cambridge: Cambridge University Press.
Quine, W. (1960). Word and Object. Cambridge, MA: MIT Press.
Sapir, E. (1949). The selected writings of Edward Sapir. In D. G. Mandelbaum, Ed., Language, Culture, and Personality. Berkeley and Los Angeles: University of California Press.
Whorf, B. L. (1956). Language, Thought, and Reality: Selected Writings of Benjamin Lee Whorf, J. B. Carroll, Ed. Cambridge, MA: MIT Press.

Further Readings
Aarsleff, H. (1988). Introduction. In W. von Humboldt, Ed., On Language: The Diversity of Human Language-Structure and its Influence on the Mental Development of Mankind, trans. by P. Heath. Cambridge: Cambridge University Press.
Grace, G. W. (1987). The Linguistic Construction of Reality. London: Croom Helm.
Gumperz, J. J., and S. C. Levinson, Eds. (1996). Rethinking Linguistic Relativity. Cambridge: Cambridge University Press.
Hardin, C. L., and L. Maffi, Eds. (1997). Color Categories in Thought and Language. Cambridge: Cambridge University Press.
Hill, J. H., and B. Mannheim. (1992). Language and world view. Annual Review of Anthropology 21: 381–406.
Hunt, E., and F. Agnoli. (1991). The Whorfian Hypothesis: a cognitive psychology perspective. Psychological Review 98: 377–389.
Kay, P., and C. K. McDaniel. (1978). The linguistic significance of the meanings of basic color terms. Language 54: 610–646.
Koerner, E. F. K. (1992). The Sapir-Whorf Hypothesis: a preliminary history and a bibliographic essay. Journal of Linguistic Anthropology 2: 173–178.
Lakoff, G. (1987). Women, Fire, and Dangerous Things: What Categories Reveal about the Mind. Chicago: University of Chicago Press.
Lee, P. (1997). The Whorf Theory Complex: A Critical Reconstruction. Amsterdam: John Benjamins.
Levinson, S. C. (1997). From outer to inner space: linguistic categories and nonlinguistic thinking. In J. Nuyts and E. Pederson, Eds., The Relationship between Linguistic and Conceptual Representation. Cambridge: Cambridge University Press.
Lucy, J. A. (1996). The scope of linguistic relativity: an analysis and review of empirical research. In J. J. Gumperz and S. C. Levinson, Eds., Rethinking Linguistic Relativity. Cambridge: Cambridge University Press.
Lucy, J. A. (1997). Linguistic relativity. Annual Review of Anthropology 26: 291–312.
Lucy, J. A., and R. A. Shweder. (1979). Whorf and his critics: linguistic and nonlinguistic influences on color memory. American Anthropologist 81: 581–615.
Putnam, H. (1981). Philosophical Papers. Cambridge: Cambridge University Press.
Schultz, E. A. (1990). Dialogue at the Margins: Whorf, Bakhtin, and Linguistic Relativity. Madison: University of Wisconsin Press.
Silverstein, M. (1979). Language structure and linguistic ideology. In P. Clyne, W. Hanks, and C. Hofbauer, Eds., The Elements: A Parasession on Linguistic Units and Levels. Chicago: Chicago Linguistic Society.
Wierzbicka, A. (1992). Semantics, Culture, and Cognition: Universal Human Concepts in Culture-Specific Configurations. Oxford: Oxford University Press.

Linguistic Stress

See STRESS, LINGUISTIC

Linguistic Theory

See INTRODUCTION: LINGUISTICS AND LANGUAGE

Linguistic Universals and Universal Grammar

A child’s linguistic system is shaped to a significant degree by the utterances to which that child has been exposed. That is why a child speaks the language and dialect of his family and community. Nonetheless, there are aspects of the linguistic system acquired by the child that do not depend on input data in this way. Some cases of this type, it has been argued, reflect the influence of a genetically prespecified body of knowledge about human language. In the literature on GENERATIVE GRAMMAR, the term Universal Grammar—commonly abbreviated UG—refers to this body of “hard-wired” knowledge.

Questions concerning the existence and nature of UG arise in all areas of linguistics (see discussion in SYNTAX, PHONOLOGY, and SEMANTICS). Research on these questions constitutes a principal point of contact between linguistics and the other cognitive sciences.

Three streams of evidence teach us about the existence and nature of UG. One stream of evidence comes from crosslinguistic investigation of linguistic universals, discussed in the article on TYPOLOGY. Crosslinguistic investigations help us learn whether a property found in one language is also found in other unrelated languages, and, if so, why. Another stream of evidence concerning UG comes from investigation of LANGUAGE ACQUISITION and learnability, especially as these investigations touch on issues of POVERTY OF THE STIMULUS ARGUMENTS. Work on acquisition and learnability helps us understand whether a property found in the grammar of an individual speaker is acquired by imitation of input data or whether some other reason for the existence of this property must be sought. Finally, evidence bearing on the specially linguistic nature of UG comes from research on MODULARITY AND LANGUAGE. Features of language whose typological and acquisitional footprint suggests an origin in UG may be confirmed as reflections of UG if they reflect aspects of cognition that are to some degree language-specific and “informationally encapsulated.” If a fact about an individual speaker’s grammar turns out to be a fact about grammars of all the world’s languages, if it is demonstrably not a fact acquired in imitation of input data, and if it appears to be specific to language, then we are warranted to suspect that the fact arose from a specific feature of UG.

The questions one asks in the process of building the theory of UG are varied and complex. Suppose the linguist discovers that a property P of one language is present in a variety of other languages. It is often possible that P arises from some more general property of cognition. For example, the repertoire of linguistically relevant THEMATIC ROLES such as “experiencer” or “agent” may reflect language-independent facts about the categorization of events—and therefore fall outside of UG. On the other hand, while the repertoire of thematic roles may be language-independent, the opposite is true of the apparently universal mapping of specific thematic roles onto specific designated syntactic positions—for example, the fact that agents are mapped universally onto a structurally more prominent position than patients (e.g., subject position). This specifically linguistic mapping thus constitutes one of the properties attributed to UG.

It is also important to try to distinguish UG-based universals from apparent universals that merely reflect usability conditions on languages that must serve a communicative function (functional universals). For example, there is probably a lower bound to the size of a language’s phonemic inventory. Does this restriction form part of UG? Not necessarily. It is equally possible that the limitation merely reflects a consequence of usability conditions for linguistic systems. The words of a language whose only phonemes are /m/ and /a/ would be extraordinarily long and hard to distinguish. Such a language might fall within the range permitted by UG, yet never occur in nature because of its dysfunctionality.

Because of the ever-present possibility that a universal may have a functional explanation, researchers interested in discovering properties of language that derive from UG often focus on those universals for which functional explanations are the least likely. For example, syntactic research has paid particular attention to a number of limitations on form-meaning pairs that have just this property of “dysfunctionality.” One example is a set of restrictions specific to “how” and “why” questions. “How did you think Mary solved the problem?” can be a question about Mary’s problem-solving methods, but the sentence “How did you ask if Mary solved the problem?” cannot. (It can only be a question about methods of asking.) The restriction concerns the domains from which WH-MOVEMENT may apply, which in turn correlates with sentence meaning. The restriction appears to be a genuine universal, already detected in a wide variety of languages whose grammars otherwise diverge in a number of ways. Crucially, the restriction makes no evident contribution to usability. Just the opposite: it prevents speakers from posing perfectly sensible questions, except through circumlocution—for example, “I know that you asked if Mary had solved the problem in some particular way. What was that way?” The study of such seemingly dysfunctional aspects of language has provided an especially clear path to a preliminary understanding of UG. This fact also explains why the data of generative grammar stray so often from “everyday” linguistic facts—a central difference between the concerns of generative grammarians and those researchers more concerned with “everyday” language use (a group that includes some sociolinguists as well as computational linguists interested in practical language technologies).

The existence of “Universal Grammar” (uppercase) does not necessarily entail the existence of a “universal grammar” (lowercase)—in the sense of a usable linguistic system wholly determined by genetic factors. UG must allow for language variation, though by its very nature it restricts the range of variation. This is why certain nonuniversal properties of language nonetheless recur in widely scattered, unrelated languages, while other equally imaginable properties are never found. For example, the placement of the finite verb in “second” position characteristic of the Germanic languages (see HEAD MOVEMENT) is also found in Vata (Ivory Coast; Koopman 1983), Kashmiri (Bhatt 1995), and Karitiana (Brazil; Storto 1996). By contrast, in no known language are verbs obligatorily placed in third position. In other words, UG allows languages to vary—but only up to a point.

There are several theories of how variation is built into UG. One proposal holds that the principles of UG define the parameters of possible variation. Language acquisition involves “setting” these parameters (see PARAMETER-SETTING APPROACHES TO ACQUISITION, CREOLIZATION, AND DIACHRONY). Another suggestion, advanced by Borer (1981) and Borer and Wexler (1987), holds that true variation is limited to the LEXICON (one aspect of language that we know must vary from language to language). Apparent syntactic variation on this view arises from the differing syntactic requirements of lexical items (see also SYNTAX, ACQUISITION OF). Another proposal, developed within OPTIMALITY THEORY, attributes variation to differences in the ability of particular grammatical principles to nullify the action of other principles with which they conflict (i.e., differences in “constraint ranking”).

It is not entirely clear what aspects of UG are subject to variation. In particular, though no one doubts that syntactic and phonological systems vary across languages, the question of variation in semantics is more contested. The details of semantic interpretation are probably less obvious to young children acquiring language than are the details of word positioning and word pronunciation that provide evidence about syntax and phonology. Consequently, it is conceivable (though not inevitable) that the laws governing compositional semantic interpretation of syntactic structures are wholly determined by UG—hence invariant across languages. In fact, the notion and the term “Universal Grammar” were borrowed by Chomsky (1965, 1966) from an earlier grammatical tradition that explicitly sought universal semantic roots of syntax (for example, the 1660 Port-Royal Grammaire générale et raisonnée). Semantic universals do exist, of course. Not only basic laws of semantic composition (Heim and Kratzer 1998), but also many details recur in language after language. For example, the classification of predicates into something like “states” versus “events,” and the interaction of this classification with such properties as quantifier interpretation, seem to be invariant (or nearly so) across the languages that have been studied. On the other hand, other facts might cause one to doubt whether all languages command exactly the same semantic resources. For example, although “multiple questions” such as “Who bought what?” receive similar interpretations in many languages (as questions whose answer provides a list of pairs; e.g., “John bought the wine and Mary bought the dessert”), this semantic possibility is entirely absent in some languages, including Italian and Irish. To native speakers of these languages, the counterparts to “Who bought what?”

References
Bhatt, R. (1995). Verb movement in Kashmiri. In R. Izvorski and V. Tredinnick, Eds., U. Penn Working Papers in Linguistics, vol. 2. Philadelphia: University of Pennsylvania Department of Linguistics.
Bickerton, D. (1981). Roots of Language. Ann Arbor, MI: Karoma.
Borer, H. (1983). Parametric Syntax. Dordrecht: Foris.
Borer, H., and K. Wexler. (1987). The maturation of syntax. In T. Roeper and E. Williams, Eds., Parameter Setting. Dordrecht: Reidel.
Chomsky, N. (1965). Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
Chomsky, N. (1966). Cartesian Linguistics. New York: Harper and Row.
Heim, I., and A. Kratzer. (1998). Semantics in Generative Grammar. Oxford: Blackwell.
Koopman, H. (1983). The Syntax of Verbs: from Verb Movement Rules in the Kru Language to Universal Grammar. Dordrecht: Foris.
Marler, P. (1991). Differences in behavioural development in closely related species: birdsong. In P. Bateson, Ed., The Development and Integration of Behaviour. Cambridge: Cambridge University Press, pp. 41–70.
Marler, P. (1996). Song Learning. http://www.hip.atr.co.jp/~bateson/hawaii/abstracts/marler_ms.html.
Mufwene, S. S. (1996). The Founder Principle in creole genesis. Diachronica 13: 83–134.
Mufwene, S. (1999). On the language bioprogram hypothesis:
(e.g., Italian *Chi a com- Hints from Tazie. In M. DeGraff, Ed., Language Creation and prato che cosa?) seem quite uninterpretable. Whether such Language Change: Creolization, Diachrony, and Development. facts indicate the existence of semantic variation, or merely Cambridge, MA: MIT Press. reveal lexical or syntactic differences with a predictable Storto, L. (1996). Verb Raising and Word Order Variation in Kari- tiana. Unpublished manuscript, Massachusetts Institute of impact on semantics, remains an open question. Technology. The fact that variation is “built into” some aspects of UG does not preclude the possibility that UG might characterize a usable “default” grammar on its own. This is also a matter Linguistics, Philosophical Issues of considerable controversy. Bickerton (1981), for example, has suggested that CREOLES represent the spontaneous flow- ering of a purely UG-based grammar, but this view is con- As with any rapidly developing science, GENERATIVE troversial (Mufwene 1996, 1999). Furthermore, a precedent GRAMMAR has given rise to a number of interesting philo- for “usable UG” is provided elsewhere in the animal king- sophical puzzles and controversies. These controversies dom by songbird species whose song is partly learned rather range from disputes about the object of study in linguistics, than totally innate. Researchers have identified a “UG” for to issues about the relation between the language faculty the song of several such species. When birds of these spe- and the external world, to questions about the legitimacy of cies are reared in isolation, they spontaneously develop a appeal to rules and representations, to questions about song that falls recognizably within the parameters of their proper empirical methodology. UG, though rudimentary in many ways (Marler 1991, 1996; One of the prominent philosophical debates in generative see also ANIMAL COMMUNICATION). 
The UG of songbirds is grammar has centered around the question of what sort of of importance for another reason. Among those who have objects languages and grammars are. Katz (1985) distin- not made a study of the relevant evidence, theories of UG guishes three general approaches to the question, roughly are often thought to require special pleading, as if the paralleling three traditional approaches to the nature of hypothesis of species-specific innate knowledge constituted abstract objects: platonism, conceptualism, and nominalism. a violation of Occam’s razor. The evidence from songbirds The platonist view would take the object of study in linguis- makes it clear that a priori objections to UG are nothing tics to be an abstract mathematical object outside of space more than a prejudice. The nature of human UG remains, and time, the conceptualist position would be a position like however, a topic of lively debate and continued research. Chomsky’s in which the object of study is a mental object of some form, and the nominalist view would hold that the See also CONNECTIONIST APPROACHES TO LANGUAGE; object of study is a corpus of inscriptions or utterances. HUMAN UNIVERSALS; INNATENESS OF LANGUAGE; LAN- The platonist view has been advanced most visibly by GUAGE AND CULTURE; LANGUAGE VARIATION AND Katz (1981), although it may be that the position rests on a CHANGE; NATIVISM confusion. For example, Higginbotham (1983) has observed —David Pesetsky that even if grammars are abstract objects, there is still the Linguistics, Philosophical Issues 479 empirical question of which grammar a particular agent is In short, the worry is this: Putnam (1975) and many OF). employing. 
George (1989) has further clarified the issue, other philosophers have held that we need referential seman- holding that we need to distinguish between a grammar, tics to characterize linguistic MEANING—that meanings which is the abstract object that we know, a psycho-gram- “ain’t in the head.” But if this is right, then it is hard to see mar, which is the cognitive state that constitutes our knowl- how semantics can be part of the language faculty, which is edge of the grammar, and a physio-grammar, which is the supposed to be individualistic (and hence “in the head”). physical manifestation of the psycho-grammar in the brain. This tension between I-language and referential seman- If this picture is right, then the platonist/conceptualist dis- tics has led commentators such as Hornstein (1984) and pute in linguistics may be trading on a failure to distinguish Chomsky (1993, 1995) to be skeptical of the possibility of a between grammars and psycho-grammars. referential semantics. However, Ludlow (forthcoming) has Perhaps more pressing is the dispute between the nomi- argued that the tension in these views is apparent only, nalist and conceptualist positions, a dispute that Chomsky because the connection between I-languages and referential (1986) has characterized as being between E-language and semantics would parallel the connection between individu- I-language conceptions of language. From the E-language alistic and relational sciences in other domains (for exam- perspective, a natural language is a kind of social object, ple, it would be similar to the connection that holds between the structure of which is purported to be established by the studies of primate physiology and primate ecology— convention (see Lewis 1975), and persons may acquire facts about physiology can shed light on the primate’s rela- varying degrees of competence in their knowledge and use tion to its environment, and vice versa). Inferences between of that social object. 
On Chomsky’s views, such objects individualistic and relational sciences are imperfect, but would be of little scientific interest if they did exist (since data from one domain can nevertheless be relevant to the they would not be “natural” objects), but in any case such other. objects don’t exist. Alternatively, an I-language is not an The idea that linguistic theory involves the investigation external object but is rather a state of an internal system of RULES AND REPRESENTATIONS (or principles and parame- that is part of our biological endowment. An agent might ters) of an internal computational system has also led to have I-language representations of English sentences, but philosophical questions about the nature of these rules and those internal representations are not to be confused with representations. For example, Quine (1970) has argued that spoken or written English sentences. They are rather data because many possible grammars may successfully describe structures in a kind of internal computational system. an agent’s linguistic behavior, there is no way in principle Chomsky understands the I-language computational sys- for us to determine which grammar an agent is using. For tem to be individualistic (see INDIVIDUALISM). That means his part, Chomsky (1980) has argued that if we consider the that the properties of the system can be specified indepen- explanatory adequacy of a grammar in addition to its dently of the environment that the agent is embedded in. descriptive adequacy, then the question of which grammar Thus, it involves properties like the agent’s rest mass and is correct is answerable in principle. That is, if we consider genetic make-up (and unlike relational properties like the that a grammar must be consistent with the theory of LAN- agent’s weight and IQ). 
GUAGE ACQUISITION, acquired language deficits, and more By itself, the dispute between I-language and E-language generally with cognitive psychology, then there are many approaches has little philosophical traction; the actual direc- constraints available to rule out competing grammatical the- tion of the field presumably settles the issue as to which is ories. the object of study. Nevertheless, some normative claims Another set of worries about rule following have stemmed have been offered. For example, Soames (1984) has sug- from Kripke’s (1982) reconstruction of arguments in Wit- gested that if we attend to the leading questions of linguis- tgenstein (1953, 1956). The idea is that there can be no brute tics in the past, then linguistics has been (and ought to be) fact about what rules and representations a system is running concerned with E-language. Of course, one might wonder apart from the intentions of the designer of the system. why past investigations should restrict the direction (and Because, when studying humans, we have no access to the leading questions) of current research. Chomsky (1993, intentions of the designer, there can be no fact of the matter 1995) not only disputes this historical story but has argued about what rules and representations underlie our linguistic that E-languages are not suitable for naturalistic inquiry, abilities. The conclusion drawn by Kripke is that “it would because they constitute artifacts rather than natural objects. seem that the use of the idea of rules and of competence in In other words, it is fine to talk about E-languages as long as linguistics needs serious reconsideration, even if these one doesn’t think one is doing science. notions are not rendered meaningless.” (1982: 31 fn 22) As we will see, the choice between these two general Chomsky (1986) appears to argue that one can know cer- approaches to language is very rich in consequences. 
Both tain facts about computers in isolation, but Chomsky’s cur- the claim that I-language is individualistic and the claim that rent position (1995) is that computers, unlike the human it is computational have led to a number of philosophical language faculty, are artifacts and hence the product of skirmishes. human intentions. The language faculty is a natural object One of the immediate questions raised by the idea that I- and embedded within human biology, so the facts about its language is individualistic has to do with the nature of structure are no more grounded in human intentions than are SEMANTICS, and in particular referential semantics—con- facts about the structure of human biology. strued as theories of the relation between linguistic forms If the language faculty is an internal computational/rep- and aspects of the external world (see REFERENCE, THEORIES resentational system, a number of questions arise about how 480 Linguistics, Philosophical Issues to best go about investigating and describing it. For exam- References ple, there has been considerable attention paid to the role of Bresnan, J., and R. Kaplan. (1982). Introduction: Grammars as formal rigor in linguistic theory. On this score, a number of mental representations of language. In J. Bresnan, Ed., The theorists (e.g., Gazdar, Klein, Pullum, and Sag 1985; Bre- Mental Representation of Grammatical Relations. Cambridge, snan and Kaplan 1982; Pullum, 1989) have argued that the MA: MIT Press, pp. xvii–lii. formal rigor of their approaches—in particular, their use of Chomsky, N. (1975). The Logical Structure of Linguistic Theory. well-defined recursive procedures—counts in their favor. New York: Plenum. However, Ludlow (1992) has argued that this sort of Chomsky, N. (1980). Rules and Representations. New York: approach to rigorization would be out of synch with the Columbia University Press. development of other sciences (and indeed, branches of Chomsky, N. (1986). Knowledge of Language. 
New York: Praeger. Chomsky, N. (1993). Explaining language use. In J. Tomberlin, mathematics) where formalization follows in the wake of Ed., Philosophical Topics 20. pp. 205–231. the advancing theory. Chomsky, N. (1995). Language and nature. Mind 104: 1–61. A second methodological issue relates to the use of PAR- Chomsky, N., and M. Halle. (1968). The Sound Pattern of English. SIMONY AND SIMPLICITY in the choice between linguistic New York: Harper and Row. theories. Although tight definitions of simplicity within a Devitt, M. (1995). Coming to Our Senses: A Naturalistic Program linguistic theory seem to be possible (see Halle 1961; for Semantic Localism. Cambridge: Cambridge University Chomsky and Halle 1968; Chomsky 1975), finding a notion Press. of simplicity that allows us to chose between two competing Devitt, M., and K. Sterelny. (1987). Language and Reality: An theoretical frameworks is another matter. Some writers Introduction to the Philosophy of Language. Cambridge, MA: (e.g., Postal 1972; Hornstein 1995) have argued that genera- MIT Press. Gazdar, G., E. Klein, G. Pullum, and I. Sag. (1985). Generalized tive semantics and the minimalist program (see MINIMAL- Phrase Structure Grammar. Cambridge, MA: Harvard Univer- ISM), respectively, are simpler than their immediate sity Press. competitors because they admit fewer levels of representa- George, A. (1989). How not to become confused about linguistics. tion. In response, Ludlow (1998) has maintained that there In A. George, Ed., Reflections on Chomsky. Oxford: Blackwell, is no objective criterion for evaluating the relative amount of pp. 90–110. theoretical machinery across linguistic theories. Ludlow Halle, M. (1961). On the role of simplicity in linguistic descrip- offers that the only plausible definition of simplicity would tion. Proceedings of Symposia in Applied Mathematics 12 be one that appealed to “simplicity of use,” suggesting that (Structure of Language and its Mathematical Aspects), pp. 
89– simplicity in linguistics may not be a feature of the object of 94. Providence: American Mathematical Society. study itself but rather our ability to easily grasp and utilize Higginbotham, J. (1983). Is grammar psychological? In L. Cau- man, I. Levi, C. Parsons, and R. Schwartz, Eds., How Many certain kinds of theories. Questions: Essays In Honor of Sydney Morgenbesser. Indianap- Finally, there is the matter of the nature of evidence olis, IN: Hackett. available for investigating the language faculty. Evidence Hornstein, N. (1984). Logic as Grammar. Cambridge, MA: MIT from a written or spoken corpus is at best twice removed Press. from the actual object of investigation, and given the pos- Hornstein, N. (1995). Logical Form: From GB to Minimalism. sibility of performance errors, is notoriously unreliable at Oxford: Blackwell. that. Much of the evidence adduced in linguistic theory has Katz, J. (1981). Language and Other Abstract Objects. Totowa, therefore been from speakers’ intuitions of acceptability, NJ: Rowman and Littlefield. as well as intuitions about possible interpretations. This Katz, J., Ed. (1985). The Philosophy of Linguistics. Oxford: raises a number of interesting questions about the reliabil- Oxford University Press. Kripke, S. (1982). Wittgenstein on Rules and Private Language. ity of introspective data (see INTROSPECTION) and the kind Cambridge: Harvard University Press. of training required to have reliable judgements. There is Lewis, D. (1975). Language and languages. In K. Gunderson, Ed., also the question of why we should have introspective Language, Mind, and Knowledge. Minneapolis: University of access to the language faculty at all. It is fair to say that Minnesota Press, pp. 3–35. these questions have not been adequately explored to date Ludlow, P. (1992). Formal Rigor and Linguistic Theory. Natural (except in a critical vein; see Devitt 1995; Devitt and Ster- Language and Linguistic Theory 10: 335–344. elny 1987). Ludlow, P. (1998). 
Simplicity and generative grammar. In R. Stain- Katz (1985: Introduction) offers that the philosophy of ton and K. Murasugi, Eds., Philosophy and Linguistics. Boul- linguistics could soon emerge as a domain of inquiry in its der, CO: Westview Press. own right, on the model of the philosophy of physics and Ludlow, P. (Forthcoming). Referential semantics for I-languages? In N. Hornstein and L. Antony, Eds., Chomsky and His Critics. the philosophy of biology. Given the number of interesting Oxford: Blackwell. questions and disputes that have arisen in the interim, it is Postal, P. (1972). The best theory. In S. Peters, Ed., Goals of Linguis- fair to say that Katz’s prediction is coming true. The issues tic Theory. Englewood Cliffs, NJ: Prentice-Hall, pp. 131–179. canvassed above provide a mere sketch of the current topics Pullum, G. (1989). Formal linguistics meets the boojum. Natural under discussion and point to a rich field of investigation in Language and Linguistic Theory 7: 137–143. the years to come. Putnam, H. (1975). The meaning of meaning. In K. Gunderson, See also FREGE; INNATENESS OF LANGUAGE; LOGICAL Ed., Language, Mind, and Knowledge. Minneapolis: University FORM, ORIGINS OF; SENSE AND REFERENCE of Minnesota Press, pp. 131–193. Quine, W. V. O. (1970). Methodological reflections on current lin- —Peter Ludlow guistic theory. Synthese 21: 368–398. Literacy 481 On this view, the properties of speech available for INTRO- Soames, S. (1984). Linguistics and psychology. Linguistics and Philosophy 81: 155–179. SPECTION, such as words, sentences, syllables, and phone- Wittgenstein, L. (1953). Philosophical Investigations. Translated mic segments, are the consequence of literacy, of applying by G. E. M. Anscombe. New York: MacMillan. written models to speech. A major problem in learning to Wittgenstein, L. (1956). Remarks on the Foundations of Mathe- read is learning to “hear” speech in a new way, that is, in a matics. Translated by G. E. M. Anscombe. 
Cambridge, MA: way compatible with the items—words and letters—com- MIT Press. posing the script. Studies of nonliterate adults’ (Morais, Alegria, and Content 1987) and of prereading children’s Literacy beliefs about writing (Ferreiro and Teberosky 1982) as well as the vast literature on metalinguistic awareness (Goswami Literacy is competence with a written language, a script. and Bryant 1990) tend to support this view. This competence includes not only an individual’s ability to Writing and the mind Although the mind as a biologi- read and write a script but also one’s access to and compe- cal organ is common to all humans, mind as a conscious, tence with the documentary resources of a literate society. conceptual system is in part the product of culture. In a Literacy holds a prominent place in the political goals of modern bureaucratic culture, mind is closely linked to lit- both developed and developing nations as manifest in uni- eracy but just how remains the subject of research and the- versal, compulsory education where literacy is seen as a ory. Although the theory linking forms of writing with means to personal, social, and economic fulfillment. Liter- levels of culture and thought is to be found in such eigh- acy is a more general concept than READING and writing, teenth-century writers as Vico and Condorcet, modern including not only competence with and uses of reading and theory is more clearly traced to Levy-Bruhl’s theory of writing but also the roles that reading and writing play in the “primitive thought,” now widely criticized, and the theo- formation and accumulation of the procedures, laws, and ries that first appeared in the 1960s by Goody (1968), texts that serve as the primary embodiment of historical cul- McLuhan (1962), Havelock (1963), and later Ong (1982), ture. 
Literate, bureaucratic, or “document” societies are which contrasted “orality” and “literacy” both as modes of those in which such archival texts and documents play a communication and modes of thought. Writing allowed, it central and authoritative role. Such societies depend on was argued, a particular form of study and contemplation, highly literate specialists. the formation of logics and dictionaries, a focus on “ver- Writing and communication Writing has obvious batim” interpretation and memorization with an interpre- advantages over speech for communication across space and tive bias to literalism. Although the increasing and through time, factors which various media, including the pervasive reliance on written records and other written book, the printing press, the telegraph, and computer tech- documents in many societies is undeniable (Clanchy nologies, exploit and extend in various ways. Writing 1992; Thomas 1992), the relations between “orality” and played an essential role in the formation and operation of “literacy” continue to be debated. CULTURAL PSYCHOL- the first large-scale societies, whether as cities, nations, or OGY attempts to understand the cognitive implications of empires in ancient China, Sumer, Egypt, and Mesoamerica, such developments. Although writing never replaces where it played a critical role in record keeping (Nissen, speaking but rather preserves aspects of speech and other Damerow, and Englund 1993), codification and publication forms of information as permanent visible artifacts, these of law (Harris 1989), the development of literature (Have- literate artifacts may in turn alter the very linguistic and lock 1963), and the accumulation of knowledge whether as conceptual practices of a social group, activities that blur, history or science (Eisenstein 1969). almost to the point of obliterating, the distinction between Writing and representation Not only does writing alter orality and literacy. 
patterns of communication, written texts and commentaries Literate thought Although mind reflects as well as on texts build up a tradition of scholarship. Such accumula- invents culture and although human competence must be tions tend to lose their connections with personal authorship analyzed in terms of the available technologies (Clark 1996: and may come to be treated as objects in their own rights, as 61), the technology of greatest importance for understand- Scripture, as Law, or as Science. Consequently, writing ing conceptual and intellectual advance in the arts and sci- comes to serve as a mode of representation of what is taken ences is the invention of writing and other notational as “known.” Three aspects of this problem have been taken systems (Donald 1991; Olson 1994). up in the cognitive sciences: the relation between speech Conceptual development in children is, in part, the con- and writing, the acquisition of literacy, and the effects of lit- sequence of the acquisition of these systems for represent- erate representations on the formation of mind. ing thought (VYGOTSKY 1986). Furthermore, writing is Speech and writing Although scripts are not designed instrumental to thinking in general as a form of metalinguis- according to fixed principles, WRITING SYSTEMS may be tic knowledge—that is, knowledge about the lexical, gram- classified according to type, each type bearing a particular matical, and logical properties of the language. Vocabulary relation to the structure of speech. Each type of script, con- knowledge, for example, is greatly extended by reading sequently, requires a reader to carve up the stream of speech (Anglin 1993; Anderson 1985), and reflective knowledge in a distinctive, graphically determined way. The problem about words serves as a major aspect of measured intelli- for the learner is to analyze, that is, conceive of, oral speech gence in a literate society (Stanovich 1986). 
in terms of the categories offered by the script (Shankweiler Literacy and social development Because literacy plays and Liberman 1972; Faber 1992; Harris 1986; Olson 1994). such a prominent role in modern societies, it is often 482 Local Representation assumed that the route to social development is through Stanovich, K. E. (1986). Matthew effects in reading: some conse- quences of individual differences in the acquisition of literacy. teaching people to read and write (UNESCO 1985). Current Reading Research Quarterly 21: 360–407. research and practice has shown that in order to bring about Street, B. (1985). Literacy in Theory and Practice. Cambridge: cultural and social transformation, literacy must be seen as Cambridge University Press. an activity embedded in social and cultural practice. Liter- Thomas, R. (1992). Literacy and Orality in Ancient Greece. Cam- acy, bureaucratic institutional structures with explicit proce- bridge: Cambridge University Press. dures and accountability, and democratic participation are UNESCO. (1985). The Current Literacy Situation in the World. mutually reinforcing. Rather than being seen simply as a Paris: UNESCO. goal, literacy has come to be seen as a means to fuller par- Vygotsky, L. (1986). Thought and Language, A. Kozulin, Ed. ticipation in the institutions of the society, whether in law, Cambridge, MA: MIT Press. science, or literature (Street 1985) as well as a means for Further Readings their transformation. See also LANGUAGE AND COMMUNICATION; LANGUAGE Bruner, J. S. (1996). The Culture of Education. Cambridge, MA: AND CULTURE; LEXICON; NUMERACY AND CULTURE Harvard University Press. Cole, M. (1996). Cultural Psychology. Cambridge, MA: Harvard —David Olson University Press. Condorcet, M. de. (1802). Outlines of an Historical View of the References Progress of the Human Mind, Being a Posthumous Work of the Late M. De Condorcet. Baltimore, MD: G. Fryer. Anderson, R. C. (1985). 
Becoming a Nation of Readers: The Report of the Commission on Reading. Pittsburgh, PA: National Academy of Education.
Anglin, J. M. (1993). Vocabulary development: a morphological analysis. Monographs of the Society for Research in Child Development 58, no. 10: 1–66.
Clanchy, M. (1992). From Memory to Written Record. Oxford: Blackwell.
Clark, A. (1996). Being There: Putting Brain, Body, and World Together Again. Cambridge, MA: Bradford/MIT Press.
Donald, M. (1991). Origins of the Modern Mind. Cambridge, MA: Harvard University Press.
Eisenstein, E. (1979). The Printing Press as an Agent of Change. Cambridge: Cambridge University Press.
Faber, A. (1992). Phonemic segmentation as epiphenomenon: evidence from the history of alphabetic writing. In P. Dowling, S. D. Lima, and M. Noonan, Eds., The Linguistics of Literacy. Amsterdam: John Benjamins.
Ferreiro, E., and A. Teberosky. (1982). Literacy before Schooling. Exeter, NH: Heinemann.
Goody, J. (1968). Literacy in Traditional Societies. Cambridge: Cambridge University Press.
Goswami, U., and P. Bryant. (1990). Phonological Skills and Learning to Read. Hillsdale, NJ: Erlbaum.
Harris, R. (1986). The Origin of Writing. London: Duckworth.
Harris, W. V. (1989). Ancient Literacy. Cambridge, MA: Harvard University Press.
Havelock, E. (1963). Preface to Plato. Cambridge: Cambridge University Press.
McLuhan, M. (1962). The Gutenberg Galaxy. Toronto: University of Toronto Press.
Morais, J., J. Alegria, and A. Content. (1987). The relationships between segmental analysis and alphabetic literacy: an interactive view. Cahiers de Psychologie Cognitive 7: 415–538.
Nissen, H. J., P. Damerow, and R. K. Englund. (1993). Archaic Bookkeeping: Early Writing and Techniques of Economic Administration in the Ancient Near East. Chicago: University of Chicago Press.
Olson, D. R. (1994). The World on Paper. Cambridge: Cambridge University Press.
Ong, W. (1982). Orality and Literacy: The Technologizing of the Word. London: Methuen.
Shankweiler, D., and I. Liberman. (1972). Misreading: a search for causes. In J. Kavanaugh and I. Mattingly, Eds., Language by Ear and Language by Eye: The Relationships between Speech and Reading. Cambridge, MA: MIT Press, pp. 293–317.

Karmiloff-Smith, A. (1992). Beyond Modularity: A Developmental Perspective on Cognitive Science. Cambridge, MA: MIT Press.
Levy-Bruhl, L. (1923). Primitive Mentality. London: George Allen and Unwin.
Nelson, K. (1996). Language in Cognitive Development: The Emergence of the Mediated Mind. Cambridge: Cambridge University Press.
Vico, G. (1744/1984). The New Science of Giambattista Vico, T. Bergin and M. Fish, Eds. Ithaca, NY: Cornell University Press.

Local Representation

See CONNECTIONISM, PHILOSOPHICAL ISSUES; DISTRIBUTED VS LOCAL REPRESENTATION; KNOWLEDGE REPRESENTATION

Logic

    All the children in Lake Woebegone are above average.
    Most people in Lake Woebegone are children.
    Therefore, most people in Lake Woebegone are above average.

No matter what “Lake Woebegone” refers to, what time is being talked about, exactly what kind of majority “most” refers to, or what sense of “above average” is meant, this little proof is valid, as long as the meaning of these terms is held constant. If we replace “most” by the determiner “all,” “some,” or “at least three,” the proof remains valid, but if we replace it by “no” or “few,” the proof becomes invalid. As the science of reasoning, logic attempts to understand such phenomena.

It is instructive to compare logic with linguistics, the science of language. Reasoning and using language have a number of properties in common. Both characterize human cognitive abilities. Both exhibit INTENTIONALITY: they refer to objects, events, and other situations typically outside the skin of the agent. And both involve an interaction of SYNTAX, SEMANTICS, and PRAGMATICS. Given these similarities, one might expect logic and linguistics to occupy similar positions vis-à-vis cognitive science, but while linguistics is usually considered a branch of cognitive science, logic is not. To understand why, we must recognize that, apart from the properties noted above, logic and linguistics are strikingly dissimilar. What constitutes a proper sentence varies from language to language. Linguistics looks for what is common to the world’s many languages as a way to say something about the nature of the human capacity for language. By contrast, what constitutes a proper (i.e., valid) piece of reasoning is thought to be universal. Twentieth-century logic holds that no matter what language our sample argument is couched in, it will remain valid, not because of some cognitive property of human beings, but because valid reasoning is independent of how it was discovered, produced, or expressed in natural language.

Since antiquity Euclidean geometry, with its system of postulates and proofs, was taken as the shining example of a logical edifice, having its foundations in what we would now call “cognitive science.” Under the influence of KANT, nineteenth-century mathematicians and philosophers had assumed that the truth of Euclid’s postulates was built into human perceptual abilities, and that methods of proof embodied laws of thought. For example, the great mathematical logician David Hilbert (1928, p. 475) wrote, “The fundamental idea of my proof theory is none other than to describe the activity of our understanding, to make a protocol of the rules according to which our thinking actually proceeds.”

Given this historical relationship between logic, geometry, and cognition, it is not surprising that logic was profoundly influenced by the discovery of non-Euclidean geometries, which challenged the Kantian view, and which brought with them both a deep distrust of psychology and an urgent need to understand the differences between valid and invalid reasoning. What were the valid principles of reasoning? What made a principle of reasoning valid or invalid? Refocusing on such normative questions led logicians, following Gödel and Tarski in the mid-twentieth century, to the semantic aspects of logic (truth, reference, and meaning), whose relationships must be honored in valid reasoning. In particular, a valid proof clearly demonstrates that whenever the premises of an argument are true, its conclusion is also true. That is why our sample argument is valid, not because of cognitive abilities of humans, but because, if its premises are true, so is its conclusion.

Cognitive science, on the other hand, is concerned with mechanism, with how humans reason. Why do they make the reasoning errors they do? Systematic reasoning errors are at least as interesting as valid reasoning if one is looking for clues as to how people reason. Why are some inferences harder than others? For example, people seem to take longer on average to process the inference in our original sample argument than they do the one in the variant where “most” is replaced by “some” or “at least three,” even though all three versions are valid. (This claim is based on informal surveys; I know of no careful work on this question.)

The logician’s distrust of psychological aspects of reasoning has led to a de facto division of labor. The relationship between mind and representation is considered the subject matter of psychology, that between representation and the world the subject matter of logic. (There is no such division of labor involving linguistics since it has long been interested in mechanism.) This division has resulted in a distance between the fields of logic and cognitive science. Still, ideas and results from logic have had a profound influence on cognitive science.

Late-nineteenth- and early-twentieth-century logicians (e.g., Hilbert, FREGE, Russell, and Gentzen, before the shift to semantics noted above) developed FORMAL SYSTEMS, mathematical models of reasoning based on the syntactic manipulation of sentencelike representations. The import of “formal” in this context is that the acceptability of an inference step should be a function solely of the shape or “form” of the representations, independent of what they mean. Within linguistics, this has led to the view that a sentence has an underlying LOGICAL FORM that represents its meaning, and that reasoning involves computations over logical forms.

One might postulate that the logical forms involved in our sample argument are something like

    All C are A
    Most P are C
    Most P are A

(Early work did not treat determiners such as “most” and “few” at all, only “every,” “some,” and others that could be defined in terms of them.) In this view, recognizing the validity of the argument would be a matter of computing the logical forms of the natural language sentences and then recognizing the validity of the inference in terms of these forms (see, for example, Rips 1994).

As models of valid reasoning, formal systems have important uses within mathematical logic and computer science, but as models of human performance, they have been frequently criticized for their poor predictions of successes, errors, and difficulties in human reasoning. Johnson-Laird and Byrne (1991) have argued that postulating more imagelike MENTAL MODELS makes better predictions about the way people actually reason. Their proposal, applied to our sample argument, might well help to explain the difference in difficulty in the various inferences mentioned earlier, because it is easier to visualize “some people” and “at least three people” than it is to visualize “most people.” Cognitive scientists have recently been exploring computational models of reasoning with diagrams. Logicians, with the notable exceptions of Euler, Venn, and Peirce, have until the past decade paid scant attention to spatial forms of representation, but this is beginning to change (Hammer 1995).

In the 1930s, Alan TURING, a pioneer in computability theory, developed his famous machine model of the way people carry out routine computations using symbols, a model exploited in the design of modern-day digital computers (Turing 1936). Combining Turing’s machines and the formal system model of reasoning, cognitive scientists (e.g., Fodor 1975) have proposed formal symbol processing as a metaphor for all cognitive activity: the so-called COMPUTATIONAL THEORY OF MIND. Indeed, some cognitive scientists go so far as to define cognitive science in terms of this metaphor. This suggestion has played a very large role in cognitive science, some would say a defining role, and it is implicit in Turing’s original work. Still, the idea is highly controversial; connectionists, for example, reject it.

A third contribution of logic to cognitive science arose from research in logic on semantics. Most famously, the logician Montague (1974), borrowing ideas from modern logic, developed the first serious account of the semantics of natural languages; known as “Montague grammar,” it has proven quite fruitful. One successful development in this area has been the use of generalized QUANTIFIERS to interpret natural language determiners and their associated noun phrases (Barwise and Cooper 1981). The meaning of each determiner is modeled by a binary relation between sets; the relations themselves have very different properties, properties that can be used in accounting for associated logical and processing differences.

Our final application has to do with cognitive interpretations of the first of GÖDEL’S THEOREMS. This theorem, one of the most striking achievements of logic, demonstrates strict limitations on what can be done using formal systems and hence digital computers. Various writers, most famously Penrose (1991), have attempted to use Gödel’s first theorem to argue that because there are things people can do that computers in principle cannot do, the formal systems of logic are irrelevant to understanding human cognition, although this argument is very controversial (see, for example, Feferman 1996).

If it is to be the science of full-fledged reasoning, logic still has much to accomplish. What features might this more complete logic have? The logician C. S. Peirce suggested that the relationship between mind, language, and the world was irreducibly ternary, that one could not give an adequate account of the binary relation between mind and language, or between language and the world, without giving an account of the relationship among all three. According to this view, the division of labor depicted above, and with it the divorce of logic from cognition, is misguided. Peirce’s thinking has been reincarnated in the situated cognition movement, which argues that any adequate cognitive theory must take account of the agent’s physical embeddedness in its environment and its exploitation of regularities in that environment (see SITUATEDNESS/EMBEDDEDNESS).

Situatedness infects reasoning and logic (Barwise 1987). The ease or difficulty of an inference, for example, depends on the agent’s context in many ways. Even the validity of an inference is in a limited sense a situated matter because validity depends not just on the sentences used, but on how they are used by the agent. This arises in our sample argument in the requirement that the meaning of the terms be held constant. The way the agent is situated in the world in part determines whether this caveat is satisfied. For example, if the agent uttered the two premises in different years, the conclusion would not follow.

Logic has had a profound impact on cognitive science, as the above examples show. The impact in the other direction has been less than one might have expected, due to the distrust of cognitive aspects of reasoning by the logic community. One hopes that the synergy between the two fields will be greater in years to come.

See also CAUSAL REASONING; DEDUCTIVE REASONING; INDEXICALS AND DEMONSTRATIVES; LANGUAGE OF THOUGHT; LOGIC PROGRAMMING; LOGICAL OMNISCIENCE, PROBLEM OF; POSSIBLE WORLDS SEMANTICS

—K. Jon Barwise

References

Barwise, J. (1987). The Situation in Logic. Stanford, CA: CSLI Publications.
Barwise, J., and R. Cooper. (1981). Generalized quantifiers and natural language. Linguistics and Philosophy 4: 137–154.
Feferman, S. (1996). Penrose’s Goedelian argument. Psyche 2: 21–32.
Fodor, J. A. (1975). The Language of Thought. New York: Crowell.
Glasgow, J. I., N. H. Narayan, and B. Chandrasekaran, Eds. (1991). Diagrammatic Reasoning: Cognitive and Computational Perspectives. Cambridge, MA: AAAI/MIT Press.
Hammer, E. (1995). Logic and Visual Information. Studies in Logic, Language, and Computation. Stanford, CA: CSLI Publications.
Hilbert, D. (1928/1967). Die Grundlagen der Mathematik. Eng. title (The foundations of mathematics). Abhandlungen aus dem mathematischen Seminar der Hamburgischen Universität 6: 65–85. In J. van Heijenoort, Ed., From Frege to Gödel. Cambridge, MA: Harvard University Press, pp. 464–479.
Johnson-Laird, P. N., and R. Byrne. (1991). Deduction. Essays in Cognitive Psychology. Mahwah, NJ: Erlbaum.
Montague, R. (1974). Formal Philosophy: Selected Papers of Richard Montague. Edited with an introduction by Richmond Thomason. New Haven: Yale University Press.
Penrose, R. (1991). The Emperor’s New Mind: Concerning Computers, Minds, and the Laws of Physics. New York: Penguin.
Rips, L. (1994). The Psychology of Proof. Cambridge, MA: MIT Press.
Turing, A. (1936). On computable numbers, with an application to the Entscheidungsproblem. Proc. London Math. Soc. Series 2, 42: 230–265.

Further Readings

Barwise, J., Ed. (1977). Handbook of Mathematical Logic. Amsterdam: Elsevier.
Barwise, J., and J. Etchemendy. (1993). The Language of First-Order Logic. 3rd ed. Stanford, CA: CSLI Publications.
Barwise, J., and J. Perry. (1983). Situations and Attitudes. Cambridge, MA: MIT Press.
Devlin, K. (1991). Logic and Information. Cambridge: Cambridge University Press.
Gabbay, D., and F. Guenthner, Eds. (1983). Handbook of Philosophical Logic, 4 vols. Dordrecht: Kluwer.
Gabbay, D., Ed. (1994). What is a Logical System? Studies in Logic and Computation. Oxford: Oxford University Press.
Haugeland, J. (1985). Artificial Intelligence: The Very Idea. Cambridge, MA: MIT Press.
Stenning, K., and P. Yule. (1997). Image and language in human reasoning: A syllogistic illustration. Cognitive Psychology 34: 109–159.
Van Benthem, J., and A. ter Meulen, Eds. (1997). Handbook of Logic and Language. Amsterdam: Elsevier.

Logic Programming

Logic programming is the use of LOGIC to represent programs and of deduction to execute programs in LOGICAL FORM. To this end, many different forms of logic and many varieties of deduction have been investigated. The simplest form of logic, which is the basis of most logic programming, is the Horn clause subset of logic, where programs consist of sets of implications:

    A0 if A1 and A2 and ... and An.

Here each Ai is a simple (i.e., atomic) sentence.

Because deduction by backward reasoning interprets such implications as procedures, it is usual to write them backward: to solve the goal A0, solve the subgoals A1 and A2 and . . . and An. The number of conditions, n, can be 0, in which case the implication is simply a fact, which behaves as a procedure which solves goals directly without introducing subgoals.

The procedural interpretation of implications can be used for declarative programming, where the programmer describes information in logical form and the deduction mechanism uses backward reasoning to achieve problem solving behavior. Consider, for example, the implication

    X is a citizen of the United States
    if X was born in the United States.

Backward reasoning treats the sentence as a procedure:

    To show X is a citizen of the United States,
    show that X was born in the United States.

In fact, backward reasoning can be used, not only to show a particular person is a citizen, but also to find people who are citizens by virtue of having been born in the United States.

Logic programs are nondeterministic in the sense that many different procedures might apply to the same goal. For example, naturalization and being the child of a citizen provide alternative procedures for obtaining citizenship. The nondeterministic exploration of alternative procedures, to find one or more which solve a given goal, can be performed by many different search strategies. In the logic programming language Prolog, search is performed depth-first, trying procedures one at a time, in the order in which they are written, backtracking in case of failure.

Declarative programming is an ideal. The programmer specifies what the problem is and what knowledge should be used in solving problems, and the computer determines how the knowledge should be used. The declarative programming ideal works in some cases where the knowledge has a particularly simple form. But it fails in many others, where nondeterminism leads to an excessively inefficient search. This failure has led many programmers to reject logic programming in favor of conventional, imperative programming languages.

The following example shows the kind of problem that can arise with declarative programming:

    There is a path from X to Y if there is a step from X to Y.
    There is a path from X to Y if there is a path from X to Z
    and there is a path from Z to Y.

Given no other information and the goal of showing whether there is a path from a node a to a node b, Prolog fails to find a step from a to b using the first procedure. It therefore tries to find a path from a to some Z using the second procedure. There is no step from a to any Z using the first procedure. So Prolog tries to find a path from a to some Z' using the second procedure. Continuing in this way, it goes into an infinite loop, looking for paths from a to Z, to Z' to Z'', . . . .

The alternative to rejecting logic programming because of such problems, or to restricting it to niche applications, is for the programmer to take responsibility for both the declarative correctness and the procedural efficiency of programs. The following is such a correct and efficient logic program for the path-finding problem. It incrementally constructs a path of nodes already visited and ensures that no step is taken that revisits a node already in the path. The goal of showing there is a path from X to Y is reformulated as the goal of extending the path consisting of the single node X to a path ending at Y. For simplicity, a path is regarded as a trivial extension of itself.

    The path P can be extended to a path ending at Y
    if P ends at Y.
    The path P can be extended to a path ending at Y
    if P ends at X and there is a step from X to Z
    and Z is not in P and P' extends P by adding Z to the end of P
    and the path P' can be extended to a path ending at Y.

It is usual to interpret the negation in a condition, such as “Z is not in P” above, as negation by failure. A subgoal of the form “not A” is deemed to hold if and only if the positive subgoal “A” cannot be shown to hold. Programs containing such negative conditions, extending the Horn clause subset of logic programming, are called “normal logic programs.”

The use of negation as failure renders logic programming a NONMONOTONIC LOGIC, where addition of new information may cause a previously justified conclusion to be withdrawn. A simple example is the sentence

    X is innocent if not X is guilty.

The condition “not X is guilty” is interpreted as “it cannot be shown that X is guilty.”

Many extensions of normal logic programming have been investigated. Among the most important of these is the extension that includes METAREASONING. For example, the following implication is a fragment of a metalevel logic program that can be used to reason about the interval of time for which a conclusion holds:

    “R” holds for interval I
    if “R if S” holds for interval I1
    and “S” holds for interval I2
    and I is the intersection of I1 and I2.

Similar metalevel programs are used to construct explanations and to implement resource-bounded reasoning and reasoning with UNCERTAINTY.

Among the other extensions of logic programming being investigated are extensions to incorporate constraint processing, a second “strong” form of negation, disjunctive conclusions, and abductive reasoning. Methods are being developed both to execute programs efficiently and to transform inefficient programs into more efficient ones. Applications range from natural language processing and legal reasoning to commercial knowledge management systems and parts of the Windows NT operating system.

See also BOUNDED RATIONALITY; CONSTRAINT SATISFACTION; DEDUCTIVE REASONING; INDUCTIVE LOGIC PROGRAMMING; LOGICAL REASONING SYSTEMS; SITUATION CALCULUS

—Robert Kowalski

Further Readings

Apt, K., and F. Turini, Eds. (1995). Meta-Logics and Logic Programming. Cambridge, MA: MIT Press.
Bratko, I. (1988). Prolog Programming for Artificial Intelligence. Reading, MA: Addison-Wesley.
Clark, K. L. (1978). Negation by failure. In H. Gallaire and J. Minker, Eds., Logic and Databases. New York: Plenum Press, pp. 293–322.
Colmerauer, A., H. Kanoui, R. Pasero, and P. Roussel. (1973). Un système de communication homme-machine en français. Research report, Groupe d’Intelligence Artificielle. Université d’Aix-Marseilles II, Luminy, France.
Flach, P. (1994). Simply Logical: Intelligent Reasoning by Example. New York: Wiley.
Gabbay, D., C. Hogger, and J. A. Robinson. (1993). Handbook of Logic in Artificial Intelligence and Logic Programming. Vol. 1, Logic Foundations. Oxford: Clarendon Press.
Gabbay, D., C. Hogger, and J. A. Robinson. (1997). Handbook of Logic in Artificial Intelligence and Logic Programming. Vol. 5, Logic Programming. Oxford: Clarendon Press.
Gillies, D. (1996). Artificial Intelligence and Scientific Method. New York: Oxford University Press.
Kowalski, R. (1974). Predicate logic as programming language. In Proceedings IFIP Congress, Stockholm. Amsterdam: Elsevier, pp. 569–574.
Kowalski, R. (1979). Logic for Problem Solving. Amsterdam: Elsevier.
Kowalski, R. (1992). Legislation as logic programs. In G. Comyn, N. E. Fuchs, and M. J. Ratcliffe, Eds., Logic Programming in Action. New York: Springer, pp. 203–230.
Lloyd, J. W. (1987). Foundations of Logic Programming. 2nd ed. New York: Springer.

Logical Form in Linguistics

The logical form of a sentence (or utterance) is a formal representation of its logical structure, that is, of the structure that is relevant to specifying its logical role and properties. There are a number of (interrelated) reasons for giving a rendering of a sentence’s logical form. Among them are to obtain proper inferences (which otherwise would not follow; cf. Russell’s theory of descriptions), to give the proper form for the determination of truth-conditions (e.g., Tarski’s method of truth and satisfaction as applied to quantification), to show those aspects of a sentence’s meaning that follow from the logical role of certain terms (and not from the lexical meaning of words; cf. the truth-functional account of conjunction), and to formalize or regiment the language in order to show that it has certain metalogical properties (e.g., that it is free of paradox or that there is a sound proof procedure).

Logical analysis, that is, the specification of logical forms for sentences of a language, presumes that some distinction is to be made between the grammatical form of sentences and their logical form. In LOGIC, of course, there is no such distinction to be drawn. By design, the grammatical form of a sentence specified by the syntax of, for instance, first-order predicate logic simply is its logical form. In the case of natural language, however, the received wisdom of the tradition of FREGE, Russell, Wittgenstein, Tarski, Carnap, Quine, and others has been that on the whole, grammatical form and logical form cannot be identified; indeed, their nonidentity has often been given as the raison d’être for logical analysis. Natural languages have been held to be insufficiently specified in their grammatical form to reveal directly their logical form, and it has been held that no mere paraphrase within the language would be sufficient to do so. This led to the view that, as far as natural languages were concerned, logical analysis was a matter of rendering sentences of the language in some antecedently defined logical (or formal) language, where the relation between the sentences in the languages is to be specified by some sort of contextual definition or rules of translation.

In contemporary linguistic theory, there has been a continuation of this view in work inspired largely by Montague (especially, Montague 1974). In large part because of technical developments in both logic (primarily in the areas of type theories and POSSIBLE WORLDS SEMANTICS) and linguistics (with respect to categorial rule systems), this approach has substantially extended the range of phenomena that could be treated by translation into an interpreted logical language, shedding the pessimism of prior views as to how systematically techniques of logical analysis can be formally applied to natural language (see Partee 1975; Dowty, Wall, and Peters 1981; Cooper 1983).

Within linguistic theory, however, the term “logical form” has been much more closely identified with a different view that takes natural language to be in an important sense logical, in that grammatical form can be identified with logical form. The hallmark of this view is that the derivation of logical forms is continuous with the derivation of other syntactic representations of a sentence. As this idea was developed initially by Chomsky and May (with precursors in generative semantics), the levels of syntactic representation included Deep Structure, Surface Structure, and Logical Form (LF), with LF (the set of syntactic structures constituting the “logical forms” of the language) derived from Surface Structure by the same sorts of transformational rules that derived Surface Structure from Deep Structure.

As with other approaches to logical form, quantification provides a central illustration. This is because, since Frege, it has been generally accepted that the treatment of quantification requires a “transformation” of a sentence’s surface form. On the LF approach, it was hypothesized (originally in May 1977) that the syntax of natural languages contains a rule (QR, for Quantifier Raising) that derives representations at LF for sentences containing quantifier phrases, functioning syntactically essentially as does WH-MOVEMENT (the rule that derives the structure of “What did Max read?”).
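As a rough illustration of what repeated application of a raising rule of this kind produces, the sketch below takes a clause given as a list of constituents, replaces each quantifier phrase with an indexed trace, and adjoins the phrases to the front of the clause in every possible order. It is only a toy under stated assumptions: the function name `lf_readings`, the list-of-constituents encoding, and the plain-digit indices are hypothetical conveniences, not May's formalization, and the sketch generates every ordering without modeling any of the syntactic constraints that the LF literature places on available scopes.

```python
from itertools import permutations


def lf_readings(constituents, qp_positions):
    """Return one bracketed LF-style string per scope ordering.

    constituents  -- surface-order list of phrases (hypothetical encoding)
    qp_positions  -- positions in that list that hold quantifier phrases
    """
    positions = sorted(qp_positions)
    # Index the quantifier phrases 1, 2, ... in surface order.
    index = {pos: i + 1 for i, pos in enumerate(positions)}
    # The innermost clause: every quantifier phrase is replaced by its trace.
    core = " ".join(
        f"t{index[pos]}" if pos in index else phrase
        for pos, phrase in enumerate(constituents)
    )
    readings = []
    for order in permutations(positions):
        lf = f"[ {core} ]"
        # Adjoin from the inside out, so the first phrase in `order`
        # ends up outermost (widest scope).
        for pos in reversed(order):
            lf = f"[ {constituents[pos]}{index[pos]} {lf} ]"
        readings.append(lf)
    return readings


if __name__ == "__main__":
    clause = ["every linguist", "has read", "some book by Chomsky"]
    for reading in lf_readings(clause, {0, 2}):
        print(reading)
```

With two quantifier phrases the function returns two strings that differ only in which phrase is outermost, which is the sense in which iterating the rule yields distinct scope orderings; with a single quantifier phrase it returns one string with a single trace.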
By QR, (1) is derived as the representation of “Every linguist has read Syntactic Structures” at LF, and because QR may iterate, the representations in (2) are derived for “Every linguist has read some book by Chomsky”.

    (1) [ every linguist1 [ t1 has read Syntactic Structures]]

    (2) a. [ every linguist1 [ some book by Chomsky2 [ t1 has read t2]]]
        b. [ some book by Chomsky2 [ every linguist1 [ t1 has read t2]]]

With the aid of the syntactic notions of “trace of movement” (t1, t2) and “c-command” (both of which are independently necessary within syntactic theory), the logically significant distinctions of open and closed sentence, and of relative scope of quantifiers, can be easily defined with respect to the sort of representations in (1) and (2). Interpreting the trace in (1) as a variable, “t1 has read Syntactic Structures” stands as an open sentence, falling within the scope of the c-commanding quantifier phrase “every linguist1”; similar remarks hold for (2), except that (2a) and (2b) can be recognized as representing distinct scope orderings of the quantifiers (see Heim 1982; May 1985, 1989; Hornstein and Weinberg 1990; Fox 1995; Beghelli and Stowell 1997; and Reinhart 1997 for further discussion of the treatment of quantification within the LF approach).

A wide range of arguments have been made for the LF approach to logical form. Illustrative of the sort of argument presented is the argument from antecedent-contained deletion (May 1985). A sentence such as “Dulles suspected everyone that Angleton did” has a verb phrase elided (its position is marked by the pro-form “did”). If, however, the ellipsis is to be “reconstructed” on the basis of the surface form, the result will be a structural regress, as the “antecedent” verb phrase, “suspected everyone that Angleton did,” itself contains the ellipsis site. However, if the reconstruction is defined with respect to a structure derived by QR:

    (3) everyone that Angleton did [Dulles suspected t],

the antecedent is now the VP “suspected t,” obtaining, properly, an LF-representation comparable in form to that which would result if there had been no deletion:

    (4) everyone that Angleton suspected t [Dulles suspected t].

Among other well-known arguments for LF are weak crossover (Chomsky 1976), the interaction of quantifier scope and bound variable anaphora (Higginbotham 1980; Higginbotham and May 1981), superiority effects with multiple wh-constructions (Aoun, Hornstein, and Sportiche 1981), and wh-complementation in languages without overt wh-movement (Huang 1982). Over the past two decades, there has been active discussion in linguistic theory of the precise nature of representations at LF, in particular with respect to the representation of binding (see BINDING THEORY) as this pertains to quantification and ANAPHORA, and of the semantic interpretation of such representations (cf. Larson and Segal 1995). This has taken place within a milieu of evolving conceptions of SYNTAX and SEMANTICS and has led to considerable refinement in our conceptions of the structure of logical forms and the range of phenomena that can be analyzed. Constant in these discussions has been the assumption that logical form is integrated into syntactic description generally, and hence that the thesis that natural languages are logical is ultimately an empirical issue within the general theory of syntactic rules and principles.

See also COMPOSITIONALITY; LOGICAL FORM, ORIGINS OF; MORPHOLOGY; QUANTIFIERS; SYNTAX-SEMANTICS INTERFACE

—Robert C. May

References

Aoun, J., N. Hornstein, and D. Sportiche. (1981). Some aspects of wide scope quantification. Journal of Linguistic Research 1: 69–95.
Beghelli, F., and T. Stowell. (1997). Distributivity and negation: The syntax of each and every. In A. Szabolcsi, Ed., Ways of Taking Scope. Dordrecht: Kluwer.
Chomsky, N. (1976). Conditions on rules of grammar. Linguistic Analysis 2: 303–351.
Cooper, R. (1983). Quantification and Syntactic Theory. Dordrecht: Reidel.
Dowty, D., R. E. Wall, and S. Peters. (1981). Introduction to Montague Semantics. Dordrecht: Reidel.
Fox, D. (1995). Economy and scope. Natural Language Semantics 3: 283–341.
Heim, I. (1982). The Semantics of Definite and Indefinite Noun Phrases. Ph.D. diss., University of Massachusetts, Amherst.
Higginbotham, J. (1980). Pronouns and bound variables. Linguistic Inquiry 11: 679–708.
Higginbotham, J., and R. May. (1981). Questions, quantifiers, and crossing. The Linguistic Review 1: 41–79.
Hornstein, N., and A. Weinberg. (1990). The necessity of LF. The Linguistic Review 7: 129–168.
Huang, C.-T. J. (1982). Logical Relations in Chinese and the Theory of Grammar. Ph.D. diss., Massachusetts Institute of Technology.
Larson, R., and G. Segal. (1995). Knowledge of Meaning. Cambridge, MA: MIT Press.
May, R. (1977). The Grammar of Quantification. Ph.D. diss., Massachusetts Institute of Technology. (Facsimile edition published by Garland Publishing, New York, 1991.)
May, R. (1985). Logical Form: Its Structure and Derivation. Cambridge, MA: MIT Press.
May, R. (1989). Interpreting logical form. Linguistics and Philosophy 12: 387–435.
Montague, R. (1974). The proper treatment of quantification in ordinary English. In R. Thomason, Ed., Formal Philosophy: Selected Papers of Richard Montague. New Haven, CT: Yale University Press.
Partee, B. (1975). Montague grammar and transformational grammar. Linguistic Inquiry 6: 203–300.
Reinhart, T. (1997). Quantifier scope: How labor is divided between QR and choice functions. Linguistics and Philosophy 20: 399–467.

Further Readings

Chomsky, N. (1995). The Minimalist Program. Cambridge, MA: MIT Press.
Fiengo, R., and R. May. (1994). Indices and Identity. Cambridge, MA: MIT Press.
Frege, G. (1892). On Sense and Reference, trans. by M. Black. In P. Geach and M. Black, Eds., Translations from the Philosophical Writings of Gottlob Frege. Oxford: Blackwell.
Hornstein, N. (1984). Logic as Grammar. Cambridge, MA: MIT Press.
Hornstein, N. (1995). Logical Form. Oxford: Blackwell.
Jaeggli, O. (1980). On Some Phonologically Null Elements of Syntax. Ph.D. diss., Massachusetts Institute of Technology.
Koopman, H., and D. Sportiche. (1982). Variables and the bijection principle. The Linguistic Review 2: 139–161.
Lakoff, G. (1972). On Generative Semantics. In D. Steinberg and L. Jakobovits, Eds., Semantics. Cambridge: Cambridge University Press, pp. 232–296.
May, R. (1991a). Syntax, semantics, and logical form. In A. Kasher, Ed., The Chomskyan Turn. Oxford: Blackwell.
May, R. (1991b). Linguistic theory and the naturalist approach to semantics. In D. J. Napoli and J. Kegl, Eds., Bridges Between Psychology and Linguistics: A Swarthmore Festschrift for Lila Gleitman. Hillsdale, NJ: Erlbaum.
Pesetsky, D. (1987). Wh-in-situ: movement and selective binding. In E. Reuland and A. ter Meulen, Eds., The Representation of (In)definiteness. Cambridge, MA: MIT Press.
Quine, W. V. O. (1960). Word and Object. Cambridge, MA: MIT Press.
Russell, B. (1905). On denoting. Mind 14: 479–493.
Wittgenstein, L. (1922). Tractatus Logico-Philosophicus, trans. by D. F. Pears and B. F. McGuinness. London: Routledge and Kegan Paul.

Logical Form, Origins of

When philosophers use the expression “the Logical Form of a sentence,” they refer to a linguistic representation whose desirable properties the surface grammatical form of the sentence either masks or lacks completely. Because philosophers have found ever so many desirable properties hidden or absent in grammatical form, there are ever so many different notions of Logical Form.

According to one tradition, which one might call the “descriptive conception” of Logical Form, the Logical Form of a sentence is something like the “deep structure” of that sentence (e.g., Harman 1972), and we may discover that the “real” structure of a natural language sentence is in fact quite distinct from its surface grammatical form. Talk of Logical Form in this sense involves attributing hidden complexity to natural language, complexity that may be revealed by philosophical, and indeed empirical, inquiry (see LOGICAL FORM IN LINGUISTICS). However, perhaps more common in the recent history of philosophy is what one might call the “revisionary conception” of Logical Form. According to it, natural language is defective in some fundamental way. Appeals to Logical Form are appeals to a kind of linguistic representation which is intended to replace natural language for the purposes of scientific or mathematical investigation (e.g., Frege 1879, preface; Whitehead and Russell 1910, introduction; Russell 1918, 58).

Nineteenth-century debates concerning the foundations of the calculus are one source of the revisionary flavor of some contemporary conceptions of Logical Form (e.g., Quine 1960, 248ff.). Perhaps the most vivid example is the overthrow of the infinitesimal calculus, which began with the work of Cauchy in the 1820s. Cauchy took the notation of the infinitesimal calculus, which contained explicit reference to infinitesimals, and reanalyzed it in terms of a notation that exploited the limit concept, and made no reference to infinitesimals. It subsequently emerged that the limit concept was analyzable in terms of logical notions, such as quantification, together with unproblematic numerical concepts. Thus progress in nineteenth-century mathematics involved replacing a notation that made explicit reference to undesirable entities (viz., infinitesimals) with a notation in which reference to such entities was replaced by logical and numerical operations on more acceptable ones (noninfinitesimal real numbers).

Bertrand Russell (e.g., 1903) was clearly affected by the developments in nineteenth-century mathematics, though his proposals (1905) have both revisionary and descriptive aspects. In them Russell treated the problem of negative existential sentences, such as “Pegasus does not exist.” The difficulty with this sentence is that its grammatical form suggests that endorsing its truth commits one to the existence of a denotation for the proper name “Pegasus,” which could be none other than that winged horse. Yet what the sentence seems to assert is precisely the nonexistence of such a being.

According to Russell, the problem lies in taking the grammatical form to be a true guide to the structure of the proposition it expresses, that is, in taking the surface grammatical form to exhibit the commitments of its endorsement. In Russell’s view, the structure of the proposition expressed by “Pegasus does not exist” differs markedly from the grammatical form. Rather than being “about” Pegasus, its structure is more adequately reflected by a sentence such as “It is not the case that there exists a unique thing having properties F, G, and H,” where F, G, and H are the properties we typically associate with the fictional winged horse. Once this is accepted, we may endorse the truth of “Pegasus does not exist” without fear of admitting Pegasus into the realm of being.

Russell took himself not as proposing a special level of linguistic representation, but rather as proposing what the true structure of the (nonlinguistic) proposition expressed by the sentence was. Russell’s early writings are therefore no doubt the origin of the occasional usage of “Logical Form” as referring to the structure of a nonlinguistic entity such as a proposition or a fact (e.g., Wittgenstein 1922, 1929; Sainsbury 1991, 35). However, Russell’s proposal could just as easily be adopted as a claim about a special sort of linguistic representation, one that allows us to say what we think is true, while freeing us from the absurd consequences endorsement of the original grammatical form would entail (Russell, after his rejection of propositions, himself construed it in this way). Thus arises a conception of Logical Form as a level of linguistic representation at which the metaphysical commitments of the sentences are completely explicit. This conception of Logical Form was later to achieve full articulation in the works of Quine (e.g., 1960, chaps. 5, 7).

Nineteenth-century mathematics not only provided successful examples of notational revision; it also provided many examples of notational confusion (Wilson 1994 argues that this was not necessarily a tragic situation). One of Frege’s central purposes (1879, Preface) was to provide a notation free of such confusion, devoid of vagueness, ambiguity, and context dependence, whose purpose was “to provide us with the most reliable test of the validity of a chain of inferences and to point out every presupposition that tries to sneak in unnoticed . . . .” Frege’s remarks here suggest a test of the validity of arguments in terms of the syntax of his idealized language. According to this criterion, a chain of reasoning is logically valid if and only if its translation into his idealized “Begriffsschrift” proceeds by transitions, each of which is of the right syntactic form. Thus arises a conception of Logical Form as a linguistic representation for which it is possible to designate certain syntactic forms such that all and only basic logical transitions are translatable into instances of those forms (see LOGIC).

The purpose of formalization into Logical Form is to replace one notation by another, which has desirable properties which the original notation lacked. In the case of the Quinean conception of Logical Form, the purpose of formalization is to replace a notation (in the usual case, natural language) by one which more accurately reveals ontological commitments. In the case of the Fregean conception, the purpose of formalization is to replace notations which obscure the logical structure of sentences by one which makes this structure explicit (these are not necessarily conflicting goals). Many other conceptions of Logical Form have been proposed. For example, Richard Montague’s favored way of giving an interpretation to a fragment of natural language involved translating that fragment into an artificial language, for example, the language of “tensed intensional logic” (Montague 1973), and then giving a formal semantic interpretation to the artificial language. This produces a conception of Logical Form as a level of linguistic representation at which a compositional semantics is best given (see COMPOSITIONALITY and SEMANTICS).

Strictly revisionary conceptions of Logical Form involve abstracting from features of the original notation that are problematic in various ways. Because notations may be problematic in some ways and not in others, in such a use of “Logical Form,” there is no issue about what the “correct” notion of Logical Form is. Descriptive conceptions, on the other hand, involve claims about the “true” structure of the original notation, structure that is masked by the surface grammatical form. Someone who makes a proposal, in the purely descriptive mode, about the “true” Logical Form of a natural language sentence thus runs the risk that her claims will be vitiated by linguistic theory. To avoid this danger, most philosophers vacillate between a revisionary and descriptive use of the expression “Logical Form.”

The tension between descriptive and revisionary conceptions of Logical Form mirrors a tension in the cognitive sciences generally. According to some, cognitive science should be interested in explaining the possession of abstract cognitive abilities such as thinking and language use in humans, and we should be interested, not in the most ideal representations of thought and language, but rather in how humans think and speak. If so, we should be interested in Logical Form only insofar as it is plausibly associated with natural language on some level of empirical analysis. According to others, cognitive science should be interested in the abstract properties of thinking and speaking, and we should be interested in arriving at an ideal representational system, one that may abstract from the defects of human natural language. Among such thinkers, the revisionary project of replacing natural language by a representation more suitable for scientific purposes is fundamental to the aims of cognitive science.

See also EXTENSIONALITY, THESIS OF; FREGE; LANGUAGE AND THOUGHT; REFERENCE, THEORIES OF; SYNTAX-SEMANTICS INTERFACE

—Jason Stanley

References

Frege, G. (1879). Begriffsschrift: A formula language, modeled upon that of arithmetic, for pure thought. In Van Heijenoort, Ed., From Frege to Gödel: A Sourcebook in Mathematical Logic, 1879–1931. Cambridge: Harvard University Press, 1967, pp. 5–82.
Harman, G. (1972). Deep structure as logical form. In D. Davidson and G. Harman, Eds., Semantics of Natural Language. Dordrecht: Reidel.
Montague, R. (1973). The proper treatment of quantification in ordinary English. In J. Hintikka, H. Moravcsik, and P. Suppes, Eds., Approaches to Natural Language: Proceedings of the 1970 Stanford Workshop on Grammar and Semantics. Dordrecht: Reidel, pp. 221–242.
Quine, W. V. O. (1960). Word and Object. Cambridge, MA: MIT Press.
Russell, B. (1903). Principles of Mathematics. Cambridge: Cambridge University Press.
Russell, B. (1905). On denoting. Mind 14: 479–493.
Russell, B. (1918/1985). The Philosophy of Logical Atomism. LaSalle: Open Court.
Sainsbury, M. (1991). Logical Forms: An Introduction to Philosophical Logic. London: Routledge and Kegan Paul.
Whitehead, A., and B. Russell. (1910). Principia Mathematica. Cambridge: Cambridge University Press.
Wilson, M. (1994). Can we trust Logical Form? Journal of Philosophy October 91: 519–544.
Wittgenstein, L. (1922). Tractatus Logico-Philosophicus. Translated by C. K. Ogden. London: Routledge and Kegan Paul.
Wittgenstein, L. (1929). Some remarks on Logical Form. Vol. 9, Proceedings of the Aristotelian Society.

Logical Omniscience, Problem of

Knowers or believers are logically omniscient if they know or believe all of the consequences of their knowledge or beliefs. That is, x is a logically omniscient believer (knower) if and only if the set of all of the propositions believed (known) by x is closed under logical consequence. It is obvious that if belief and knowledge are understood in their ordinary sense, then no nonsupernatural agent, real or artificial, will be logically omniscient. Despite this obvious fact, many formal representations of states of knowledge and belief, and some explanations of what it is to know or believe, have the consequence that agents are logically omniscient. This is why there is a problem of logical omniscience.

There are a number of different formal representations of knowledge and belief that face a problem of logical omniscience. POSSIBLE WORLDS SEMANTICS for knowledge and belief, first developed by Jaakko Hintikka, represents a state of knowledge by a set of possible worlds: the worlds that are epistemically possible for the knower. According to this analysis, x knows that P if and only if P is true in all epistemically possible worlds. Epistemic models using this kind of analysis have been widely applied by theoretical computer scientists studying distributed systems (see MULTIAGENT SYSTEMS), and by economists studying GAME THEORY. As Hintikka noted from the beginning of the development of semantic models for epistemic logic, this analysis implies that knowers are logically omniscient.

Because all logical truths must receive probability one in any probability function, and because any logical consequences of a proposition P must receive at least as great a probability as P (at least if one holds fixed the context in which probability assessments are made; see RATIONAL DECISION MAKING), any use of probability theory to represent the beliefs and partial beliefs of an agent will face a version of the problem of logical omniscience.

It is not only abstract formal representations, but also some philosophical explanations of the nature of belief and knowledge that seem to imply that knowers and believers are logically omniscient. First, pragmatic or INTENTIONAL STANCE accounts of belief assume that belief and desire are correlative dispositions displayed in rational action. Roughly, according to such accounts, to believe that P is to act in ways that would be apt in situations in which P (together with one’s other beliefs) is true. This kind of analysis of belief will imply that believers are logically omniscient. Second, because the logical consequences of any information carried by some state of a system are also information implicit in the state (see also INFORMATIONAL SEMANTICS and PROPOSITIONAL ATTITUDES), any account of knowledge based on INFORMATION THEORY will face a prima facie problem of logical omniscience.

There are two contrasting ways to reconcile a theory implying that agents are logically omniscient with the obvious fact that they are not. First, one may take the theory to represent a special sense of knowledge or belief that diverges from the ordinary one. For example, one may take the theory to be modeling implicit knowledge, understood to include, by definition, all of the consequences of one’s knowledge; ordinary knowers are logically omniscient with respect to their implicit knowledge, but have no extraordinary computational powers. Alternatively, one may take the theory to be modeling the knowledge (in the ordinary sense) of an idealized agent, a fictional ideal agent with infinite computational capacities that enable her to make all her implicit knowledge explicit.

Either of these approaches may succeed in reconciling the counterintuitive consequences of theories of belief and knowledge with the phenomena, but there remains a problem, for the first approach, of explaining what explicit knowledge is, and how it is distinguished from merely implicit knowledge. And the second approach must explain the relevance of an idealized agent, all of whose implicit beliefs are available, to the behavior of real agents. If the knowledge and beliefs of nonideal agents (agents who have only BOUNDED RATIONALITY) are to contribute to an explanation of their behavior, we need to be able to distinguish the beliefs that are accessible or available to the agent, and to do this, we need an account of what it means for a belief to be available. Because a belief might be available to influence the rational behavior of an agent even if the agent is unable to produce or recognize a linguistic expression of the belief, it will not suffice to distinguish articulate beliefs, that is, those beliefs that an agent can express or to which he is disposed to assent. And because one’s beliefs about one’s beliefs may themselves be merely implicit, and unavailable, one cannot explain the difference between available and merely implicit belief in terms of higher-order belief.

It may appear to be an advantage of a LANGUAGE OF THOUGHT account of belief that it avoids the problem of logical omniscience. If it is assumed that an agent’s explicit beliefs are beliefs encoded in a mental language and stored in the “belief box,” then one’s theory will not imply that the consequences of the agent’s explicit beliefs are also explicit beliefs. But explicit belief in this sense is neither necessary nor sufficient for a plausible notion of accessible or available belief. Although the immediate and obvious consequences of one’s explicit beliefs may count intuitively as beliefs in the ordinary sense, thus also as available beliefs even if they are not explicitly represented, beliefs that are explicitly represented may nevertheless remain inaccessible. If the set of explicit beliefs is large, the search required to access an explicit belief could be a nontrivial computational task.

A general problem for the analysis of available belief is that one can distinguish between beliefs that are available or accessible and those that are not only in relation to the particular uses of the belief. Consider the talented but inarticulate chess player whose implicit knowledge of the strategic situation is available to guide her play, but not to explain or justify her choices.

While it is obvious that real agents are never logically omniscient, it is not at all clear how to give a plausible account of knowledge and belief that does not have the consequence that they are.

See also COMPUTATIONAL COMPLEXITY; IMPLICIT VS. EXPLICIT MEMORY; RATIONAL AGENCY

—Robert Stalnaker

Further Readings

Dretske, F. (1981). Knowledge and the Flow of Information. Cambridge, MA: MIT Press.
Fagin, R., J. Y. Halpern, Y. Moses, and M. Y. Vardi. (1995). Reasoning about Knowledge. Cambridge, MA: MIT Press.
Hintikka, J. J. K. (1962). Knowledge and Belief. Ithaca, NY: Cornell University Press.
Levesque, H. J. (1984). A logic of implicit and explicit belief. In Proceedings of the Conference on Artificial Intelligence. Menlo Park, CA: AAAI Press, pp. 188–202.
Lipman, B. L. (1994). An axiomatic approach to the logical omniscience problem. In R. Fagin, Ed., Theoretical Aspects of Reasoning about Knowledge: Proceedings of the Fifth Conference. San Francisco: Morgan Kaufmann, pp. 182–196.
Parikh, R. (1987). Knowledge and the problem of logical omniscience. In Z. W. Ras and M. Zemankova, Eds., Methodologies of Intelligent Systems. The Hague: Elsevier, pp. 432–439.
Stalnaker, R. (1991). The problem of logical omniscience, I. Synthese 89: 425–440.

Logical Reasoning Systems

Logical reasoning systems derive sound conclusions from formal declarative knowledge. Such systems are usually defined by abstract rules of inference. For example, the rule of modus ponens states that given P, and “P implies Q” (usually written P → Q), we can infer Q. Logical systems have a rich history, starting with Greek syllogisms and continuing through the work of many prominent mathematicians such as DESCARTES, Leibniz, Boole, FREGE, Hilbert, Gödel, and Cohen. (For a good discussion of the history of logical reasoning systems, see Davis 1983.)

Logical reasoning provides a well-understood general method of symbolic COMPUTATION. Symbolic computation manipulates expressions involving variables. For example, a computer algebra system can simplify the expression x(x + 1) to x^2 + x. The equation x(x + 1) = x^2 + x is true under any interpretation of x as a real number. Unlike numerical computation, symbolic computation derives truths that hold under a wide variety of interpretations and can be used when only partial information is given, for example, that x is some (unknown) real number. Logical inference systems can be used to perform symbolic computation with variables that range over conceptual domains such as sets, sequences, graphs, and computer data structures. Symbolic computation underlies essentially all modern efforts to formally verify computer hardware and software.

There are at least two ways in which symbolic inference is relevant to cognitive science. First, symbolic inference rules have traditionally been used as models of human mathematical reasoning. Second, symbolic computation also provides a model of certain commonsense inferences. For example, suppose one is given a bag of marbles and continues removing marbles from the bag for as long as it remains nonempty. People easily reach the conclusion that, barring unusual or magical circumstances, the bag will eventually become empty. They reach this conclusion without being told any particular number of marbles; they reach a conclusion about an arbitrary set s. Computer systems for drawing such conclusions based on “visualization” do not currently work as well as approaches based on symbolic computation (McAllester 1991).

Here I will divide symbolic computation research into five general approaches. First are the so-called symbolic algebra systems such as Maple or Mathematica (Wolfram 1996), designed to manipulate expressions satisfying certain algebraic properties such as those satisfied by the real numbers. Although they have important applications in the physical sciences, these systems are not widely used for hardware or software verification and seem too specialized to provide plausible models of commonsense reasoning. Second are the symbolic model-checking systems (Burch et al. 1990), which perform symbolic inference where the variables range over finite sets such as the finite set of possible states of a certain piece of computer hardware. Although very effective for the finitary problems where they apply, these systems are too restricted for software verification and also seem too restricted to provide plausible models of commonsense reasoning. The remaining three approaches to symbolic computation all claim to be general purpose or domain independent. I will call these the “first-order approach,” the “higher-order approach,” and the “induction approach,” respectively, and discuss each in turn.

The first-order approach is based on making inferences from axioms written in first-order LOGIC, and it includes a wide variety of resolution and term-rewriting systems (Fitting 1996). For problems that can be naturally axiomatized in first-order logic, the first-order approach seems superior to other approaches. Unfortunately, most mathematical theorems and verification problems have no natural first-order formulation. Although detailed discussion of first-order logic and inference methods is beyond the scope of this entry, it is possible to give a somewhat superficial description of its limitations. First-order logic allows us to state properties of concepts, but not generally to define concepts. For example, suppose we want to describe the concept of a finite set. We can write a formal expression stating, “A set is finite if it is either empty or can be derived by adding a single element to some other finite set.” Unfortunately, this statement does not uniquely determine the concept: it is also true that “a set is countable if it is either empty or can be derived by adding a single element to some other countable set.” Statements true of a concept often fail to uniquely specify it. Finiteness is not definable in first-order logic. The bag of marbles inference mentioned above implicitly relies on finiteness. Almost all program verification problems involve concepts not definable in first-order logic. Methods of simulating more expressive logics in first-order logic are generally inferior in practice to systems specifically designed to go beyond first-order logic.

Higher-order systems allow quantification over predicates (i.e., concepts; Gordon 1987). Finiteness is definable in higher-order logics: by quantifying over predicates, we can say that “finite” is the least predicate satisfying the condition given in the paragraph above. Unfortunately, because higher-order logic makes automation difficult, computer systems based on higher-order logic typically verify human-written derivations rather than attempt to find derivations automatically.

Induction systems represent a middle ground between the expressively weak first-order resolution and term-rewriting systems and the expressively strong systems based on higher-order logic (Boyer and Moore 1979). Induction systems are “first-order” in the sense that they do not typically allow quantification over predicates. But, unlike true first-order systems, all objects are assumed to be finite. A variable in a symbolic expression ranges over infinitely many different possible values, each of which is a finite object. This is different from symbolic model checking, where each variable ranges over only a finite number of possible values. Also, induction systems allow well-founded recursive definitions of the kind used in functional programming languages. The underlying logic of an induction system is best viewed as a programming language, such as cons-car-cdr Lisp. But unlike traditional implementations of a programming language, an induction system can derive symbolic equations that are true under any allowed interpretation of the variables appearing in the expressions, such as the following:

append(x,append(y,z)) = append(append(x,y),z).

Induction systems seem most appropriate for program verification and for modeling commonsense reasoning about an arbitrary (or unknown) finite set.

See also CAUSAL REASONING; DEDUCTIVE REASONING; FORMAL SYSTEMS, PROPERTIES OF; MENTAL MODELS; PROBABILISTIC REASONING; RULES AND REPRESENTATIONS

—David McAllester

References

Boyer, R. S., and J. S. Moore. (1979). A Computational Logic. ACM Monograph Series. Academic Press.
Burch, J. R., E. M. Clarke, K. L. McMillan, D. L. Dill, and J. Hwang. (1990). Symbolic model checking: 10^20 states and beyond. In Proceedings of the Fifth Annual IEEE Symposium on Logic in Computer Science. IEEE Computer Society Press.
Davis, M. (1983). The prehistory and early history of automated deduction. In J. Siekmann and G. Wrightson, Eds., Automation of Reasoning, vol. 1. Springer.
Fitting, M. (1996). First-Order Logic and Automated Theorem Proving. 2nd ed. Springer.
Gordon, M. (1987). HOL: A proof generating system for higher-order logic. In G. Birtwistle and P. A. Subrahmanyam, Eds., VLSI Specification, Verification, and Synthesis. Kluwer.
McAllester, D. (1991). Some observations on cognitive judgments. In AAAI-91. Kaufmann, pp. 910–915.
Wolfram, S. (1996). Mathematica. 3rd ed. Wolfram Media and Cambridge University Press.

Long-Term Potentiation

Long-term potentiation (LTP) is operationally defined as a long-lasting increase in synaptic efficacy in response to high-frequency stimulation of afferent fibers. The increase in synaptic efficacy persists from minutes to days and is thus a robust example of a long-term increase in synaptic strength. LTP was first observed in the rabbit hippocampus (Bliss and Lomo 1973), but has since been observed in numerous brain structures, including the cortex, brain stem, and amygdala. LTP is not limited to the mammalian brain, but occurs in other vertebrates such as fish, frogs, birds, and reptiles, as well as in some invertebrates (Murphy and Glanzman 1997).

LTP occurs at all three major synaptic connections in the HIPPOCAMPUS: the perforant path synapse to dentate gyrus granule cells, mossy fibers to CA3 pyramidal cells, and the Schaffer collaterals of CA3 cells to CA1 pyramidal cells. Based on its prevalence and initial discovery there, LTP is most often studied in the hippocampus. Within the hippocampus, the cellular and molecular mechanisms that underlie the induction and expression of LTP are varied. In the dentate gyrus and area CA1, the induction of LTP occurs through activation of the postsynaptic N-methyl-D-aspartate (NMDA) type of glutamate receptor and consequent calcium influx (Collingridge, Kehl, and McLennan 1983), while its expression is accompanied by an increase in postsynaptic current mediated by the AMPA (α-amino-3-hydroxy-5-methyl-4-isoxazole propionic acid) type of glutamate receptor. In contrast, the induction of mossy fiber LTP in area CA3 is not dependent on NMDA receptor activation, but is dependent on an increase in presynaptic glutamate release (Castillo et al. 1997).

A primary focus of those involved in LTP research is to elucidate the cellular and molecular mechanisms necessary for sustaining the increase in synaptic efficacy over long periods of time. Most of these studies are conducted in neural tissue that is excised and maintained in an in vitro slice environment for physiological recording. Using this technique, it has been determined that the late phases of LTP maintenance are dependent on protein synthesis and there is some evidence that gene transcription is required. Although controversial, it has been proposed that LTP in the dentate gyrus and CA1 is expressed as an increase in affinity or number of postsynaptic AMPA receptors (Lynch and Baudry 1991). Others postulate that the expression is mediated by a persistent increase in the release of presynaptic glutamate, which is induced by a release of retrograde messengers from the postsynaptic neuron during LTP induction. In area CA1 and the dentate gyrus, the long-term expression of LTP is likely to be mediated by a combination of postsynaptic and presynaptic events, and a consequence of activation of various enzyme systems (Roberson, English, and Sweatt 1996; Abel et al. 1997). There is some evidence that LTP induces structural changes at the synapse (Buchs and Muller 1996), as well as induction of new sites of synaptic transmission (Bolshakov et al. 1997). Based on this cursory review, it should be clear that LTP is a complex phenomenon that involves the interaction of multiple cellular and molecular systems; the exact contribution of each has yet to be determined.

In addition to being a robust example of persistent changes in synaptic plasticity, LTP has been promoted as a putative neural mechanism of associative MEMORY formation or storage in the mammalian brain. It is generally believed that memory formation occurs through a strengthening of connections between neurons. In 1949, Donald HEBB wrote that, “When an axon of cell A ... excite(s) cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells so that A’s efficiency as one of the cells firing B is increased” (p. 62). This supposition, known as Hebb’s rule, is similar to the operational definition of LTP and is often cited as theoretical support for the putative role of LTP in learning and memory. In addition to theoretical support, the biological characteristics of LTP are in some respects similar to those of memory. First, LTP is prominent in the hippocampus, a structure considered necessary for aspects of declarative and spatial memory (Squire 1992). Second, LTP is long lasting, as is memory. Most forms of electrophysiological plasticity last milliseconds to seconds, while LTP persists from minutes to hours, even days (Staubli and Lynch 1987).
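Hebb's rule, as quoted above, is stated qualitatively. One common textbook formalization, which is an assumption here and not something this entry specifies, makes the change in a synaptic weight proportional to the product of presynaptic and postsynaptic activity. The sketch below illustrates that idea for a single linear unit. The function name `hebbian_step`, the learning rate `eta`, and the linear response are hypothetical choices made only for illustration, and the toy model says nothing about the cellular mechanisms of LTP.

```python
def hebbian_step(weights, pre, eta=0.1):
    """Apply one Hebbian update to the weights of a single linear unit.

    weights: synaptic strengths, one per input (hypothetical toy values)
    pre:     presynaptic activity for each input
    eta:     learning rate (an assumed constant, not taken from the entry)
    """
    # Postsynaptic response: weighted sum of the presynaptic activity.
    post = sum(w * x for w, x in zip(weights, pre))
    # Each weight grows in proportion to pre * post, so only inputs that
    # were active while the unit responded are strengthened.
    return [w + eta * post * x for w, x in zip(weights, pre)]


weights = [0.5, 0.5, 0.5]
pattern = [1.0, 1.0, 0.0]  # inputs 0 and 1 fire together; input 2 is silent
updated = hebbian_step(weights, pattern)
```

With these toy values the two co-active inputs are strengthened and the silent one is left unchanged, a loose analogue of the idea that efficiency increases for the cells that take part in firing the postsynaptic cell.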
In addition, LTP possesses physiological correlates of associativity and cooperativity, both properties of learning associated with classical CONDITIONING (Brown, Kairiss, and Keenan 1990). Finally, hippocampal LTP is optimally induced with a pattern of stimulation that mimics “theta,” a naturally occurring brain rhythm (Larson and Lynch 1989). Theta rhythms are most often associated with motor activity and dreaming (Vanderwolf and Cain 1994), although they have been reported to occur in the hippocampus during learning (Otto et al. 1991) and stressful experience (Vanderwolf and Cain 1994; Shors and Matzel 1997).

Many behavioral studies addressing the role of LTP in memory take advantage of the fact that most types of LTP are dependent on NMDA receptor activation (Collingridge, Kehl, and McLennan 1983). When these receptors are blocked with competitive antagonists, rats are impaired in their ability to perform the Morris water maze, a spatial memory task that requires the hippocampus for acquisition. Recent evidence suggests that even during NMDA receptor blockade, rats can learn the location of new spatial cues if they were previously trained on a similar spatial task (Saucier and Cain 1995; Bannerman et al. 1995). Thus, NMDA receptor activation may not be necessary for learning about spatial cues per se, but may be involved in other aspects of performance necessary for successful completion of the task (Morris and Frey 1997; Shors and Matzel 1997). In addition to maze performance, NMDA receptor antagonists prevent fear conditioning (Kim et al. 1991), fear-potentiated startle (Campeau, Miserendino, and Davis 1992) and classic eyeblink conditioning (Servatius and Shors 1996). These tasks are not dependent on the hippocampus but rather are dependent on the AMYGDALA and CEREBELLUM, respectively. Thus, if LTP does play a role in memory, it may not be limited to hippocampal-dependent memories.

The relationship between LTP and learning has also been addressed using genetic techniques. Using a transgenic mouse in which a mutated and calcium-independent form of calmodulin (CaM) kinase II was expressed, researchers reported that LTP in response to theta-burst stimulation was reduced, as was the acquisition of spatial memories. In addition, the mutant mice possessed unstable and imprecise place cells in the hippocampus (Rotenberg et al. 1996). In another study, researchers expressed an inhibitory form of a protein kinase A regulatory subunit in mice and observed deficits in the late phase of LTP as well as deficits in hippocampal-dependent conditioning (Abel et al. 1997). Because the genes are altered throughout the life span, some deficits in plasticity and learning could be due to abnormal developmental responses. Recently, transient knockouts have become available, providing more temporally and anatomically discrete lesions. Removal of a specific subunit of the NMDA receptor in the hippocampus

tic efficacy, such as LTP, are necessary for memory storage, whether they modify the rate and efficiency of memory formation (Shors and Matzel 1997), or do neither.

See also MEMORY, ANIMAL STUDIES; MEMORY STORAGE, MODULATION OF; NEUROTRANSMITTERS

—Tracey J. Shors

References

Abel, T., P. V. Nguyen, M. Barad, T. Deuel, E. R. Kandel, and R. Bourtchouladze. (1997). Genetic demonstration of a role for PKA in the late phase of LTP and in hippocampus-based long-term memory. Cell 88: 615–626.
Bannerman, D. M., M. A. Good, S. P. Butcher, M. Ramsay, and R. G. M. Morris. (1995). Distinct components of spatial learning revealed by prior training and NMDA receptor blockade. Nature 378: 182–186.
Bliss, T. V. P., and T. Lomo. (1973). Long-lasting potentiation of synaptic transmission in the dentate gyrus of the anesthetized rabbit following stimulation of the perforant path. Journal of Physiology 232: 331–356.
Bolshakov, V. Y., H. Golan, E. R. Kandel, and S. A. Siegelbaum. (1997). Recruitment of new sites of synaptic transmission during the cAMP-dependent late phase of LTP at CA3-CA1 synapses in the hippocampus. Neuron 19: 635–651.
Brown, T. H., E. W. Kairiss, and C. L. Keenan. (1990). Hebbian synapses: Biophysical mechanisms and algorithms. Annual Review of Neuroscience 13: 475–511.
Buchs, P. A., and D. Muller. (1996). Induction of long-term potentiation is associated with major ultrastructural changes of activated synapses. Proceedings of the National Academy of Sciences 93: 8040–8045.
Campeau, S., M. J. Miserendino, and M. Davis. (1992). Intra-amygdaloid infusion of the N-methyl-d-aspartate receptor antagonist AP5 blocks acquisition but not expression of fear-potentiated startle to a conditioned stimulus. Behavioral Neuroscience 106: 569–574.
Castillo, P. E., R. Janz, T. C. Sudhof, T. Tzounopoulos, R. C. Malenka, and R. A. Nicoll. (1997). Rab3A is essential for mossy fiber long-term potentiation in the hippocampus. Nature 388: 590–593.
Collingridge, G. L., S. J. Kehl, and H. McLennan. (1983). Excitatory amino acids in synaptic transmission in the Schaffer-commissural pathway of the rat hippocampus. Journal of Physiology 334: 33–46.
Hebb, D. O. (1949). The Organization of Behavior. New York: Wiley.
Kim, J. J., J. P. DeCola, J. Landeira-Fernandez, and M. S. Fanselow. (1991). N-methyl-D-aspartate receptor antagonist APV blocks acquisition but not expression of fear conditioning. Behavioral Neuroscience 105: 160–167.
Larson, J., and G. Lynch. (1989). Theta pattern stimulation and the induction of LTP: The sequence in which synapses are stimulated determines the degree to which they potentiate. Brain Research 489: 49–58.
Lynch, G., and M. Baudry. (1991).
Reevaluating the constraints on after development disrupted LTP and spatial learning in the hypotheses regarding LTP expression. Hippocampus 1: 9–14. Morris water maze (Tsien, Huerta, and Tonegawa 1996). Morris, R. G., and U. Frey. (1997). Hippocampal synaptic plastic- A long-term increase in synaptic strength and efficacy is ity: Role in spatial learning or the automatic recording of considered by many to be the best candidate to date for attended experience? Philosophical Transactions of the Royal mediating the storage and retrieval of memories in the mam- Society of London. Series B: Biological Sciences 352: 1489– malian brain. This application of synaptic potentiation 1503. would constitute a memory system with massive storage Murphy, G. G., and D. L. Glanzman. (1997). Mediation of classi- capacity and fine resolution. Although appealing in princi- cal conditioning in Aplysia californica by long-term potentia- ple, it remains to be determined whether increases in synap- tion of sensorimotor synapses. Science 278: 467–471. 494 LOT neously. One hand is to be held steady while the other is Otto, T., H. Eichenbaum, S. I. Wiener, and C. G. Wible. (1991). Learning-related patterns of CA1 spike trains parallel stimula- used to press a key or squeeze a rubber bulb in response to tion parameters optimal for inducing hippocampal long-term verbal stimuli presented by the experimenter, to which the potentiation. Hippocampus 1: 181–192. subject is asked to respond verbally with the first word to Roberson, E. D., J. D. English, and J. D. Sweatt. (1996). A bio- come to mind. Preliminary trials are presented until a steady chemist’s view of long-term potentiation. Learning and Mem- baseline of coordination is established. At this point, “criti- ory 3: 1–24. cal” stimuli which the experimenter believes to be related to Rotenberg, A., M. Mayford, R. D. Hawkins, E. R. Kandel, and R. specific thoughts in the subject are presented. Evidence for U. Muller. (1996). 
Mice expressing activated CaMKII lack low- the ability to “read the subject’s mind” is the selective dis- frequency LTP and do not form stable place cells in the CA1 ruption of the previously established coordinated system by region of the hippocampus. Cell 87: 1351–1361. the critical test stimuli. This method was applied to a variety Saucier, D., and D. P. Cain. (1995). Spatial learning without NMDA receptor-dependent long-term potentiation. Nature 378: of naturally occurring and experimentally induced cases, 186–189. providing a model system for psychodiagnosis that won Servatius, R. J., and T. J. Shors. (1996). Early acquisition but not widespread attention in the West when it was published. The retention of the classically conditioned eyeblink response is N- book describing these studies has to date never been pub- methyl-d-aspartate (NMDA) receptor dependent. Behavioral lished in Russian, owing to its association with psychoana- Neuroscience 110: 1040–1048. lytic theorizing which was disapproved of by Soviet Shors, T. J., and L. D. Matzel. (1997). Long-term potentiation authorities. (LTP): What’s learning got to do with it? Behavioral and Brain In 1924 Luria met Lev Semionovich VYGOTSKY, whose Sciences 20: 597–655. influence was decisive in shaping his future career. Together Squire, L. (1992). Memory and the hippocampus: A synthesis with Vygotsky and Alexei Nikolaivitch Leontiev, Luria from findings in rats, monkeys, and humans. Psychological Review 99: 195–231. sought to establish an approach to psychology that would Staubli, U., and G. Lynch. (1987). Stable hippocampal long-term enable them to “discover the way natural processes such as potentiation elicited by “theta” pattern stimulation. Brain physical maturation and sensory mechanisms become inter- Research 435: 227–234. twined with culturally determined processes to produce the Tsien, J. Z., P. T. Huerta, and S. Tonegawa. (1996). The essential psychological functions of adults” (Luria 1979: 43). 
role of hippocampal CA1 NMDA receptor-dependent synaptic Vygotsky and his colleagues referred to this new approach plasticity in spatial memory. Cell 87: 264–297. variably as “cultural,” “historical,” and “instrumental” psy- Vanderwolf, C. H., and D. P. Cain. (1994). The behavioral neurobi- chology. These three labels all index the centrality of cul- ology of learning and memory: A conceptual reorientation. tural mediation in the constitution of specifically human Brain Research Reviews 19: 264–297. psychological processes, and the role of the social environ- ment in structuring the processes by which children appro- LOT priate the cultural tools of their society in the process of ontogeny. An especially heavy emphasis was placed on the role of language, the “tool of tools” in this process; the See LANGUAGE OF THOUGHT acquisition of language was seen as the pivotal moment when phylogeny and cultural history are merged to form LTP specifically human forms of thought, feeling, and action. From the late 1920s until his death, Luria sought to elab- orate this synthetic, cultural-historical psychology in differ- See LONG-TERM POTENTIATION ent content areas of psychology. In the early 1930s he led two expeditions to central Asia where he investigated Luria, Alexander Romanovich changes in perception, problem solving, and memory asso- ciated with historical changes in economic activity and Alexander Romanovich Luria (1902–1977) was born in schooling. During this same period he carried out studies of Kazan, an old Russian university city east of Moscow. He identical and fraternal twins raised in a large residential entered Kazan University at the age of 16 and obtained his school to reveal the dynamic relations between phylogenetic degree in 1921 at the age of 19. While still a student, he and cultural-historical factors in the development of LAN- established the Kazan Psychoanalytic Association and GUAGE AND THOUGHT. planned on a career in psychology. 
His earliest research In the late 1930s, largely to remove himself from public sought to establish objective methods of assessing Freudian view during the period of purges initiated by Stalin, Luria ideas about abnormalities of thought and the effects of entered medical school where he specialized in the study of fatigue on mental processes. aphasia, retaining his focus on the relation between lan- In 1923 Luria’s use of reaction time measures to study guage and thought in a politically neutral arena. The onset thought processes in the context of work settings won him a of World War II made his specialized knowledge of crucial position at the Institute of Psychology in Moscow where he importance to the Soviet war effort, and the widespread developed a psychodiagnostic procedure, the “combined availability of people with various forms of traumatic brain motor method,” for diagnosing individual subjects’ thought injury provided him with voluminous materials for develop- processes. In this method (described in detail in Luria ing his theories of brain function and methods for the reme- 1932), subjects are asked to carry out three tasks simulta- diation of focal brain lesions. It was during this period that Machiavellian Intelligence Hypothesis 495 he developed the systematic approach to brain and cognition ligence. The new “social” explanation for the evolution of which has come to be known as the discipline of neuropsy- INTELLIGENCE arose in the context of proliferating field chology. Central to his approach was the belief that “to studies of primate societies in the 1960s and 1970s. The understand the brain foundations for psychological activity, paper generally recognized as pivotal in launching this wave one must be prepared to study both the brain and the system of studies was Nicholas Humphrey’s “The Social Function of activity” (1979: 173). 
This insistence on linking brain of Intellect” (1976), the first to spell out the idea explicitly, structure and function to the proximal, culturally organized although important insights were offered by earlier writers, environment provides the thread of continuity between the notably Alison Jolly (see Whiten and Byrne 1988a for a early and later parts of Luria’s career. review). By 1988 the idea had inspired sufficient interesting Following the war Luria sought to continue his work in empirical work to produce the volume Machiavellian Intel- neuropsychology. His plans were interrupted for several ligence (Byrne and Whiten 1988; now see also Whiten and years when he was removed from the Institute of Neurosur- Byrne 1997), which christened the area. gery during a period of particularly virulent antisemitic The Machiavellian intelligence hypothesis has been rec- repression. During this time he pursued his scientific inter- ognized as significant beyond the confines of primatology, ests through a series of studies of the development of lan- however. On the one hand, it is relevant to all the various guage and thought in mentally retarded children. disciplines that study human cognitive processes. Because In the late 1950s Luria was permitted to return to the the basic architecture of these processes is derived from the legacy of our primate past, a more particular Machiavellian study of neuropsychology, which he pursued until his death subhypothesis is that developments in specifically human of heart failure in 1977. In the years just prior to his death, intelligence were also most importantly shaped by social he returned to his earliest dreams of constructing a unified complexities. On the other hand, looking beyond primates, psychology. He published two case studies, one of a man the hypothesis has been recognized as of relevance to any with an exceptional and idiosyncratic memory (Luria 1968), species of animal with sufficient social complexity. 
the other of a man who suffered a traumatic brain injury Why “Machiavellian” intelligence? Humphrey talked of (Luria 1972). These two case studies illustrate his blend of “the social function of intellect” and some authors refer to classic, experimental approaches with clinical and remedia- the “social intelligence hypothesis” (Kummer et al. 1997). tional approaches, a synthesis that stands as a model for late But “social” is not really adequate as a label for the hypoth- twentieth-century cognitive science. esis. Many species are social (some living in much larger groups than primates) without being particularly intelligent; See also COGNITIVE DEVELOPMENT; CULTURAL PSY- what is held to be special about primate societies is their CHOLOGY; PIAGET, JEAN; PSYCHOANALYSIS, HISTORY OF complexity, which includes the formation of sometimes —Michael Cole fluid and shifting alliances and coalitions. Within this con- text, primate social relationships have been characterized as Works By A. R. Luria manipulative and sometimes deceptive at sophisticated lev- els (Whiten and Byrne 1988b). Primates often act as if they Luria, A. R. (1932). The Nature of Human Conflicts. New York: were following the advice that Niccolo Machiavelli offered Liveright. to sixteenth-century Italian prince-politicians to enable them Luria, A. R. and F. A. Yudovich. (1959). Speech and the Develop- to socially manipulate their competitors and subjects ment of Mental Processes. London: Staples Press. Luria, A. R. (1960). The Role of Speech in the Regulation of Nor- (Machiavelli 1532; de Waal 1982). “Machiavellian intelli- mal and Abnormal Behavior. New York: Irvington. gence” therefore seemed an appropriate label, and it has Luria, A. R. (1966). Higher Cortical Functions in Man. New York: since passed into common usage. Basic Books. An important prediction of the hypothesis is that greater Luria, A. R. (1970). 
Traumatic Aphasia: Its Syndromes, Psychol- social intellect in some members of a community will exert ogy, and Treatment. The Hague: Mouton. selection pressures on others to show greater social exper- Luria, A. R. (1968). The Mind of Mnemonist. New York: Basic tise, so that over evolutionary time there will be an “arms Books. race” of Machiavellian intelligence. Indeed, one of the ques- Luria, A. R. (1972). The Man with a Shattered World. New York: tions the success of the hypothesis now begins to raise is Basic Books. why such escalation has not gone further than it has in many Luria, A. R. (1973). The Working Brain. New York: Basic Books. Luria, A. R. (1979). The Making of Mind. Cambridge, MA: Har- species. vard University Press. But the way in which the hypothesis highlights competi- tive interactions must not be interpreted too narrowly. “Machiavellianism” in human affairs is often taken to Machiavellian Intelligence Hypothesis include only a subset of social dealings characterized by their proximally selfish and exploitative nature (Wilson, The Machiavellian intelligence hypothesis takes several Near, and Miller 1996). Although animal behavior is forms, but all stem from the proposition that the advanced expected to be ultimately selfish in the face of natural selec- cognitive processes of primates are primarily adaptations to tion (by definition a competition), COOPERATION with some the special complexities of their social lives, rather than to individuals against others can be one means to that end. Pri- nonsocial environmental problems such as finding food, mate coalitions provide good examples. Indeed, because an which were traditionally thought to be the province of intel- important component of exploiting one’s social environment 496 Machiavellian Intelligence Hypothesis includes learning socially from others, primate “culture” Barton, R. A., and R. I. M. Dunbar. (1997). Evolution of the social brain. In A. Whiten and R. W. 
Byrne, Eds., Machiavellian also comes within the scope of Machiavellian intelligence. Intelligence. Vol. 2, Evaluations and Extensions. Cambridge: As noted at the outset, the Machiavellian hypothesis is not Cambridge University Press, pp. 240–263. so much a single hypothesis as a cluster of related hypothe- Byrne, R. W., and A. Whiten. (1988). Machiavellian Intelligence: ses about the power of social phenomena to shape cognitive Social Expertise and the Evolution of Intellect in Monkeys, processes. Two main variants may be distinguished here. In Apes and Humans. Oxford: Oxford University Press. one version of the hypothesis, intelligence is seen as a rela- Cheney, D. L., and R. M. Seyfarth. (1990). How Monkeys See the tively domain-general capacity, with degrees of intelligence World: Inside the Mind of Another Species. Chicago: University in principle distinguishable among different taxa of animals. of Chicago Press. In this case, the hypothesis proposes that different grades of Cosmides, L. (1989). The logic of social exchange: Has natural intelligence will be correlated significantly and most closely selection shaped how humans reason? Studies with the Wason selection task. Cognition 31: 187–276. with variations in the social complexity of the taxa con- de Waal, F. (1982). Chimpanzee Politics. London: Cape. cerned. This should apply to any taxon with the right kind of Dunbar, R. I. M. (1995). Neocortex size and group size in pri- social complexity. Although primate research was the arena mates: A test of the hypothesis. Journal of Human Evolution from which the ideas sprang, related kinds of complexity are 28: 287–296. now being described in other taxa, like the alliances of dol- Harcourt, A. H., and F. B. deWaal. (1992). Coalitions and Alli- phins and hyenas (Harcourt and de Waal 1992). ances in Humans and Other Animals. Oxford: Oxford Univer- Another version of the hypothesis proposes that the very sity Press. 
nature of the cognitive system will be shaped to handle social Humphrey, N. K. (1976). The social function of intellect. In P. P. phenomena: a domain-specific social intelligence (see G. Bateson and R. A. Hinde, Eds., Growing Points in Ethol- DOMAIN SPECIFICITY). This possibility has been examined in ogy. Cambridge: Cambridge University Press, 1976, pp. 303– 321. both human and nonhuman primates, with the bulk of work Kummer, H., L. Daston, G. Gigerenzer, and J. Silk. (1997). The done on humans. Influential research includes the work of social intelligence hypothesis. In P. Weingart, P. Richerson, S. D. Cosmides (1989) on the power of cheater detection mecha- Mitchell, and S. Maasen, Eds., Human by Nature: Between Biol- nisms to handle logical problems humans find difficult in ogy and the Social Sciences. Hillsdale, NJ: Erlbaum, pp. 157–179. equivalent nonsocial contexts (see EVOLUTIONARY PSYCHOL- Machiavelli, N. (1532). The Prince. Harmondsworth, England: OGY). Another well-documented case is our everyday THEORY Penguin, 1961. OF MIND, whose social domain specificity is highlighted by Whiten, A., and R. W. Byrne. (1988a). The Machiavellian intellect autistic individuals’ difficulty in reading other minds, despite hypotheses. In R. W. Byrne and A. Whiten, Eds., Machiavel- high levels of nonsocial intelligence (Baron-Cohen 1995; see lian Intelligence. Oxford: Oxford University Press, pp. 1–9. AUTISM). For nonhuman primates, Cheney and Seyfarth Whiten, A., and R. W. Byrne. (1988b). Tactical deception in pri- mates. Behavioral and Brain Sciences 11: 233–273. (1990) report the results of both observational and experimen- Whiten, A., and R. W. Byrne. (1997). Machiavellian Intelligence. tal studies in which vervet monkeys demonstrate social exper- Vol. 2, Evaluations and Extensions. Cambridge: Cambridge tise in excess of that operating in equivalent nonsocial University Press. contexts. For example, the monkeys may discriminate as tar- Wilson, D. S., D. Near, and R. R. 
Miller. (1996). Machiavellian- gets of aggression those individuals whose kin have fought ism: A synthesis of the evolutionary and psychological litera- their own kin, yet fail to read the signs of a recent python track tures. Psychological Bulletin 119: 285–299. entering a bush (see SOCIAL COGNITION IN ANIMALS). A different kind of test of the Machiavellian hypothesis is Further Readings based on examining the correlates of relative brain size Brothers, L. (1990). The social brain: A project for integrating pri- across primate and other animal taxa. Contrary to some ear- mate behavior and neurophysiology in a new domain. Concepts lier findings, the strongest predictors of encephalization that in Neuroscience 1: 27–51. have emerged consistently in recent studies are not measures Byrne, R. W. (1995). The Thinking Ape: Evolutionary Origins of of physical ecological complexity such as home range size, Intelligence. Oxford: Oxford University Press. but the size of the social group or clique, an indicator (even Byrne, R. W., and A. Whiten. (1990). Tactical deception in pri- if a crude one) of social complexity (Dunbar 1995; Barton mates: The 1990 database. Primate Report 27: 1–101. and Dunbar 1997). Although this approach conflates the two Crow, T. J. (1993). Sexual selection, Machiavellian intelligence, alternative versions of the hypothesis discriminated above and the origins of psychosis. Lancet 342: 594–598. (because we do not know how modular the mechanisms are Dunbar, R. I. M. (1992). Neocortex size as a constraint on group that contribute to greater encephalization) the results obvi- size in primates. Journal of Human Evolution 20: 469–493. Erdal, D., and A. Whiten. (1996). Egalitarianism and Machiavel- ously support the Machiavellian intelligence hypothesis in lian intelligence in human evolution. In P. Mellars and K. Gib- the general form stated at the start of this entry. son, Eds., Modelling the Early Human Mind. 
Cambridge, See also COGNITIVE ETHOLOGY; MODULARITY OF MIND; England: McDonnell Institute, pp. 139–150. PRIMATE COGNITION; SOCIAL COGNITION Sambrook, T., and A. Whiten. (1997). On the nature of complexity —Andrew Whiten in cognitive and behavioural science. Theory and Psychology 7: 191–213. Venables, J. (1993). What is News? Huntingdon, England: ELM. References Whiten, A. (1991). Natural Theories of Mind: Evolution, Develop- Baron-Cohen, S. (1995). Mindblindness: An Essay on Autism and ment and Simulation of Everyday Mindreading. Oxford: Black- Theory of Mind. Cambridge, MA: The MIT Press. well. Machine Learning 497 wise, it branches to the right child. The test at the child node Whiten, A. (1993). Deception in animal communication. In R. E. Asher, Ed., The Pergamon Encylopedia of Language and Lin- is then applied, recursively, until one of the leaf nodes of the guistics, vol. 2. Pergamon Press, pp. 829–832. tree is reached. The leaf node gives the predicted classifica- Whiten, A. (1996). Imitation, pretence and mindreading: Second- tion of the example. ary representation in comparative primatology and develop- C4.5 searches the space of decision trees through a con- mental psychology. In A. Russon, K. A. Bard, and S. T. Parker, structive search. It first considers all trees consisting of only Eds., Reaching into Thought: The Minds of the Great Apes. a single root node and chooses one of those. Then it consid- Cambridge University Press, pp. 300–324. ers all trees having that root node and various left children, Whiten, A. (Forthcoming). Primate culture and social learning. and chooses one of those, and so on. This process constructs Cognitive Science. the tree incrementally with the goal of finding the decision tree that minimizes the so-called pessimistic error, which is Machine Learning an estimate of classification error of the tree on new training examples. 
It is based on taking the upper endpoint of a con- fidence interval for the error of the tree (computed sepa- The goal of machine learning is to build computer systems rately for each leaf). that can adapt and learn from their experience. Different Although C4.5 constructs its classifier, other learning learning techniques have been developed for different perfor- algorithms begin with a complete classifier and modify it. mance tasks. The primary tasks that have been investigated For example, the backpropagation algorithm for learning are SUPERVISED LEARNING for discrete decision-making, neural networks begins with an initial neural network and supervised learning for continuous prediction, REINFORCE- computes the classification error of that network on the MENT LEARNING for sequential decision making, and UNSU- training data. It then makes small adjustments in the weights PERVISED LEARNING. of the network to reduce this error. This process is repeated The best-understood task is one-shot decision making: until the error is minimized. the computer is given a description of an object (event, situ- There are two fundamentally different theories of ation, etc.) and it must output a classification of that object. machine learning. The classical theory takes the view that, For example, an optical character recognizer must input a before analyzing the training examples, the learning algo- digitized image of a character and output the name of that rithm makes a “guess” about an appropriate space of classi- character (“A” through “Z”). A machine learning approach fiers to consider (e.g., it guesses that decision trees will be to constructing such a system would begin by collecting better than neural networks). The algorithm then searches training examples, each consisting of a digitized image of a the chosen space of classifiers hoping to find a good fit to character and the correct name of that character. These the data. 
The Bayesian theory takes the view that the would be analyzed by a learning ALGORITHM to produce an designer of a learning algorithm encodes all of his or her optical character recognizer for classifying new images. prior knowledge in the form of a prior probability distribu- Machine learning algorithms search a space of candidate tion over the space of candidate classifiers. The learning classifiers for one that performs well on the training exam- algorithm then analyzes the training examples and computes ples and is expected to generalize well to new cases. Learn- the posterior probability distribution over the space of clas- ing methods for classification problems include DECISION sifiers. In this view, the training data serve to reduce our TREES, NEURAL NETWORKS, rule-learning algorithms (Cohen remaining uncertainty about the unknown classifier. 1995), nearest-neighbor methods (Dasarathy 1991), and cer- These two theories lead to two different practical tain kinds of BAYESIAN NETWORKS. approaches. The first theory encourages the development of There are four questions to answer when developing a large, flexible hypothesis spaces (such as decision trees and machine learning system: neural networks) that can represent many different classifi- 1. How is the classifier represented? ers. The second theory encourages the development of rep- 2. How are the examples represented? resentational systems that can readily express prior 3. What objective function should be employed to evaluate knowledge (such as Bayesian networks and other stochastic candidate classifiers? models). 4. What search algorithm should be used? The discussion thus far has focused on discrete classifi- cation, but the same issues arise for the second learning Let us illustrate these four questions using two of the task: supervised learning for continuous prediction (also most popular learning algorithms, C4.5 and backpropaga- called “regression”). In this task, the computer is given a tion. 
description of an object and it must output a real-valued The C4.5 algorithm (Quinlan 1993) represents a classi- quantity. For example, given a description of a prospective fier as a decision tree. Each example is represented as a vec- student (high school grade-point average, SAT scores, etc.), tor of features. For example, one feature describing a the system must predict the student’s college grade-point printed character might be whether it has a long vertical line average (GPA). The machine learning approach is the same: segment (such as the letters B, D, E, etc.). a collection of training examples describing students and Each node in the decision tree tests the value of one of their college GPAs is provided to the learning algorithm, the features and branches to one of its children, depending which outputs a predictor to predict college GPA. Learning on the result of the test. A new example is classified by methods for continuous prediction include neural networks, starting at the root of the tree and applying the test at that regression trees (Breiman et al. 1984), linear and additive node. If the test is true, it branches to the left child; other- 498 Machine Translation models (Hastie and Tibshirani 1990), and kernel regression Once a stochastic model has been fitted to a collection of methods (Cleveland and Devlin 1988). Classification and objects, that model can be applied to classify new objects. prediction are often called “supervised learning” tasks, Given a new astronomical object, we can determine which because the training data include not only the input objects multivariate Gaussian is most likely to have generated it, but also the corresponding output values (provided by a and we can assign it to the corresponding group. Similarly, “supervisor”). 
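
As a minimal sketch of the continuous-prediction task described above, the following Python fits a straight line by ordinary least squares, one simple instance of the linear models mentioned in the text. The function names `fit_line` and `make_predictor` are hypothetical, and the grade-point figures are made up for illustration; they are not data from any cited study.

```python
# Illustrative sketch only: supervised learning for continuous prediction
# using ordinary least squares with a single input feature.

def fit_line(xs, ys):
    """Return (slope, intercept) minimizing squared error on the training pairs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of the inputs from their mean.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all inputs are identical; slope is undefined")
    # Sum of products of input and output deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def make_predictor(slope, intercept):
    """Wrap the fitted parameters as a function from input to predicted output."""
    return lambda x: slope * x + intercept


# Made-up training examples: each input (high school GPA) is paired with
# the output value supplied by the "supervisor" (college GPA).
high_school_gpa = [2.0, 2.5, 3.0, 3.5, 4.0]
college_gpa = [1.8, 2.4, 2.7, 3.3, 3.6]

slope, intercept = fit_line(high_school_gpa, college_gpa)
predict_college_gpa = make_predictor(slope, intercept)

# Apply the learned predictor to a new, previously unseen student.
print(round(predict_college_gpa(3.2), 2))
```

In terms of the four design questions, the predictor here is represented by two numbers (slope and intercept), each example by a single feature, the objective function is squared error on the training examples, and the search is replaced by a closed-form solution. A realistic system would use several features and check the predictor on held-out examples.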
We now turn from supervised learning to reinforcement learning tasks, most of which involve sequential decision making. In these tasks, each decision made by the computer affects subsequent decisions. For example, consider a computer-controlled robot attempting to navigate from a hospital kitchen to a patient’s room. At each point in time, the computer must decide whether to move the robot forward, left, right, or backward. Each decision changes the location of the robot, so that the next decision will depend on previous decisions. After each decision, the environment provides a real-valued reward. For example, the robot may receive a positive reward for delivering a meal to the correct patient and a negative reward for bumping into walls. The goal of the robot is to choose sequences of actions to maximize its long-term reward. This is very different from the standard supervised learning task, where each classification decision is completely independent of other decisions.

The final learning task we will discuss is unsupervised learning, where the computer is given a collection of objects and is asked to construct a model to explain the observed properties of these objects. No teacher provides desired outputs or rewards. For example, given a collection of astronomical objects, the learning system should group the objects into stars, planets, and galaxies and describe each group by its electromagnetic spectrum, distance from earth, and so on.

Although often called “cluster analysis” (Everitt 1993), unsupervised learning arises in a much wider range of tasks. Indeed, there is no single agreed-upon definition of unsupervised learning, but one useful formulation views unsupervised learning as density estimation. Define a probability distribution P(X) to be the probability that an object X will be observed. The goal of unsupervised learning is to find this probability distribution, given a sample of the Xs. This is typically accomplished by defining a family of possible stochastic models and choosing the model that best accounts for the data.

For example, one might model each group of astronomical objects as having a spectrum that is a multivariate normal distribution (centered at a “typical” spectrum). The probability distribution P(X) describing the whole collection of astronomical objects could then be modeled as a mixture of normal distributions, one distribution for each group of objects. The learning process determines the number of groups and the mean and covariance matrix of each multivariate distribution. The Autoclass program discovered a new class of astronomical objects in just this way (Cheeseman et al. 1988).

Another widely used unsupervised learning model is the HIDDEN MARKOV MODEL (HMM). An HMM is a stochastic finite-state machine that generates strings. In speech recognition applications, the strings are speech signals, and one HMM is trained for each word in the vocabulary (Rabiner 1989).

Given a new speech signal, we can determine which HMM is most likely to have generated it, and thereby guess which word was spoken. A general algorithm schema for training both HMMs and mixture models is the expectation maximization (EM) algorithm (Dempster, Laird, and Rubin 1976).

See also COMPUTATIONAL LEARNING THEORY; DECISION MAKING; EXPLANATION-BASED LEARNING; INDUCTIVE LOGIC PROGRAMMING; LEARNING; PATTERN RECOGNITION AND LAYERED NETWORKS; RATIONAL DECISION MAKING; SPEECH RECOGNITION IN MACHINES; STATISTICAL LEARNING THEORY

Tom Dietterich

References

Breiman, L., J. H. Friedman, R. A. Olshen, and C. J. Stone. (1984). Classification and Regression Trees. Monterey, CA: Wadsworth.
Cheeseman, P., J. Kelly, M. Self, J. Stutz, W. Taylor, and D. Freeman. (1988). Autoclass: A Bayesian classification system. Proceedings of the Fifth International Conference on Machine Learning. San Francisco: Kaufmann, pp. 54–64.
Cleveland, W. S., and S. J. Devlin. (1988). Locally-weighted regression: An approach to regression analysis by local fitting. Journal of the American Statistical Association 83: 596–610.
Cohen, W. W. (1995). Fast effective rule induction. In Proceedings of the Twelfth International Conference on Machine Learning. San Francisco: Kaufmann, pp. 115–123.
Dasarathy, B. V., Ed. (1991). Nearest Neighbor (NN) Norms: NN Pattern Classification Techniques. Los Alamitos, CA: IEEE Computer Society Press.
Dempster, A. P., N. M. Laird, and D. B. Rubin. (1976). Maximum likelihood from incomplete data via the EM algorithm. Proceedings of the Royal Statistical Society B39: 1–38.
Everitt, B. (1993). Cluster Analysis. London: Halsted Press.
Hastie, T. J., and R. J. Tibshirani. (1990). Generalized Additive Models. London: Chapman and Hall.
Quinlan, J. R. (1993). C4.5: Programs for Empirical Learning. San Francisco: Kaufmann.
Rabiner, L. R. (1989). A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE 77(2): 257–286.

Further Readings

Bishop, C. M. (1996). Neural Networks for Pattern Recognition. Oxford: Oxford University Press.
Mitchell, T. M. (1997). Machine Learning. New York: McGraw-Hill.

Machine Translation

Machine translation (MT), which celebrated its fiftieth anniversary in 1997, uses computers to translate texts written in one human language, such as Spanish, into another human language, such as Ukrainian. In the ideal situation, sometimes abbreviated as FAHQMT (for Fully Automated High Quality Machine Translation), the computer program produces fully automatic, high-quality translations of text. Programs that assist human translators are called “machine-aided translators” (MATs).

MT is the intellectual precursor to the field of COMPUTATIONAL LINGUISTICS (also called NATURAL LANGUAGE PROCESSING), and shares interests with computer science (artificial intelligence), linguistics, and occasionally anthropology. Machine translation dates back to the work of Warren Weaver (1955), who suggested applying ideas from cryptography, as employed during World War II, and information theory, as outlined in 1947 by Claude Shannon, to language processing. Not surprisingly, the first large-scale MT project was funded by the U.S. government to translate Russian Air Force manuals into English. After an initial decade of naive optimism, the ALPAC (for Automatic Language Processing Advisory Committee) report (Pierce et al. 1966), issued by a government-sponsored study panel, put a damper on research in the United States for many years. Research and commercial development continued largely in Europe and after 1970 also in Japan.

Today, over fifty companies worldwide produce and sell translation by computer, whether as translation services to outsiders, as in-house translation bureaus, or as providers of on-line multilingual chat rooms. Some 250 of the world’s most widely spoken languages have been translated, at least in pilot systems. By some estimates, expenditures for MT in 1989 exceeded $20 million worldwide, involving 200–300 million pages per year (Wilks 1992).

Translation is not easy: even humans find it hard to translate complex texts such as novels. Current technology produces output whose quality level varies from perfect (for very circumscribed domains with just a few hundred words) to hardly readable (for unrestricted domains requiring lexicons of a quarter million words or more). Research groups continue to investigate unsolved issues. Recent large government-sponsored research collaborations include CICC in Japan (Tsujii 1990), Eurotra in Europe (Johnson, King, and des Tombes 1985), and the DARPA MT effort in the United States (White et al. 1992–1994). (For reviews of the history, theory, and applications of MT, see Hutchins and Somers 1992; Nirenburg et al. 1992; and Hovy 1993; useful collections of papers can be found in AMTA 1996, 1994; CL 1985.)

Before producing their output, all MT systems perform some analysis of the input text. The degree of analysis largely determines what type of translation is being performed, and what the average output quality is. Generally, the more refined or “deeper” the analysis, the better the output quality. Three major levels of analysis are traditionally recognized:

1. Direct replacement. The simplest systems perform very little analysis of the input, essentially replacing source language (input) words with equivalent target language (output) words, inflected as necessary for tense, number, and so on. When the source and target languages are fairly similar in structure and word use, as between Italian, Spanish, and French, this approach can produce surprisingly understandable results. However, as soon as the word order starts to differ (say, if the verb appears at the end of the sentence, as in Japanese), then some syntactic analysis is required. Modern research using this approach has focused on the semiautomated construction of large word and phrase correspondence “tables,” extracting this information from human translations as examples (Nirenburg, Beale, and Domashnev 1994) or using statistics (Brown et al. 1990, 1993).

2. Transfer. In order to produce grammatically appropriate translations, syntactic transfer systems include so-called parsers, transfer modules, and realizers or generators (see NATURAL LANGUAGE GENERATION). A parser is a computer program that accepts a natural language sentence as input and produces a parse tree of that sentence as output. A transfer module applies transfer rules to convert the source parse tree into a tree conforming to the requirements of the target language grammar, for example by shifting the verb from the end of the sentence (Japanese) to the second position (English). A realizer then converts the target tree back into a linear string of words in the target language, inflecting them as required. (For more details, see Kinoshita, Phillips, and Tsujii 1992; Somers et al. 1988; and Nagao 1987.)

Unfortunately, syntactic analysis is often not enough. Effective translation may require the system to “understand” the actual meaning of the sentence. For example, “I am small” is expressed in many languages using the verb “to be,” but “I am hungry” is often expressed using the verb “to have,” as in “I have hunger.” For the translation system to handle such cases (and their more complex variants), it needs to have information about the meaning of size, hunger, and so on (see SEMANTICS). Often such information is represented in case frames, small collections of attributes and their values. The translation system then requires an additional analysis module, usually called the “semantic analyzer,” additional (semantic) transfer rules, and additional rules for the realizer. The semantic analyzer produces a case frame from the syntax tree, and the transfer module converts the case frame derived from the source language sentence into the case frame required for the target language.

3. Interlinguas. Although transfer systems are common, they require a great deal of effort to build, because a distinct set of transfer rules must be created for each language pair in each direction (for N languages, one needs about N² pairs of rule sets). The solution is to create a single intermediate representation scheme to capture the language-neutral meaning of any sentence in any language. Then only 2N sets of mappings are required: from each language into the interlingua and back out again.

This idea appeals to many. Despite numerous attempts, however, it has never yet been achieved on a large scale; all interlingual MT systems to date have been at the level of demonstrations (a few hundred lexical items) or prototypes (a few thousand). A great deal has been written about interlinguas, but no clear methodology exists for determining exactly how one should build a true language-neutral meaning representation, if such a thing is possible at all (Nirenburg et al. 1992; Dorr 1994; Hovy and Nirenburg 1992).

Machine translation applications are classified into two traditional types and one more recent type:

1. Assimilation. People interested in tracking the multilingual world use MT systems for assimilation, that is, to produce (rough) translations of many externally created documents, from which they then select the ones of interest (and then possibly submit them for more refined, human translation). Typical users are commercial and government staff who monitor developments in areas of interest. The desired output quality need not be very high, but the MT system should cover a large domain, and it should be fast.

2. Dissemination. People wishing to disseminate their own documents to the world in various languages use MT systems to produce the translations. Typical users are manufacturers such as Caterpillar, Honda, and Fujitsu. In this case, the desired output quality should be as high as possible, but the system need cover only the application domain, and speed is not generally a consideration.

3. Interaction. People wanting to converse with others in foreign countries via E-mail or chat rooms use MT systems on-line to translate their messages. Typical users are chat room participants and business travelers setting up meetings and reserving hotel rooms. CompuServe currently supports MT for some of its highly popular chat rooms at the cost of one cent per word. The desired output quality should be as high as possible, given the requirements of system speed and relatively broad coverage.

A great deal of effort has been devoted to the evaluation of MT systems (see White et al. 1992–1994; AMTA 1992; Nomura 1992; Church and Hovy 1993; King and Falkedal 1990; Kay 1980; and Van Slype 1979). No single measure can capture all the aspects of a translation system. While, from the ultimate user’s point of view, the major dimensions will probably be cost, output quality, range of coverage, and degree of automation, numerous more specific evaluation metrics have been developed. These range from system-internal aspects such as number of grammar rules and treatment of multisentence phenomena to user-related aspects such as the ability to extend the lexicon and the quality of the system’s interface.

See also SPEECH RECOGNITION IN MACHINES; STATISTICAL TECHNIQUES IN NATURAL LANGUAGE PROCESSING; SYNTAX

Eduard Hovy

References

AMTA (Association for Machine Translation in the Americas). (1992). MT Evaluation: Basis for Future Directions. San Diego, CA.
AMTA. (1994). Proceedings of the Conference of the AMTA. Columbia, MD.
AMTA. (1996). Proceedings of the Conference of the AMTA. Montreal, CAN.
Brown, P. F., J. Cocke, S. A. Della Pietra, V. J. Della Pietra, F. Jelinek, J. D. Lafferty, R. L. Mercer, and P. S. Roossin. (1990). A statistical approach to machine translation. Computational Linguistics 16(2): 79–85.
Brown, P. F., S. Della Pietra, V. Della Pietra, and R. Mercer. (1993). The mathematics of statistical machine translation: Parameter estimation. Computational Linguistics 19(2): 263–311.
Church, K. W., and E. H. Hovy. (1993). Good applications for crummy machine translation. Machine Translation 8: 239–258.
CL (Computational Linguistics). (1985). Special issues on machine translation. Vol. 11, nos. 2–3.
Dorr, B. J. (1994). Machine translation divergences: A formal description and proposed solution. Computational Linguistics 20(4): 597–634.
Hovy, E. H. (1993). How MT works. Byte. January: 167–176. Special feature on machine translation.
Hovy, E. H., and S. Nirenburg. (1992). Approximating an interlingua in a principled way. In Proceedings of the DARPA Speech and Natural Language Workshop, New York: Arden House.
Hutchins, W. J., and H. Somers. (1992). An Introduction to Machine Translation. San Diego: Academic Press.
Johnson, R., M. King, and L. des Tombes. (1985). Eurotra: A multilingual system under development. Computational Linguistics 11(2–3): 155–169.
Kay, M. (1980). The proper place of men and machines in language translation. XEROX PARC Research Report CSL-80-11. Palo Alto, CA: Xerox Parc.
King, M., and K. Falkedal. (1990). Using test suites in evaluation of machine translation systems. In Proceedings of the Eighteenth COLING Conference, vol. 2, pp. 435–447.
Kinoshita, S., J. Phillips, and J. Tsujii. (1992). Interactions between structural changes in machine translation. In Proceedings of the Twentieth COLING Conference, pp. 679–685.
Nagao, M. (1987). Role of structural transformation in a machine translation system. In S. Nirenburg, Ed., Machine Translation: Theoretical and Methodological Issues. Cambridge: Cambridge University Press, pp. 262–277.
Nirenburg, S., S. Beale, and C. Domashnev. (1994). A full-text experiment in example-based machine translation. In Proceedings of the International Conference on New Methods in Language Processing, pp. 95–103.
Nirenburg, S., J. C. Carbonell, M. Tomita, and K. Goodman. (1992). Machine Translation: A Knowledge-Based Approach. San Mateo, CA: Kaufmann.
Nomura, H. (1992). JEIDA Methodology and Criteria on Machine Translation Evaluation (JEIDA Report). Tokyo: Japan Electronic Industry Development Association.
Pierce, J. R., J. B. Carroll, E. P. Hamp, D. G. Hays, C. F. Hockett, A. G. Oettinger, and A. Perlis. (1966). Computers in Translation and Linguistics (ALPAC Report). National Academy of Sciences/National Research Council Publication 1416. Washington, DC: NAS Press.
Somers, H., H. Hirakawa, S. Miike, and S. Amano. (1988). The treatment of complex English nominalizations in machine translation. Computers and Translation (now Machine Translation) 3(1): 3–22.
Tsujii, Y. (1990). Multi-language translation system using interlingua for Asian languages. In Proceedings of International Conference Organized by IPSJ for its Thirtieth Anniversary.
Van Slype, G. (1979). Critical Study of Methods for Evaluating the Quality of Machine Translation. Prepared for the European Commission Directorate on General Scientific and Technical Information and Information Management. Report BR 19142. Brussels: Bureau Marcel van Dijk.
White, J., and T. O’Connell. (1992–1994). ARPA Workshops on Machine Translation. Series of four workshops on comparative evaluation. McLean, VA: Litton PRC Inc.
Wilks, Y. (1992). MT contrasts between the U.S. and Europe. In J. Carbonell, E. Rich, D. Johnson, M. Tomita, M. Vasconcellos, and Y. Wilks, Eds., JTEC Panel Report. Commissioned by DARPA and Japanese Technology Evaluation Center.
Weaver, W. (1955). Translation. In W. N. Locke and A. D. Booth, Eds., Machine Translation of Languages. Cambridge, MA: MIT Press.

Further Readings

TMI. (1995). Proceedings of the Conference on Theoretical and Methodological Issues in Machine Translation. Leuven, Belgium.
TMI. (1997). Proceedings of the Conference on Theoretical and Methodological Issues in Machine Translation. Santa Fe, NM.
Whorf, B. L. (1956). Language, Thought, and Reality: Selected Writings of Benjamin Lee Whorf. J. B. Carroll, Ed. Cambridge, MA: MIT Press.

Machine Vision

Machine vision is an applied science whose objective is to take two-dimensional (2-D) images as input and extract information about the three-dimensional (3-D) environment adequate for tasks typically performed by humans using vision. These tasks fall into four broad categories:

1. Reconstruction. Examples are building 3-D geometric models of an environment, determining spatial layout by finding the locations and poses of objects, and estimating surface color, reflectance, and texture properties.

2. Visually guided control of locomotion and manipulation. Locomotion tasks include navigating a robot around obstacles or controlling the speed and direction of a car driving down a freeway. Manipulation tasks include reaching, grasping, and insertion operations (see MANIPULATION AND GRASPING).

3. Spatiotemporal grouping and tracking. Grouping is the association of image pixels into regions corresponding to single objects or parts of objects. Tracking is matching these groups from one time frame to the next. Grouping is used in the segmentation of different kinds of tissues in an ultrasound image or in traffic monitoring to distinguish and track individual vehicles.

4. Recognition of objects and activities. Object recognition tasks include determining the class of particular objects that have been imaged (“This is a face”) and recognizing specific instances such as faces of particular individuals (“This is Nixon's face”). Activity recognition includes identifying gaits, expressions, and gestures. (See VISUAL OBJECT RECOGNITION, AI; Ullman 1996 provides a book-length account.)

Reconstruction Tasks

The most basic fact about vision, whether machine or human, is that images are produced by perspective projection. Consider a coordinate system with origin at the optical center of a camera whose optical axis is aligned along the Z axis. A point P with coordinates (X, Y, Z) in the scene gets imaged at the point P', with image plane coordinates (x, y), where

\[ x = -\frac{fX}{Z}, \qquad y = -\frac{fY}{Z}, \]

and f is the distance from the optical center of the camera to the image plane. All points in the 3-D world that lie on a ray passing through the optical center are mapped to the same point in the image. During reconstruction, we seek to recover the 3-D information lost during perspective projection.

Many cues are available in the visual stimulus to make this possible, including structure from motion, binocular stereopsis, texture, shading, and contour. Each of these relies on background assumptions about the physical scene (Marr 1982).

The cues of stereopsis and structure from motion rely on the presence of multiple views, either acquired simultaneously from multiple cameras or over time from a single camera during the relative motion of objects. When the projections of a sufficient number of points in the world are observed in multiple images, it is theoretically possible to deduce the 3-D locations of the points as well as of the cameras (Faugeras 1993; for further discussion of the mathematics, see STEREO AND MOTION PERCEPTION).

Shape can be recovered from visual TEXTURE, a spatially repeating pattern on a surface such as windows on a building, spots on a leopard, or pebbles on a beach. If the arrangement is periodic, or at least statistically regular, it is possible to recover surface orientation and shape from a single image (Malik and Rosenholtz 1997). While the sizes, shapes, and spacings of the texture elements (texels) are roughly uniform in the scene, the projected size, shape, and spacing in the image vary, principally because

1. Distances of the different texels from the camera vary. Recall that under perspective projection, distant objects appear smaller. The scaling factor is 1/Z.

2. Foreshortening of the different texels varies. This is related to the orientation of the texel relative to the line of sight of the camera. If the texel is perpendicular to the line of sight, there is no foreshortening. The magnitude of the foreshortening effect is proportional to cos σ, where σ is the angle between the surface normal and the ray from the viewer.

Expressions can be derived for the rate of change of various image texel features, for example, area, foreshortening, and density (GIBSON's texture gradients), as functions of surface shape and orientation. One can then estimate the surface shape, slant, and tilt that would give rise to the measured texture gradients.

Shading (spatial variation in the image brightness) is determined by the spatial layout of the scene surfaces, their reflectance properties, and the arrangement of light sources. If one neglects interreflections (the fact that objects are illuminated not just by light sources but also by light reflected from other surfaces in the scene), then the shading pattern is determined by the orientation of each surface patch with respect to the light sources. For a diffusely reflecting surface, the brightness of the patch varies as the cosine of the angle between the surface normal and the light source direction. A number of techniques have been developed that seek to invert the process, that is, to recover the surface orientation and shape giving rise to the observed brightness pattern (Horn and Brooks 1989).

Humans can perceive 3-D shape from line drawings, which suggests that useful information can be extracted from the projected image of the contour of an object (Koenderink 1990). It is easiest to do this for objects that belong to parametrized classes of shapes, such as polyhedra or surfaces of revolution, for which the ambiguity resulting from perspective projection can be resolved by considering only those scene configurations that satisfy the constraints appropriate to the particular class of shapes.

Finally, it should be noted that shape and spatial layout are only some of the scene characteristics that humans can infer from images. Surface color, reflectance, and texture are also perceived simultaneously. In machine vision, there has been some work in this direction. For example, attempts have been made to solve the color constancy problem, that is, to estimate true surface color, given that the apparent color in the image is determined both by the surface color and the spectral distribution of the illuminant.

Visually Guided Control

One of the principal uses of vision is to provide information for manipulating objects and guiding locomotion. Consider the use of vision in driving on a freeway. A driver needs to

1. Keep moving at a reasonable speed.
2. Control the lateral position of the vehicle in its lane: make sure it stays in the center and is oriented properly.
3. Control the longitudinal position of the vehicle: keep a safe distance from the vehicle in front of it.

The lateral and longitudinal control tasks do not require a complete reconstruction of the environment. For instance, lateral control of the car only requires the following information: the position of the car relative to the left and right lane markers, its orientation relative to the lanes, and the curvature of the upcoming road. A feedback control law can be designed using these measurements and taking into account the dynamics of the car. Several research groups (e.g., Dickmanns and Mysliwetz 1992) have demonstrated vision-based automated driving.

For dynamic tasks, it is important that measurements can be integrated over time to yield better estimates; Kalman filtering provides one formalism. Often the motion of the sensing device is known (perhaps because it has been commanded by the agent) and estimation of relevant scene properties can be made even more robust by exploiting this knowledge.

It is worth noting that even a partial reconstruction of scene information, as suggested above, may not be necessary. Lateral control could be achieved by feedback directly on image (as opposed to scene) measurements. Just steer so that the left and right lane markers are seen by the forward pointing camera in a symmetric position with respect to the center of the image. For the more general task of navigation around obstacles, other variables computable from the optical flow field have been proposed.

Grouping and Tracking

Humans have a remarkable ability to organize their perceptual input: instead of a collection of values associated with individual photoreceptors, we perceive a number of visual groups, usually associated with objects or well-defined parts of objects. This ability is equally important for machine vision. To recognize objects, we must first separate them from their backgrounds. Monitoring and surveillance applications require the ability to detect individual objects, and track them over time. Tracking can be viewed as grouping in the temporal dimension.

Most machine vision techniques for grouping and tracking can be viewed as attempts to construct algorithmic implementations of various grouping factors studied in the context of humans under the rubric of GESTALT PERCEPTION. For instance, the Gestaltists listed similarity as a major grouping factor: humans readily form groups from parts of an image that are uniform in color, such as a connected red patch, or uniform in texture, such as a plaid region. Computationally, this has motivated edge detection, a technique based on marking boundaries where neighboring pixels have significant differences in brightness or color. If we look for differences in texture descriptors of image patches, suitably defined, we can find texture edges.

Similarity is only one of the factors that can promote grouping. Good continuation suggests linking edge segments that have directions consistent with being part of a smoothly curving extended contour. Relaxation methods and dynamic programming approaches have been proposed to exploit this factor.

Earlier work in machine vision was based primarily on local methods, which make decisions about the presence of boundaries purely on the information in a small neighborhood of an image pixel. Contemporary efforts aim to make use of global information. A number of competing formalisms, such as Markov random fields (Geman and Geman 1984), layer approaches (Wang and Adelson 1994) based on the expectation maximization technique from statistics, and cut techniques drawn from spectral graph theory (Shi and Malik 1997) are being explored. Some of these allow for the combined use of multiple grouping factors such as similarity in brightness as well as common motion.

The temporal grouping problem, visual tracking, lends itself well to the Kalman filtering formalism for dynamic estimation. At each frame, the position of a moving object is estimated by combining measurements from the current time frame with the predicted position from previous data. Generalizations of this idea have also been developed (see Isard and Blake 1996).

See also COLOR VISION; COMPUTATIONAL VISION; STRUCTURE FROM VISUAL INFORMATION SOURCES; SURFACE PERCEPTION; OBJECT RECOGNITION, HUMAN NEUROPSYCHOLOGY; VISION AND LEARNING

Jitendra Malik

References

Dickmanns, E. D., and B. D. Mysliwetz. (1992). Recursive 3-D road and relative ego-state recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 14: 199–213.
Faugeras, O. (1993). Three-Dimensional Computer Vision: A Geometric Viewpoint. Cambridge, MA: MIT Press.
Geman, S., and D. Geman. (1984). Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence 6: 721–741.
Horn, B. K. P., and M. J. Brooks. (1989). Shape from Shading. Cambridge, MA: MIT Press.
Isard, M., and A. Blake. (1996). Contour tracking by stochastic propagation of conditional density. In B. Buxton and R. Cipolla, Eds., Proceedings of the Fourth European Conference on Computer Vision (ECCV 1996), Cambridge. Berlin: Springer, vol. 1, pp. 343–356.
Koenderink, J. J. (1990). Solid Shape. Cambridge, MA: MIT Press.
Malik, J., and R. Rosenholtz. (1997). Computing local surface orientation and shape from texture for curved surfaces. International Journal of Computer Vision 23(2): 149–168.
Shi, J., and J. Malik. (1997). Normalized cuts and image segmentation. In Proceedings of the 1997 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Juan, Puerto Rico, pp. 731–737.
Ullman, S. (1996). High-Level Vision: Object Recognition and Visual Cognition. Cambridge, MA: MIT Press.
Wang, J. Y. A., and E. H. Adelson. (1994). Representing moving images with layers. IEEE Transactions on Image Processing 3(5): 625–638.

Further Readings

Haralick, R. M., and L. G. Shapiro. (1992). Computer and Robot Vision. 2 vols. Reading, MA: Addison-Wesley.
Horn, B. K. P. (1986). Robot Vision. Cambridge, MA: MIT Press.
Marr, D. (1982). Vision. San Francisco: Freeman.
Nalwa, V. S. (1993). A Guided Tour of Computer Vision. Reading, MA: Addison-Wesley.
Trucco, E., and A. Verri. (1998). Introductory Techniques for 3-D Computer Vision. Englewood Cliffs, NJ: Prentice-Hall.

Machines and Cognition

See INTRODUCTION: COMPUTATIONAL INTELLIGENCE; INTRODUCTION: PHILOSOPHY

Magic and Superstition

We generally call something magical or superstitious if it involves human agency (as distinct from religion), and invokes causes inconsistent with current understandings, by relevant “experts” (e.g., Western scientists), of how the world operates.

Magical beliefs and activities have been understood by historians (Thomas 1971), psychologists (Freud 1950; Piaget 1929), and anthropologists (Frazer 1890; Mauss 1902) as primitive attempts to understand and control the world. Some of these authors posit an “evolutionary” or developmental course, with magic replaced over historical time, first by religion and ultimately by science. This hypothesized progression cannot account for the high incidence of magical beliefs in educated late twentieth century adults in the developed world, in the face of the great advances of science in the twentieth century.

Why do these beliefs (and actions) persist? The adaptive human tendency to understand, control, and make meaning out of occurrences in the world probably lies at the heart of magic and religion. Reliance by scientific explanations on impersonal forces and random events fails to satisfy the human mind, which is inclined to personalize and humanize accounts of events in the world. The pervasiveness of magical beliefs can probably be attributed to three causes: (1) this type of thinking is natural and intuitive for the human mind (though some, e.g., Sperber 1985, propose that ideas such as this may survive because they strikingly depart from intuition and expectations); (2) magical thinking often makes moderately accurate predictions; and (3) a major function of magical acts and rituals is performative (Tambiah 1990).

Why do magical beliefs take the particular form they do? They seem heavily guided by natural intuitions about the nature of the world, and are selected for their ability to give satisfying (including anxiety-reducing) as much as accurate accounts. These intuitions may derive from primary process thought (FREUD), failure to distinguish the self from the world (PIAGET), preprogrammed or readily acquired cognitive heuristics, or the very nature of symbolic thinking.

Since some “magical” beliefs have proven true over time (e.g., the folk belief in mid-19th-century America that cholera was contagious), and since scientific knowledge is by definition open to continuous revision, we focus on the form rather than the accuracy of magical beliefs. We examine two principles that are widespread, specific, and relatively well studied. These are two of the three “laws of sympathetic magic,” originally described as aspects of the “primitive” mind by Tylor (1879), Frazer (1890), and Mauss (1902). The laws were conceived as basic features of human thought, projected onto the world, leading to beliefs that things associated or symbolically related in the mind actually go together, and may be causally linked, in the world (see Rozin and Nemeroff 1990; Nemeroff and Rozin 1998; Tambiah 1990 for reviews).

The law of similarity has been summarized as “like produces like,” “like goes with like,” or “the image equals the object.” Likeness is elevated to a basic, often causal principle; the simplest example confounds likeness with identity, hence “appearance equals reality.” The adaptive value of this law is clear: generally speaking, if it looks like a tiger, it is a tiger. For humans, this law becomes problematic because humans make artifacts that are imitations of entities in the world, as in drawings or photographs, or more abstractly, the words that represent them. A picture of a tiger does not justify fear. Similarity functions in nonhumans and in young children (presumably from birth); one feature of development is learning about situations in which appearance does not correspond to reality.

Examples of similarity include burning effigies of persons in order to cause harm to them, or reliance on appearance in judging objects when the appearance is known to be deceiving (e.g., avoidance by educated adults of chocolate shaped to look like feces; or difficulty experienced in throwing darts at a picture of a respected person). In the domain of words, Piaget (1929) described as “nominal realism” the child’s difficulty in understanding the arbitrary relation of word and referent. Similarly, educated people have difficulty disregarding a label on a bottle (e.g., “poison”) that they know does not apply.

The law of contagion holds that when two objects come into even brief physical contact, properties are permanently transmitted between them (“once in contact, always in contact”). Contagion typically flows from a valenced source (e.g., a detested or favorite person), often through a vehicle (e.g., clothing or food) to a target, usually a person. Traditional examples include the idea that food or objects that have been in contact with death, disease, or enemies will cause harm if eaten or contacted, or that damage done to a separated part of a person (e.g., a piece of hair) will damage that person (sorcery).

Contagion has clear adaptive value by reducing the risk of transmitting microbes. It is closely associated with the emotion of disgust (disgusting entities are contaminating, i.e., negative contagion), and may have originated in the food context. On the positive side, contagion provides a concrete representation of kinship as shared blood, and may serve as the proximal means to induce kin preferences.

Contagion, in opposition to similarity, holds that things are often not what they appear to be, since they bear invisible “traces” of their histories. Consistent with this sophistication, contagion seems to be absent in young children and all nonhumans. However, contagion is probably present in all normal adult humans.

Magical contagion is shown by educated adults, who, for

AND PRACTICES; SCIENTIFIC THINKING AND ITS DEVELOPMENT

Paul Rozin and Carol Nemeroff

References

Frazer, J. G. (1890/1959). The Golden Bough: A Study in Magic and Religion. New York: Macmillan. (Reprint of 1922 abr. ed. T. H. Gaster; original work published 1890.)
Freud, S. (1950). Totem and Taboo: Some Points of Agreement between the Mental Lives of Savages and Neurotics. Translated by J. Strachey. New York: W. W. Norton. (Original work published 1913.)
example, reject a preferred beverage after brief contact with Mauss, M. (1902/1972). A General Theory of Magic. Translated by a dead cockroach (Rozin and Nemeroff 1990). Western sub- R. Brain. New York: W. W. Norton. (Original work published jects in situations such as these generally attribute their 1902: Esquisse d’une theorie generale de la magie. L’Annee aversion to health risks; however, they quickly realize that Sociologique 1902–1903.) this account is insufficient when their aversion remains after Meigs, A. (1994). Food, Sex, and Pollution: A New Guinea Reli- the contaminant has been rendered harmless (e.g., steril- gion. New Brunswick, NJ: Rutgers University Press. ized). Magical thinking often exposes such “head vs. heart” Nemeroff, C., and P. Rozin. (Forthcoming). The makings of the conflicts. magical mind. In K. Rosengren, C. Johnson, and P. Harris, Other examples of contagion in everyday life include Eds., Imagining the Impossible: The Development of Magical, Scientific, and Religious Thinking in Contemporary Society. celebrity token hunting, valuing of family heirlooms, and Oxford: Oxford University Press. the reluctance of many individuals to share or buy used Nemeroff, C. and P. Rozin. (1994). The contagion concept in adult clothing. Sources capable of producing positive contagion thinking in the United States: Transmission of germs and inter- (“transvaluation”) include loved ones and celebrities. Those personal influence. Ethos. The Journal of Psychological capable of producing substantial aversions include virtually Anthropology 22: 158–186. any disgusting substance, and a wide variety of people; even Piaget, J. (1929/1967). The Child’s Conception of the World. unknown healthy others contaminate for most persons, and Totawa, NJ: Littlefield and Adams. (Original work published contamination is enhanced if the person is described as ill or 1929.) morally tainted. Rozin, P., and C. J. Nemeroff. (1990). 
The laws of sympathetic Some properties of contagion (Rozin and Nemeroff magic: A psychological analysis of similarity and contagion. In J. Stigler, G. Herdt, and R. A. Shweder, Eds., Cultural Psychol- 1990) across situations and cultures are: (1) Physical contact ogy: Essays on Comparative Human Development. Cambridge, is either definitionally necessary or almost always present; UK: Cambridge University Press, pp. 205–232. (2) Effects are relatively permanent; (3) Even very brief Sperber, D. (1985). Anthropology and psychology. Towards an contact with any part of the source produces almost the full epidemiology of representations. Man 20: 73–89. effect (dose and route insensitivity); (4) Negative contagion Tambiah, S. J. (1990). Magic, Science, Religion, and the Scope of is more widespread and powerful than positive (negativity Rationality. Cambridge, UK: Cambridge University Press. dominance); (5) Properties passed may be physical or men- Thomas, K. (1971). Religion and the Decline of Magic. London: tal, including intentions and “luck”; (6) Contagion can oper- Weidenfeld and Nicolson. ate in a “backward” direction, with effects flowing from Tylor, E. B. (1871/1974). Primitive Culture: Researches into the recipient or vehicle back on to the source (as when one Development of Mythology, Philosophy, Religion, Art and Custom. New York: Gordon Press. (Original work published attempts to harm someone by burning a lock of their hair). 1871.) The contagious entity or “essence” may be mentally rep- resented in at least three ways (depending on culture, nature Further Readings of the source, and individual within-culture differences). One is pure association (which does not entail contact, and Boyer, P. (1995). Causal understandings in cultural representa- can be thought of as an artifactual account of contagion), a tions: Cognitive constraints on inferences from cultural input. second is the passage of a material-like essence, and a third In D. Sperber, D. Premack, and A. 
J. Premack, Eds., Causal is the passage of a spiritual, nonmaterial essence (Nemeroff Cognition: A Multidisciplinary Debate. Oxford: Clarendon and Rozin 1994). Press, pp. 615–644. Evans-Pritchard, E. E. (1976). Witchcraft, Oracles and Magic Magical thinking varies substantially in quality and among the Azande. Oxford: Oxford University Press. (Original quantity across cultures, lifetimes, and history, as well as work published 1937.) among adults within a culture. Contagion, particularly via Horton, R. (1967). African traditional thought and Western sci- contact with those perceived as undesirable, is omnipresent, ence. Africa 37(1–2): 50–71, 155–187. and is potentially crippling. While this type of interpersonal Humphrey, N. (1996). Leaps of Faith. New York: Basic Books. contagion is universal, in Hindu India and many of the cul- Nemeroff, C., A. Brinkman, and C. Woodward. (1994). Magical tures in Papua, New Guinea (Meigs 1984), it is especially cognitions about AIDS in a college population. AIDS Educa- salient in daily life, and has an overt moral significance. tion and Prevention 6: 249–265. See also CULTURAL SYMBOLISM; CULTURAL VARIATION; Rosengren, K., C. Johnson, and P. Harris, Eds. (Forthcoming). Imagining the Impossible: The Development of Magical, Scien- ESSENTIALISM; LÉVI-STRAUSS, CLAUDE; RELIGIOUS IDEAS Magnetic Resonance Imaging 505 any one of three orthogonal directions. Contrast is achieved tific, and Religious Thinking in Contemporary Society. Oxford: Oxford University Press. either based on the regional density of the nuclear species Rozin, P., M. Markwith, and C. R. McCauley. (1994). The nature imaged or by the impact of the chemical and biological of aversion to indirect contact with other persons: AIDS aver- environment on parameters that determine the behavior or sion as a composite of aversion to strangers, infection, moral relaxation of a population of spins from a nonequilibrium taint and misfortune. 
Journal of Abnormal Psychology 103: state toward thermal equilibrium. Improving contrast and 495–504. spatial encoding strategies are central to developments in Shweder, R. A. (1977). Likeness and likelihood in everyday MRI as the field strives to image faster, with higher resolu- thought: magical thinking in judgments about personality. Cur- tion and structural detail, and image not only anatomy but rent Anthropology 18: 637–658. also physiological processes such as blood flow, perfusion, Siegel, M., and D. L. Share. (1990). Contamination sensitivity in organ function, and intracellular chemistry. young children. Developmental Psychology 26: 455–458. An avidly pursued new dimension in the acquisition of physiological and biochemical information with MRI is Magnetic Fields mapping human brain function, referred to as fMRI. The first fMRI image of the human brain was based on measure- See ments of task-induced blood volume change assessed with ELECTROPHYSIOLOGY, ELECTRIC AND MAGNETIC intravenous bolus injection of an MRI contrast agent, a EVOKED FIELDS highly paramagnetic substance, into the human subject and tracking the bolus passage through the brain with consecu- Magnetic Resonance Imaging tive, rapidly acquired images (Belliveau et al. 1991). How- ever, this method was quickly rendered obsolete with the Magnetic resonance imaging (MRI) is based on the phe- introduction of totally noninvasive methods of fMRI. Of the nomenon of nuclear magnetic resonance (NMR), first two current ways of mapping alterations in neuronal activa- described in landmark papers over fifty years ago (Rabi et tion noninvasively, the most commonly used method relies al. 1938; Rabi, Millman, and Kusch 1939; Purcell et al. on the weak magnetic interactions between the nuclear 1945; Bloch, Hansen, and Packard 1946). 
In the presence of spins of water protons in tissue and blood, and the paramag- an external magnetic field, atomic nuclei with magnetic netic deoxyhemoglobin molecule, termed BOLD (blood moments, such as 1H, 13C, and 31P nuclei, encounter a sepa- oxygen level–dependent) contrast, first described for the ration in the energy levels of their quantum mechanically brain by Ogawa (Ogawa et al. 1990a, 1990b; Ogawa and allowed orientations relative to the external field. Transi- Lee 1990) and is similar to the effect described for blood tions between these orientations can be induced with elec- alone by Thulborn et al. (1982). The presence of paramag- tromagnetic radiation typically in the radiofrequency range. netic deoxyhemoglobin, compartmentalized in red blood The discrete frequency associated with such a transition is cells and in blood vessels, generates local magnetic field proportional to the external magnetic field strength and to inhomogeneities surrounding these compartments which are two parameters that are determined by the intrinsic proper- dynamically (due to rapid diffusion) or statically averaged ties of the nucleus and its chemical environments, the over the smallest volume element in the image and lead to nuclear gyromagnetic ratio, and the chemical shift, respec- signal loss when a delay is introduced between signal exci- tively. Based on the ability to obtain discrete resonances tation and subsequent sampling. sensitive to the chemical environment, NMR has evolved In the original papers describing the BOLD effect, func- rapidly to become an indispensable tool in chemical and tional mapping in the human brain using BOLD was antici- biological research focused on molecular composition, pated (Ogawa et al. 1990a) based on data documenting structure, and dynamics. 
regional elevation in blood flow and glucose metabolism In 1973, a novel concept of using NMR of the hydrogen without a commensurate increase in oxygen consumption atoms in the human body as an imaging modality was intro- rate during increased neuronal activity (Fox and Raichle duced (Lauterbur 1973). While all NMR applications of the 1985; Fox et al. 1988); these data would predict a task- time were singularly concerned with eliminating inhomoge- induced decrease in deoxyhemoglobin content in the human neities in magnetic field magnitude over the sample, this brain and a consequent alteration in MRI signal intensity new concept embraced them and proposed to utilize them to when the signal intensity difference between two states, for extract spatial information. Consequently, today magnetic example, in the absence and presence of a mental task or resonance is solidly established as a noninvasive imaging sensory stimulation, is examined. This was demonstrated technique suitable for use with humans. and the first BOLD-based fMRI images of the human brain MRI is essentially based on two fundamental ideas, “spa- were published in 1992 by three groups in papers submitted tial encoding” and “contrast.” The former is the means by within five days of each other (Bandettini et al. 1992; which the NMR data contain information on the spatial ori- Kwong et al. 1992; Ogawa et al. 1992). Initial functional gin of the NMR signal. The latter must provide the ability to brain mapping studies were focused on simple sensory stim- distinguish and visualize different structures or processes ulation and regions of the brain that are relatively well occurring within the imaged object and is ultimately trans- understood. These studies were aimed at demonstrating and lated into gray scale or color coding for presentation. 
Spatial evaluating the validity of the technique rather than address- encoding is accomplished by external magnetic fields ing the plethora of as yet unanswered questions concerned whose magnitude depend linearly on spatial coordinates in with aspects of brain function. In the short period of time 506 Magnetic Resonance Imaging since its introduction, however, BOLD fMRI has been used OGY, NEURAL BASIS OF; POSITRON EMISSION TOMOGRAPHY; to map functions in the whole brain, including subcortical SINGLE-NEURON RECORDING nuclei, with a few millimeters resolution and has been —Kamil Ugurbil shown to display specificity at the level of ocular dominance columns in humans at the high magnetic field of 4 tesla References (Menon, Ogawa, and Ugurbil 1996; Menon et al. 1997). At present field strengths, the sensitivity and hence the spatial Bandettini, A., E. C. Wang, R. S. Hinks, R. S. Rikofsky, and J. S. resolution attainable, however, is at the margin of what is Hyde. (1992). Time course EPI of human brain function during required for visualizing human ocular dominance columns task activation. Magn. Reson. Med. 25: 390. which are approximately 1 × 1 mm in cross-sectional Belliveau, J. W., D. N. Kennedy, R. C. McKinstry, B. R. Buch- binder, R. M. Weisskoff, M. S. Cohen, J. M. Vevea, T. J. Brady, dimensions. and B. R. Rosen. (1991). Functional mapping of the human A second approach to generating functional maps of the visual cortex by magnetic resonance imaging. Science 254: brain with fMRI relies on the task-induced increase in 716–719. regional blood flow alone. This method is analogous to the Bloch, F., W. W. Hansen, and M. Packard. (1946). The nuclear POSITRON EMISSION TOMOGRAPHY (PET)–based functional induction experiment. Physical Review 70: 474–485. brain mapping using water labeled with a positron emitter Edelman, R. E., B. Siewer, D. G. Darby, V. Thangaraj, A. C. (H215O). In the noninvasive MRI approach, however, the Nobre, M. M. Mesulam, and S. Warach. (1992). 
Qualitative label is simply the collective spins of the water molecules mapping of cerebral blood flow and functional localization whose net bulk magnetization is inverted or nulled (satu- with echo-planar MR imaging and signal targeting with alter- rated) either within a slice to be imaged or outside of the nating radio frequency. Radiology 192: 513–520. Fox, P. T., and M. E. Raichle. (1985). Stimulus rate determines slice to be imaged. For example, if the slice to be imaged is regional brain blood flow in striate cortex. Ann. Neurol. 17: inverted, the inverted magnetization must relax back to its 303–305. thermal equilibrium value and does so in a few seconds; in Fox, P. T., M. E. Raichle, M. A. Mintun, and C. Dence. (1988). the absence of flow, this occurs with what is termed spin- Nonoxidative glucose consumption during focal physiologic lattice relaxation mechanisms. However, when flow is neural activity. Science 241: 462–464. present, apparent relaxation occurs because of replacement Kim, S.-G. (1995). Quantification of relative cerebral blood flow of inverted spins by unperturbed spins coming from outside change by flow-sensitive alternating inversion recovery (FAIR) the inversion slice. Such flow-based fMRI methods were technique: Application to functional mapping. Magn. Reson. first demonstrated in 1992 (Kwong et al. 1992), and signifi- Med. 34: 293–301. cantly refined subsequently (Edelman et al. 1992; Kim Kwong, K. K., J. W. Belliveau, D. A. Chesler, I. E. Goldberg, R. M. Weisskoff, B. Poncelet, D. N. Kennedy, B. E. Hoppel, M. S. 1995). While flow-based techniques have some advantages Cohen, R. Turner, H. -M. Cheng, T. J. Brady, and B. R. Rosen. over BOLD methods, such as simplicity of interpretation (1992). Dynamic magnetic resonance imaging of human brain and ability to poise the sensitivity to perfusion as opposed to activity during primary sensory stimulation. Proc. Natl. Acad. macrovascular flow, rapid imaging of large sections of the Sci. U. S. A. 
89: 5675–5679. brain or whole brain is not yet possible. Lauterbur, P. C. (1973). Image formation by induced local interac- fMRI techniques rely on secondary and tertiary responses, tion: Examples employing nuclear magnetic resonance. Nature metabolic and hemodynamic, to increased neuronal activity. 242: 190–191. Hence, they are subject to limitations imposed by the tempo- Malonek, D., and A. Grinvald. (1996). Interactions between elec- ral characteristics and spatial specificity of these responses. trical activity and cortical microcirculation revealed by imaging Current data suggest that BOLD images, when designed with spectroscopy: Implication for functional brain mapping. Sci- ence 272: 551–554. appropriate paradigms, may have spatial specificity down to Menon, R. S., S. Ogawa, and K. Ugurbil. (1996). Mapping ocular the millimeter to submillimiter scale (e.g., ocular dominance dominance columns in human V1 using fMRI. Neuroimage 3: columns) presumably because the spatial extent of altered S357. oxygen consumption, hence deoxyhemoglobin alterations, Menon, R. S., S. Ogawa, J. P. Strupp, and K. Ugurbil. (1997). Ocu- coupled to neuronal activity, is confined accurately to the lar dominance in human V1 demonstrated by functional mag- region of elevated neuronal activity; this scale may be netic resonance imaging. J. Neurophysiol. 77(5): 2780–2787. coarser, possibly in the range of several millimeters, for per- Ogawa, S., and T.-M. Lee. (1990). Magnetic resonance imaging of fusion images if blood flow response extends beyond the blood vessels at high fields: In vivo and in vitro measurements region of increased activity (Malonek and Grinvald 1996). and image simulation. Magn. Reson. Med. 16: 9–18. With respect to temporal resolution, the sluggish metabolic Ogawa, S., T. -M. Lee, A. R. Kay, and D. W. Tank. (1990a). Brain magnetic resonance imaging with contrast dependent on blood response and even more sluggish hemodynamic response to oxygenation. Proc. Natl. Acad. 
Sci. U. S. A. 87: 9868–9872. changes in neuronal activity suggest that better than approxi- Ogawa, S., T. -M. Lee, A. S. Nayak, and P. Glynn. (1990b). mately 0.5-sec time resolution may not be achievable with Oxygenation-sensitive contrast in magnetic resonance image of current fMRI techniques even though image acquisition can rodent brain at high magnetic fields. Magn. Reson. Med. 14: be accomplished in as little as 20 to 30 msec. While this 68–78. excludes a very large temporal domain of interest, the pleth- Ogawa, S., D. W. Tank, R. Menon, J. M. Ellermann, S. -G. Kim, H. ora of mental processes accomplished in the seconds domain Merkle, and K. Ugurbil. (1992). Intrinsic signal changes by the human brain remains accessible to fMRI. accompanying sensory stimulation: Functional brain mapping See also ELECTROPHYSIOLOGY, ELECTRIC AND MAG- with magnetic resonance imaging. Proc. Natl. Acad. Sci. U. S. A. 89: 5951–5955. NETIC EVOKED FIELDS; MOTION, PERCEPTION OF; PHONOL- Malinowski, Bronislaw 507 Freudian theory of relations within the Trobriand family. Purcell, E. M., H. C. Torrey, and R. V. Pound. (1945). Resonance absorption by nuclear magnetic moments in a solid. Physical The Sexual Life of Savages (1929) focuses on sexuality, Review 69: 37. marriage, and kinship and includes vivid descriptions of Rabi, I. I., S. Millman, and P. Kusch. (1939). The molecular beam children’s daily lives. Coral Gardens and their Magic resonance method for measuring nuclear magnetic moments. (1935, two volumes) deals with horticulture, land tenure, Physical Review 55: 526–535. and the language of magic and gardening. Here Malinowski Rabi, I. I., J. R. Zacharias, S. Millman, and P. Kusch. (1938). A draws out his idea of “the context of situation,” first put for- new method of measuring nuclear magnetic moment. Physical ward in an essay published twelve years earlier: “The con- Review 53: 318. ception of meaning as contained in an utterance is false and Thulborn, K. R., J. C. Waterton, P. 
M. Mathews, and G. K. Radda. futile. A statement, spoken in real life, is never detached (1982). Oxygenation dependence of the transverse relaxation from the situation in which it has been uttered” (1952: 307). time of water protons in whole blood at high field. Biochem. Biophys. Acta 714: 265–270. This perspective on language—radical in its time—is con- sistent with Malinowski’s empiricism, which was always tempered by an awareness that “any belief or any item of Malinowski, Bronislaw folklore is not a simple piece of information . . . [it] must be examined in the light of diverse types of minds and of the Bronislaw Malinowski (1884–1942), founder of British diverse institutions in which it can be traced. To ignore this social anthropology and first thorough-going practitioner (if social dimension [of belief] . . . is unscientific” (1974: 239– not the inventor) of the fieldwork method known as “partici- 240). pant observation,” continues to be read with fascination and Malinowski’s evolutionist view took for granted the cul- admiration. His reputation rests on six classic monographs tural superiority of Europeans over other peoples; this idea he wrote between 1922 and 1935 about the lives and ideas is evident in his various works, especially in his private Tro- of the world of the people of the Trobriands, a group of briand diaries, which, being somewhat at odds with the eth- islands off the northeast coast of Papua New Guinea. nographies, caused controversy when they were published Malinowski was born in Cracow in 1884 to aristocratic posthumously in 1967. Even so, and despite his sometimes parents. His father—a linguist and professor of Slavic phi- patronizing and even spiteful asides on “the natives,” Mali- lology at the University of Cracow—died when he was 12. 
nowski was clearly genuine both in his pursuit of his field- His evidently clever mother taught herself Latin and mathe- work aims and in the often admiring respect for Trobriand matics in order to tutor him during a long illness in his mid- people he expressed in his works and in person to his stu- teens. In 1902 he entered the University of Cracow to study dents (Firth 1957, 1989; Young 1979). physics and philosophy and graduated with a Ph.D. in A charismatic teacher, revered by his students at the Lon- 1908—his thesis influenced by the empiricist epistemology don School of Economics, where he held the first chair in of Ernst Mach. Afterward, at Leipzig, he studied economic social anthropology, Malinowski spent a good deal of his history with Bucher and psychology with WUNDT, whose intellectual force engaged in the battle to make his function- “folk psychology” concerned people’s day-to-day ideas and alist theory of human behavior dominant in social anthro- their interconnections—their language, customs, art, myths, pology. He argued that “culture is essentially an religion—in short, their “culture.” Frazer’s The Golden instrumental apparatus by which man is put in a position the Bough was another definitive influence (see Kuper 1996: 1– better to cope with the concrete specific problems that face 34; Stocking 1995: 244–297). him in his environment in the course of the satisfaction of In 1910 Malinowski left Leipzig for the London School his needs” (1944: 150). This perspective had stood Mali- of Economics where, under Westermarck, he worked on The nowski in good stead in gathering field data, but it assumed Family among the Australian Aborigines, published in “culture” to be an integrated whole, left no place for change 1913. In 1914, aged 30, he made his first field trip to Papua as a condition of human existence, and lacked any analytical New Guinea, where, following the wishes of his mentor, W. power to explain cross-cultural similarities and differences. H. 
R. Rivers, he worked for some months among the Mailu; In 1938 Malinowski went to the United States, where he but this was “no more than an apprentice’s trial run, conven- was caught by the outbreak of World War II and remained, tional enough in method and results” (Kuper 1966: 12). The as a visiting professor at Yale, until he died suddenly in ground-breaking fieldwork came in 1915–1916 and 1917– 1942. His work was developed and sometimes amended by 1918 in Kiriwina, the largest of the Trobriand Islands. later ethnographers of the Trobriands, but “the legacy of his Malinowski’s first Trobriand ethnography was written in Trobriand ethnography continues to play an unprecedented Australia in 1916. Baloma: The Spirits of the Dead in the role in the history of anthropology” (Weiner 1988: 4). Trobriand Islands (1916) is an engaging study of magic, See also BOAS, FRANZ; CULTURAL EVOLUTION; CUL- witchcraft, and religious beliefs that also reveals Mali- TURAL PSYCHOLOGY; RELIGIOUS IDEAS AND PRACTICES; nowski’s tenacity in investigation. Argonauts of the Western SAPIR, EDWARD Pacific (1922) describes the ceremonial exchange known as —Christina Toren the Kula; a key text for anthropologists, it influenced, for example, Marcel Mauss and Claude LÉVI-STRAUSS. Crime References and Custom in Savage Society (1926) examines reciprocity as an underlying principle of social control. Sex and Repres- Firth, R., Ed. (1957). Man and Culture: An Evaluation of the Work sion in Savage Society (1927) looks at the implications for of Bronislaw Malinowski. London: Routledge and Kegan Paul. 508 Manipulation and Grasping velocities are small enough, and allows for a geometric Firth, R. (1989). Introduction. In B. Malinowski, A Diary in the Strict Sense of the Term. Stanford University Press. analysis of object motion under kinematic constraints. Our Kuper, A. (1996). Anthropologists and Anthropology: The Modern discussion of manipulation is restricted to the problem of British School. 3rd ed. 
London: Routledge.
Malinowski, B. (1916). Baloma: Spirits of the dead in the Trobriand Islands. Journal of the Royal Anthropological Institute 46: 354–430.
Malinowski, B. (1922). Argonauts of the Western Pacific. London: Routledge.
Malinowski, B. (1926). Crime and Custom in Savage Society. London: Kegan Paul.
Malinowski, B. (1927). Sex and Repression in Savage Society. London: Kegan Paul.
Malinowski, B. (1929). The Sexual Life of Savages. London: Routledge.
Malinowski, B. (1935). Coral Gardens and their Magic: A Study of the Methods of Tilling the Soil and of Agricultural Rites in the Trobriand Islands. Two vols. London: George Allen and Unwin.
Malinowski, B. (1944). A Scientific Theory of Culture and Other Essays. Chapel Hill: University of North Carolina Press.
Malinowski, B. (1952). The problem of meaning in primitive languages. In C. K. Ogden and I. A. Richards, Eds., The Meaning of Meaning: A Study of the Influence of Language upon Thought and of the Science of Symbolism. London: Routledge and Kegan Paul.
Malinowski, B. (1974). Magic, Science and Religion. London: Souvenir Press.
Malinowski, B. (1967). A Diary in the Strict Sense of the Term. London: Routledge and Kegan Paul.
Stocking, G. W. (1995). After Tylor: British Social Anthropology 1888–1951. Madison: University of Wisconsin Press.
Weiner, A. (1988). The Trobrianders of Papua New Guinea. New York: Holt, Rinehart and Winston.
Young, M. W., Ed. (1979). The Ethnography of Malinowski: The Trobriand Islands 1915–18. London: Routledge and Kegan Paul.

Further Readings

Wayne Malinowska, H. (1985). Bronislaw Malinowski: The influence of various women on his life and works. American Ethnologist 12: 529–40.

Manipulation and Grasping

Manipulation and grasping are branches of robotics and involve notions from kinematics, mechanics, and CONTROL THEORY. Grasping is concerned with characterizing and achieving the conditions that will ensure that a robot gripper holds an object securely, preventing, for example, any motion due to external forces. Manipulation, on the other hand, is concerned with characterizing and achieving the conditions under which a robot or a part held by a robot will perform a certain motion. Research in both areas has led to practical systems for picking up parts from a conveyor belt or a pallet, reorienting them, and inserting them into an assembly (e.g., Tournassoud, Lozano-Perez, and Mazer 1987; Peshkin and Sanderson 1988; Goldberg 1993), with promising applications in flexible manufacturing.

This entry focuses on a quasi-static model of mechanics that neglects inertial forces and dynamic effects. This is valid in typical grasping and manipulation tasks when all [...] characterizing the motion of an object pushed by one or several fingers, and it excludes some fundamental problems such as general robot motion planning in the presence of obstacles.

Grasping emerged as a field of its own in the early eighties with the introduction of dextrous multifinger grippers such as the Salisbury Hand (Salisbury 1982) and the Utah-MIT Dextrous Hand (Jacobsen et al. 1984). Much of the early work was conducted in Roth’s research group at Stanford (e.g., Salisbury 1982; Kerr and Roth 1986) drawing on notions of form and force closure from screw theory (Ball 1900), which provides a unified representation for displacements and velocities as well as forces and torques using a line-based geometry. Namely, when a hand holds an object at rest, the forces and moments exerted by the fingers should balance each other so as not to disturb the position of this object. Such a grasp is said to achieve equilibrium. An equilibrium grasp achieves force closure when it is capable of balancing any external force and torque, thus holding the object securely. A form closure grasp achieves the same result by preventing any small object motion through the geometric constraints imposed by the finger contacts. Intuition suggests that the two conditions are equivalent, and it can indeed be shown that force closure implies form closure and vice versa (Mishra and Silver 1989). A secure grasp should also be stable; in particular, a compliant grasp subjected to a small external disturbance should return to its equilibrium state. Nguyen (1989) has shown that force or form closure grasps are indeed stable.

Screw theory can be used to show that, in the frictionless case, four or seven fingers are both necessary and, under very general conditions, sufficient (Lakshminarayana 1978; Markenscoff, Ni, and Papadimitriou 1990) to construct frictionless form or force closure grasps of two- or three-dimensional objects, respectively. As could be expected, friction “helps” and it can also be shown that only three or four fingers are sufficient in the presence of Coulomb friction (Markenscoff, Ni, and Papadimitriou 1990). In fact, it can also be shown that any grasp achieving equilibrium for some friction coefficient µ will also achieve form or force closure for any friction coefficient µ' > µ (Nguyen 1988; Ponce et al. 1997).

Screw theory can also be used to characterize the geometric arrangement of contact forces that achieve equilibrium (and thus form or force closure under friction). In particular, two forces are in equilibrium when they oppose each other and share the same line of action, and three forces are in equilibrium when they add to zero and their lines of action intersect at a point. The four-finger case is more involved, but a classical result from line geometry is that the lines of action of four noncoplanar forces achieving equilibrium lie on the surface of a (possibly degenerate) hyperboloid (Ball 1900). In turn, these geometric conditions have been used in algorithms for computing optimal grasp forces given fixed finger positions (e.g., Kerr and Roth 1986), constructing at least one (maybe optimal) configuration of the fingers that will achieve force closure (e.g., Mishra, Schwartz, and Sharir 1987; Markenscoff and Papadimitriou 1989), and computing entire ranges of finger positions that yield force closure (e.g., Nguyen 1988; Ponce et al. 1997). The latter techniques provide some degree of robustness in the presence of the unavoidable positioning uncertainties of real robotic systems.

As shown in Rimon and Burdick (1993), for example, certain grasps that are not form closure nevertheless immobilize the grasped object. For example, three frictionless fingers positioned at the centers of the edges of an equilateral triangle cannot prevent an infinitesimal rotation of the triangle about its center of mass, although they can prevent any finite motion. Rimon and Burdick (1993) have shown how to characterize these grasps by mapping the constraints imposed by the fingers on the motion of an object onto its configuration space, that is, the set of object positions and orientations. In this setting, screw theory becomes a first-order theory of mobility, where the curved obstacle surfaces are approximated by their tangent planes, and where immobilized object configurations correspond to isolated points of the free configuration space. Rimon and Burdick have shown that second-order (curvature) effects can effectively prevent any finite object motion, and they have given operational conditions for immobilization and proven the dynamic stability of immobilizing grasps under various deformation models (Rimon and Burdick 1994). An additional advantage of this theory is that second-order immobility can be achieved with fewer fingers than form closure (e.g., four fingers instead of seven are in general sufficient to guarantee immobility in the frictionless case). Techniques for computing second-order immobilizing grasps have been proposed in Sudsang, Ponce, and Srinivasa (1997), for example.

Once an object has been grasped, it can of course be manipulated by moving the gripper while keeping its fingers locked, but the range of achievable motions is limited by physical constraints. For example, the rotational freedom of a gripper about its axis is usually bounded by the mechanics of the attached robot wrist. A simple approach to fine manipulation in the plane is to construct finger gaits (Hong et al. 1990). Assume that a disk is held by a four-finger hand in a three-finger force closure grasp. A certain amount of, say, counterclockwise rotation can be achieved by rotating the wrist. To achieve a larger rotation, first position the fourth finger so that the disk will be held in force closure by the second, third, and fourth fingers, then release the first finger and reposition it. By repositioning the four fingers in turn so that their overall displacement is clockwise, we can then apply a new counterclockwise rotation of the wrist, and repeat the process as many times as necessary. (See Li, Canny, and Sastry 1989 for a related approach to dextrous manipulation, which includes coordinated manipulation, as well as rolling and sliding motions.)

Thus far, our discussion has assumed implicitly that a workpiece starts and remains at rest while it is grasped. This will be true when the part is very heavy or bolted to a table, but in a realistic situation, it is likely to move when the first contact is established and contact may be immediately broken. Moreover, the actual position and orientation of the object with respect to the hand are usually (at best) close to the nominal ones. Mason (1986) proposed that an appropriate characterization of the mechanics of pushing would at the same time provide the means of (1) predicting (at least partially) the motion of the manipulated object once contact is established, and (2) reducing the uncertainty in object position without sensory feedback. Assuming Coulomb friction, he constructed a program that predicts the motion of an object with a known distribution of support forces being pushed at a single contact point. He also devised a simple rule for determining the rotation sense of the pushed object when the distribution is unknown.

Extensions of this approach have been applied to a number of other manipulation problems. Fearing (1986) has shown how to exploit local tactile information to determine how a polygonal object rotates while it is grasped, and demonstrated the capture of an object by the three-finger Salisbury hand as well as the execution of other complex manipulation tasks such as part “twirling.” In the manufacturing domain, Peshkin and Sanderson (1988) have shown how to use static fences to reorient parts carried by a conveyor belt, and Goldberg (1993) has used a modified two-jaw gripper to plan a sequence of grasping operations that will reorient a part with unknown original orientation. A variant of this approach has also been used to plan a set of tray-tilting operations that will reorient a part lying in a tray (Erdmann and Mason 1988). More recently, Lynch and Mason (1995) have derived sufficient conditions for stable pushing, namely, for finding a set of pushing directions that will guarantee that the pushed object remains rigidly attached to the pusher during the manipulation task. They have also proven conditions for local and global controllability, and given an algorithm for planning pushing tasks in the presence of obstacles.

The kinematics of pushing are important as well, because they determine the relative positions and orientations of the gripper-object pair during the execution of a manipulation task. Brost (1991) has shown how to construct plans for pushing and compliant motion tasks through a detailed geometric analysis of the obstacle formed by a rigid polygon in the configuration space of a second polygon. More recently, Sudsang, Ponce, and Srinivasa (1997) have introduced the notion of inescapable configuration space (ICS) region for a grasp. As noted earlier, an object is immobilized when it rests at an isolated point of its free configuration space. A small motion of a finger away from the object will transform this isolated point into a compact region of free space (the ICS) that cannot be escaped by the object. For simple pushing mechanisms, it is possible to compute the maximum ICS regions and the corresponding range of finger motions, and to show that moving the finger from the far end of this range to its immobilizing position will cause the ICS to continuously shrink, ensuring that the object ends up in the planned immobilizing configuration. Thus a grasp can be executed in a robust manner, without requiring a model of the part motion at contact. More complex manipulation tasks can also be planned by constructing a graph of overlapping maximum ICS regions. This approach has been applied to grasping and in-hand manipulation with a multifingered reconfigurable gripper (Sudsang, Ponce, and Srinivasa 1997), and more recently, to manipulation tasks using disk-shaped mobile platforms in the plane.

See also BEHAVIOR-BASED ROBOTICS; HAPTIC PERCEPTION; MOBILE ROBOTS; ROBOTICS AND LEARNING; WALKING AND RUNNING MACHINES

—Jean Ponce

References

Ball, R. S. (1900). A Treatise on the Theory of Screws. New York: Cambridge University Press.
Brost, R. C. (1991). Analysis and planning of planar manipulation tasks. Ph.D. diss., Carnegie-Mellon University.
Erdmann, M. A., and M. T. Mason. (1988). An exploration of sensorless manipulation. IEEE Journal of Robotics and Automation 4: 369–379.
Fearing, R. S. (1986). Simplified grasping and manipulation with dextrous robot hands. IEEE Transactions on Robotics and Automation 4(2): 188–195.
Goldberg, K. Y. (1993). Orienting polygonal parts without sensors. Algorithmica 10(2): 201–225.
Hong, J., G. Lafferriere, B. Mishra, and X. Tan. (1990). Fine manipulation with multifinger hands. In Proc. IEEE Int. Conf. on Robotics and Automation. IEEE Press, pp. 1568–1573.
Jacobsen, S. C., J. E. Wood, D. F. Knutti, and K. B. Biggers. (1984). The Utah-MIT Dextrous Hand: Work in progress. International Journal of Robotics Research 3(4): 21–50.
Kerr, J. R., and B. Roth. (1986). Analysis of multi-fingered hands. International Journal of Robotics Research 4(4).
Lakshminarayana, K. (1978). Mechanics of form closure. Technical Report 78-DET-32. American Society of Mechanical Engineers.
Li, Z., J. F. Canny, and S. S. Sastry. (1989). On motion planning for dextrous manipulation: 1. The problem formulation. In Proc. IEEE Int. Conf. on Robotics and Automation, Scottsdale, AZ, pp. 775–780.
Lynch, K. M., and M. T. Mason. (1995). Stable pushing: Mechanics, controllability, and planning. In K. Y. Goldberg, D. Halperin, J-C. Latombe, and R. Wilson, Eds., Algorithmic Foundations of Robotics. A. K. Peters, pp. 239–262.
Markenscoff, X., L. Ni, and C. H. Papadimitriou. (1990). The geometry of grasping. International Journal of Robotics Research 9(1): 61–74.
Markenscoff, X., and C. H. Papadimitriou. (1989). Optimum grip of a polygon. International Journal of Robotics Research 8(2): 17–29.
Mason, M. T. (1986). Mechanics and planning of manipulator pushing operations. International Journal of Robotics Research 5(3): 53–71.
Mishra, B., J. T. Schwartz, and M. Sharir. (1987). On the existence and synthesis of multifinger positive grips. Algorithmica, Special issue on robotics 2(4): 541–558.
Mishra, B., and N. Silver. (1989). Some discussion of static gripping and its stability. IEEE Systems, Man, and Cybernetics 19(4): 783–796.
Nguyen, V-D. (1988). Constructing force-closure grasps. International Journal of Robotics Research 7(3): 3–16.
Nguyen, V-D. (1989). Constructing stable grasps. International Journal of Robotics Research 8(1): 27–37.
Peshkin, M. A., and A. C. Sanderson. (1988). Planning robotic manipulation strategies for workpieces that slide. IEEE Journal of Robotics and Automation 4(5).
Ponce, J., S. Sullivan, A. Sudsang, J-D. Boissonnat, and J-P. Merlet. (1997). On computing four-finger equilibrium and force-closure grasps of polyhedral objects. International Journal of Robotics Research 16(1): 11–35.
Rimon, E., and J. W. Burdick. (1993). Towards planning with force constraints: On the mobility of bodies in contact. In Proc. IEEE Int. Conf. on Robotics and Automation, Atlanta, GA, pp. 994–1000.
Rimon, E., and J. W. Burdick. (1994). Mobility of bodies in contact: 2. How forces are generated by curvature effects. In Proc. IEEE Int. Conf. on Robotics and Automation, San Diego, CA.
Salisbury, J. K. (1982). Kinematic and force analysis of articulated hands. Ph.D. diss., Stanford University.
Sudsang, A., J. Ponce, and N. Srinivasa. (1997). Algorithms for constructing immobilizing fixtures and grasps of three-dimensional objects. In J-P. Laumont and M. Overmars, Eds., Algorithmic Foundations of Robotics, vol. 2. Peters, pp. 363–380.
Tournassoud, P., T. Lozano-Perez, and E. Mazer. (1987). Regrasping. In Proc. IEEE Int. Conf. on Robotics and Automation, Raleigh, NC, pp. 1924–1928.

Further Readings

Akella, S., and M. T. Mason. (1995). Parts orienting by push-aligning. In Proc. IEEE Int. Conf. on Robotics and Automation, Nagoya, Japan, pp. 414–420.
Baker, B. S., S. J. Fortune, and E. H. Grosse. (1985). Stable prehension with a multi-fingered hand. In Proc. IEEE Int. Conf. on Robotics and Automation, St. Louis, MO, pp. 570–575.
Brost, R. C., and K. Goldberg. (1996). A complete algorithm for designing planar fixtures using modular components. IEEE Transactions on Robotics and Automation 12(1): 31–46.
Cutkosky, M. R. (1984). Mechanical properties for the grasp of a robotic hand. Technical Report CMU-RI-TR-84-24, Carnegie-Mellon University Robotics Institute.
Ferrari, C., and J. F. Canny. (1992). Planning optimal grasps. In Proc. IEEE Int. Conf. on Robotics and Automation, Nice, France, pp. 2290–2295.
Goldberg, K., and M. T. Mason. (1990). Bayesian grasping. In Proc. IEEE Int. Conf. on Robotics and Automation, IEEE Press, pp. 1264–1269.
Howard, W. S., and V. Kumar. (1994). Stability of planar grasps. In Proc. IEEE Int. Conf. on Robotics and Automation, San Diego, CA, pp. 2822–2827.
Ji, Z., and B. Roth. (1988). Direct computation of grasping force for three-finger tip-prehension grasps. Journal of Mechanics, Transmissions, and Automation in Design 110: 405–413.
Kirkpatrick, D. G., B. Mishra, and C. K. Yap. (1990). Quantitative Steinitz’s theorems with applications to multifingered grasping. In Twentieth ACM Symp. on Theory of Computing, Baltimore, MD, pp. 341–351.
Latombe, J-C. (1991). Robot Motion Planning. Dordrecht: Kluwer.
Laugier, C. (1981). A program for automatic grasping of objects with a robot arm. In Eleventh International Symposium on Industrial Robots.
Li, Z., and S. Sastry. (1987). Task-oriented optimal grasping by multifingered robot hands. In Proc. IEEE Int. Conf. on Robotics and Automation, IEEE Press, pp. 389–394.
Lozano-Perez, T. (1976). The design of a mechanical assembly system. MIT AI Memo 397. Cambridge, MA: MIT Artificial Intelligence Lab.
Mason, M., and J. K. Salisbury. (1985). Robot Hands and the Mechanics of Manipulation. Cambridge, MA: MIT Press.
Murray, R. M., Z. Li, and S. S. Sastry. (1994). A Mathematical Introduction to Robotic Manipulation. CRC Press.
Pertin-Troccaz, J. (1987). On-line automatic programming: A case study in grasping. In Proc. IEEE Int. Conf. on Robotics and Automation, Raleigh, NC, pp. 1292–1297.
Pollard, N. S., and T. Lozano-Perez. (1990). Grasp stability and feasibility for an arm with an articulated hand. In Proc. IEEE Int. Conf. on Robotics and Automation, IEEE Press, pp. 1581–1585.
Ponce, J., and B. Faverjon. (1995). On computing three-finger force-closure grasps of polygonal objects. IEEE Transactions on Robotics and Automation 11(6): 868–881.
Ponce, J., D. Stam, and B. Faverjon. (1993). On computing force-closure grasps of curved two-dimensional objects. International Journal of Robotics Research 12(3): 263–273.
Reuleaux, F. (1876/1963). The Kinematics of Machinery. New York: Macmillan. Reprint, New York: Dover.
Rimon, E., and A. Blake. (1996). Caging 2D bodies by one-parameter two-fingered gripping systems. In Proc. IEEE Int. Conf. on Robotics and Automation, Minneapolis, MN, pp. 1458–1464.
Roth, B. (1984). Screws, motors, and wrenches that cannot be bought in a hardware store. In Int. Symp. on Robotics Research. Cambridge, MA: MIT Press, pp. 679–693.
Trinkle, J. C. (1992). On the stability and instantaneous velocity of grasped frictionless objects. IEEE Transactions on Robotics and Automation 8(5): 560–572.
Wallack, A., and J. F. Canny. (1994). Planning for modular and hybrid fixtures. In Proc. IEEE Int. Conf. on Robotics and Automation, San Diego, CA, pp. 520–527.

Markov Decision Problems

See DYNAMIC PROGRAMMING; RATIONAL DECISION MAKING; REINFORCEMENT LEARNING

Marr, David

David Marr (1945–1980), theoretical neurophysiologist and cognitive scientist, integrated neurophysiological and psychophysical studies with the computational methods of artificial intelligence (AI) to found a new, more powerful approach to understanding biological information-processing systems. The approach has come to redefine the standard for achieving a suitable comprehension of brain structure and function.

Marr was born in Essex, England on 19 January 1945 to Douglas and Madge Marr, and died at age thirty-five of leukemia. After attending Rugby, he entered Trinity College, Cambridge in 1963, obtaining the B.S. degree in mathematics in 1966 with first-class honors. Shortly thereafter, under the guidance of Giles Brindley, he began an intensive year of study on all aspects of brain function, with the intent of focusing on the neural implementation of efficient associative memories. By the end of 1968 he had submitted a dissertation for a Title A fellowship at Trinity College, and was elected. He received two advanced degrees from Trinity in 1971: an M.S. in mathematics and a Ph.D. in theoretical neurophysiology. The first part of his thesis was a theoretical analysis of the cerebellar cortex, published in the Journal of Physiology in 1969. This work was the first detailed theory on any really complex piece of neural machinery, with very specific predictions concerning the input-output relations and details of synaptic modifiabilities during the learning of new motor movements. The essence of this theory is still viable today, and continues to be the benchmark for further advances in understanding cerebellar cortex. Two other papers also appeared before 1971: one on the archicortex, the other on the neocortex, both of which remain landmarks in theoretical neurophysiology.

After obtaining his Ph.D., Marr accepted an appointment at the MRC Laboratory of Molecular Biology under Sydney Brenner and Francis Crick, and he retained an affiliation with MRC until 1976. The thrust of Marr’s work changed rather dramatically in 1972, however, following an interdisciplinary workshop on the brain where he met Marvin Minsky and Seymour Papert, who extended an invitation to visit the MIT Artificial Intelligence Laboratory. This visit reinforced Marr’s growing conviction that a complete theory of any brain structure must go beyond interpreting anatomical and physiological facts to include an analysis of the task being performed, or more specifically, an understanding of the problem that the information-processing device was “solving.” Equally important, he recognized the weakness of theories that appeared explanatory but were not demonstrative. However, demonstrative theories of brain function required large and flexible computing resources such as those available at MIT. Consequently, Marr’s initial three-month visit to the AI Laboratory in 1973 became extended, extended again, and then by 1976 became permanent. In 1977 he was appointed to the faculty of the (current) Department of Brain and Cognitive Sciences, becoming full professor at MIT in 1980, while continuing to hold his AI Lab appointment.

Marr’s years at the AI Laboratory were incredibly productive. The goal was to understand both the competence as well as the performance of a biological information-processing system. Although some studies of movement continued, the primary thrust was understanding the mammalian visual system. Although small, Marr’s remarkably talented group included Tomaso Poggio and Shimon Ullman. They shared Marr’s conviction that explanations of a complex system are found at several levels. A complete study should address at least three levels: an analysis of the competence, the design of an algorithm and choice of representation, and an implementation. In order to provide a coherent framework for organizing and attacking visual problems (more properly, problems of “seeing”), Marr proposed separating perceptual processing tasks into three main stages: a primal sketch, where properties of the image are made explicit; a 2½-D sketch, which is a viewer-centered representation of the surface geometry and surface properties; and lastly a 3-D model representation, which is object centered rather than viewer based. His 1982 book, Vision: A Computational Investigation into the Human Representation and Processing of Visual Information, published posthumously, summarizes these ideas, as well as the contributions to image processing, grouping, color, stereopsis, motion, surface geometry, TEXTURE, SHAPE PERCEPTION, and OBJECT RECOGNITION. Many of these contributions had appeared by the late 1970s, and in recognition of this work, Marr received in 1979 the Computers and Thought Award from the International Joint Conference on Artificial Intelligence. More recently, “best paper” awards in the name of David Marr have been created by the International Conference on Computer Vision and by the Cognitive Science Society.

As a person, David Marr was charismatic and inspiring. He was both fun and brilliant, enjoying the adventures of understanding brain structure and function, as well as life itself. He communicated his pleasure in clear and compelling ways, not only in personal exchanges and in his writing but also in music. He was an accomplished clarinetist. His early death was a great loss but even in his short life, he was able to bring together two previously diverse disciplines, set a higher standard for explanatory understanding, and open new doors to unraveling the mysteries of mind and brain.

See also CEREBELLUM; COMPUTATIONAL VISION; MACHINE VISION; MID-LEVEL VISION; STEREO AND MOTION PERCEPTION

—Whitman A. Richards

References and Further Readings

Marr, D. (1982). Vision: A Computational Investigation into the Human Representation and Processing of Visual Information. San Francisco: W. H. Freeman.
Vaina, L. (1991). From Retina to Cortex: Selected Papers of David Marr. Boston: Birkhäuser.

Materialism

See MIND-BODY PROBLEM; PHYSICALISM

Maximum Likelihood Density Estimation

See MINIMUM DESCRIPTION LENGTH; UNSUPERVISED LEARNING

McCulloch, Warren S.

Warren McCulloch (1898–1968) was a physician turned physiologist. After medical school, he trained in neurology from 1928–1931, studied mathematical physics in 1931–1932, worked as a clinician from 1932–34, then joined the Yale Laboratory of Neurophysiology and by 1941 became an assistant professor in the department. His main work at Yale was on the functional connections in the CEREBRAL CORTEX of primates. Dusser de Barenne, his mentor and collaborator, had developed the method of strychnine neuronography, a way of determining the direct projection of one architectonically specified region in the cortex of the forebrain to other regions. It is a clever and reliable technique that served well to show in a single day of experiment what would take years to work out by the standard anatomical procedures at the time. Little of what the technique revealed has been faulted, but it never caught on for a variety of reasons, the main one being a general misunderstanding of the underlying physiology.

In 1941 McCulloch came to the Illinois Neuropsychiatric Institute as Associate Professor of Psychiatry at the College of Medicine, University of Illinois. Percival Bailey, Professor of Neurosurgery, had worked with McCulloch at Yale, as had Gerhardt van Bonin, Professor of Anatomy. The neuronography of primate cortex work continued and attracted many visiting collaborators, and McCulloch developed an enduring interest in the bulbo-reticular system, due mainly to the innovative studies of Magoun and Snider at Northwestern University.

In 1942 he took a medical student, Jerry Lettvin, together with his friend Walter PITTS, into his laboratory, and by midyear, into his home. Pitts, taking an interest in the nervous system, told McCulloch of Gottfried Wilhelm Leibniz’s dictum that any task that can be completely and unambiguously set forth in logical terms can be performed by a logical engine. In the previous year, David Lloyd had demonstrated monosynaptic excitation, facilitation, and inhibition. And in 1936–37 Alan TURING had published his brilliant essay on the universal logical engine. It seemed to McCulloch and Pitts that neurons could be conceived as logical elements, pulsatile rather than two-state devices, and capable of realizing logical process. The ensuing paper, “A Logical Calculus of the Ideas Immanent in Nervous Activity” (McCulloch and Pitts 1943) became the inspiration for a new view of the nervous system, and a justification for the project of artificial intelligence.

In a later second paper, “On how we know universals: The perception of auditory and visual forms,” Pitts and McCulloch implemented their notions by showing how the anatomy of the cerebral cortex might accommodate the identification of form independent of its angular size in the image, and other such operations in perception.

McCulloch at the same time carried on a full load of his studies in the physiology of the cortex and other parts of the nervous system. That was his daytime work. But, having become enamored of LOGIC, his evenings were devoted to problems in logical representation of mental operations. After all, he had developed a profound interest in philosophy while at Yale.

By the end of the 1940s, with John VON NEUMANN’s first digital computers, Norbert Wiener’s linear prediction theory, and the massive and singularly intellectual thrust of the military and industrial complex, it was evident that a new era was opening. In 1951, Jerome Wiesner, on Norbert Wiener’s advice, offered a place at the Research Laboratory of Electronics at MIT to McCulloch, Pat Wall, and Lettvin. There McCulloch and Wall worked on spinal cord physiology. Pitts was already at MIT. At the time McCulloch was full professor at the University of Illinois, and Wall was assistant professor at the University of Chicago. Although the move brought loss of academic status and a serious cut in pay, the three of them accepted the invitation.

Beginning in 1951, McCulloch became a magnet for a most diverse company of those concerned with the new communications revolution. Benoit Mandelbrot, Manuel Blum, Marvin Minsky, Seymour Papert, and a host of others, then young and eager and hungry for discovery, visited frequently and stayed for long discussions.

Several major works were issued before 1960. One that charmed McCulloch particularly was the arduous source-sink analysis of currents in the spinal cord. Pitts had laid out the general method behind the effort; published in 1954–1955, it was the first demonstration of presynaptic inhibition between the collaterals of dorsal root fibers. A year and a half had been spent on computations that would occupy about an hour on today’s machines. Then there was the demonstration of the action of strychnine, which provided the ground needed to justify strychnine neuronography. Wall turned his interest to the mechanism involved in pain. H. R. Maturana joined the group from Harvard and, thanks to his patience and skill, “What the Frog’s Eye Tells the Frog’s Brain” was published (Lettvin et al. 1959).

Bob Gesteland joined the group as McCulloch’s graduate student, addressing himself, under the urging of Pitts and McCulloch, to the problem of olfaction. His were the first recordings of the activity of single olfactory cells in the nasal mucosa of frogs. And his results (namely, that every cell responds to almost every odorant but with different coding patterns of activity for different odorants, and that cells differ among themselves in their coding patterns for the odorants) are just bearing fruit today. In short, McCulloch was the center of a new thrust in nervous physiology, even more fascinating than what he envisioned before coming to MIT.

McCulloch had great generosity of spirit. He treated everyone as an equal and made an effort to encourage the best in everyone he met. He never showed the faintest hint of malice or envy or deviousness, but spoke, wrote, and carried himself as a nineteenth-century cavalier. A complete list of his publications is given in his Collected Works, and a small indication of his influence is evidenced by the Further Readings below.

See also AUTOMATA; COMPUTATION AND THE BRAIN; NEURAL NETWORKS; NEWELL; VON NEUMANN; WIENER

—Jerome Lettvin

References

Lettvin, J., H. Maturana, W. McCulloch, and W. Pitts. (1959). What the frog’s eye tells the frog’s brain. Proceedings of the IRE 47: 1940–1959. Reprinted in Embodiments of Mind.
McCulloch, R., Ed. (1989). Collected Works of Warren S. McCulloch. 4 volumes. Salinas, CA: Intersystems Publications.
McCulloch, W. S. (1988). Embodiments of Mind. Cambridge, MA: MIT Press. Originally published 1965.
McCulloch, W., and W. Pitts. (1943). A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics 5: 115–133. Reprinted in Embodiments of Mind.
Pitts, W., and W. McCulloch. (1947). On how we know universals: The perception of auditory and visual forms. Bulletin of Mathematical Biophysics 9: 127–147. Reprinted in Embodiments of Mind.

Further Readings

Anderson, J. A. (1996). From discrete to continuous and back again. In Moreno-Diaz and Mira-Mira (1996).
Arbib, M. A. (1996). Schema theory: From Kant to McCulloch and beyond. In Moreno-Diaz and Mira-Mira (1996).
Cull, P. (1996). Neural nets: Classical results and current problems. In Moreno-Diaz and Mira-Mira (1996).
Lettvin, J. (1988). Foreword to the 1988 reprint of Embodiments of Mind.
Lettvin, J. (1989). Introduction to vol. 1. R. McCulloch, Ed., Collected Works of Warren S. McCulloch.
Lindgren, N. (1969). The birth of cybernetics—an end to the old world: The heritage of Warren S. McCulloch. Innovation 6: 12–15.
Moreno-Diaz, R., and J. Mira-Mira, Eds. (1996). Brain Processes, Theories and Models: An International Conference in Honor of W. S. McCulloch 25 Years after His Death. Cambridge, MA: MIT Press.
Papert, S. (1965). Introduction to Embodiments of Mind.
Perkel, D. H. Logical neurons: The enigmatic legacy of Warren McCulloch. Trends in Neuroscience 11: 9–12.
Weir, M. K. (1996). Putting the mind inside the head. In Moreno-Diaz and Mira-Mira (1996).

McCulloch-Pitts Neurons

See AUTOMATA; MCCULLOCH, WARREN S.; PITTS, WALTER; VON NEUMANN, JOHN

Meaning

The meaning of an expression, as opposed to its form, is that feature of it which determines its contribution to what a speaker says in using it. Meaning conveyed by a speaker is the speaker’s communicative intent in using an expression, even if that use departs from the expression’s meaning. Accordingly, any discussion of meaning should distinguish speaker’s meaning from linguistic meaning.

We think of meanings as what synonyms (or translations) have in common, what ambiguous expressions have more than one of, what meaningful expressions have and gibberish lacks, and what competent speakers grasp. Yet linguistic meaning is a puzzling notion. The traditional view is that the meaning of a word is the concept associated with it and, as FREGE suggested, what determines its reference, but this plausible view is problematic in various ways. First, it is not clear what CONCEPTS are. Nor is it clear what the relevant sort of association with words is, or, indeed, that every word has a concept, much less a unique concept, associated with it. Wittgenstein (1953) even challenged the Platonic assumption that all the items to which a word applies must have something in common. Unfortunately, there is no widely accepted alternative to the traditional view. Skepticism about meaning, at least as traditionally conceived, has also been registered in various ways by such prominent philosophers as Quine (1960), Davidson (1984), Putnam (1975), and Kripke (1982); for a review of the debates these philosophers have generated see Hale and Wright 1997 (chaps. 8 and 14–17). Psychological approaches based on prototypes or on semantic networks, as well as COGNITIVE LINGUISTICS, seem to sever the connection between meaning and reference. The most popular philosophical approaches to sentence meaning, such as truth-conditional, model-theoretic, and POSSIBLE WORLDS SEMANTICS, also have their limitations. They seem ill equipped to distinguish meanings of expressions that necessarily apply in the same circumstances or to handle non-truth-conditional aspects of meaning.

Here are six foundational questions about meaning, as difficult as they are basic:

1. What are meanings?
2. What is it for an expression to have meaning?
3. What is it to know the meaning(s) of an expression? (More generally, what is it to understand a language?)
4. What is the relationship between the meaning of an expression and what, if anything, the expression refers to?
5. What is the relationship between the meaning of a complex expression and the meanings of its constituents?

An answer to question 1 would say whether meanings are psychological, social, or abstract, although many philosophers would balk at the question, insisting that meanings are not entities in their own right and that answering question 2 would take care of question 1. An answer to question 3 would help answer question 2, for what expressions mean cannot be separated from (and is perhaps reducible to) what people take them to mean. And question 4 bears on question 3. It was formerly assumed that the speaker’s internal state underlying his knowledge of the meaning of a term determines the term’s reference, but Putnam’s (1975) influential TWIN EARTH thought experiments have challenged this “internalist” or “individualist” assumption. In reaction, Chomsky (1986, 1995) and Katz (1990) have defended versions of internalism about knowledge of language and meaning.

Question 5 points to the goal of linguistic theory: to provide a systematic account of the relation between form and meaning. SYNTAX is concerned with linguistic form, including LOGICAL FORM, needed to represent scope relationships induced by quantificational phrases and modal and other operators; SEMANTICS, with how form maps onto linguistic meaning.

[...] sition that they are intended to recognize it. This idea, which has important applications to GAME THEORY (communication is a kind of cooperative game), is essential to explaining how a speaker can make himself understood even if he does not make fully explicit what he means, as in IMPLICATURE. Understanding a speaker is not just a matter of understanding his words but of identifying his communicative intention. One must rely not just on knowledge of linguistic meaning but also on collateral information that one can reasonably take the speaker to be intending one to rely on (see Bach and Harnish 1979 for a detailed account). Communication is essentially an intentional-inferential affair, and linguistic meaning is just the input to the inference.

See also INDIVIDUALISM; NARROW CONTENT; RADICAL INTERPRETATION; SENSE AND REFERENCE; REFERENCE, THEORIES OF

—Kent Bach

References

Bach, K., and R. M. Harnish. (1979). Linguistic Communication and Speech Acts. Cambridge, MA: MIT Press.
Chomsky, N. (1986). Knowledge of Language. New York: Praeger.
Chomsky, N. (1995). Language and nature. Mind 104: 1–61.
Davidson, D. (1984). Essays on Truth and Interpretation. Oxford: Oxford University Press.
Grice, P. (1989). Studies in the Way of Words. Cambridge, MA: Harvard University Press.
The aim is to characterize the semantic contribu- Hale, B., and C. Wright, Eds. (1997). The Blackwell Companion to tions made by different types of expression to sentences in the Philosophy of Language. Oxford: Blackwell. which they occur. The usual strategy is to seek a systematic, Katz, J. J. (1990). The Metaphysics of Meaning. Cambridge, MA: recursive way of specifying the meanings of a complex MIT Press. expression (a phrase or sentence) in terms of the meanings Kripke, S. (1982). Wittgenstein on Rules and Private Language. of its constituents and its syntactic structure (see Larson and Cambridge, MA: Harvard University Press. Segal 1995 for a detailed implementation). Underlying this Larson, R., and G. Segal. (1995). Knowledge of Meaning. Cam- strategy is the principle of semantic COMPOSITIONALITY, bridge, MA: MIT Press. Lyons, J. (1995). Linguistic Semantics: An Introduction. Cam- which seems needed to explain how a natural language is bridge: Cambridge University Press. learnable (but see Schiffer 1987). Compositionality poses Pustejovsky, J. (1995). The Generative Lexicon. Cambridge, MA: certain difficulties, however, regarding conditional sen- MIT Press. tences, PROPOSITIONAL ATTITUDE ascriptions, and various Putnam, H. (1975). The meaning of “meaning.” In K. Gunderson, constructions of concern to linguists, such as genitives and Ed., Language, Mind, and Knowledge. Minneapolis: University adjectival modification. For example, although Rick’s team of Minnesota Press, pp. 131–193. is a so-called possessive phrase, Rick’s team need not be the Quine, W. V. (1960) Word and Object. Cambridge, MA: MIT team Rick owns—it might be the team he plays for, Press. coaches, or just roots for. Or consider how the force of the Schiffer, S. (1972). Meaning. Oxford: Oxford University Press. adjective fast varies in the phrases fast car, fast driver, fast Schiffer, S. (1987). The Remnants of Meaning. Cambridge, MA: MIT Press. 
track, and fast race (see Pustejovsky 1995 for a computa- Wittgenstein, L. (1953). Philosophical Investigations. New York: tional approach to such problems). Macmillan. The study of speaker’s meaning belongs to PRAGMATICS. What a speaker means in uttering a sentence is not just a matter of what his words mean, for he might mean some- Memory thing other than or more than what he says. For example, one might use “You’re another Shakespeare” to mean that The term memory implies the capacity to encode, store, and someone has little literary ability and “The door is over retrieve information. The possibility that memory might not there” to mean also that someone should leave. The listener be a unitary system was proposed by William JAMES (1898) has to figure out such things, and also resolve any AMBIGU- who suggested two systems which he named primary and ITY or VAGUENESS in the utterance and identify the refer- secondary memory. Donald HEBB (1949) also proposed a ences of any INDEXICALS AND DEMONSTRATIVES. GRICE dichotomy, suggesting that the brain might use two separate (1989) ingeniously proposed that communicating involves a neural mechanisms with primary or short-term storage distinctive sort of audience-directed intention: that one’s being based on electrical activation, while long-term mem- audience is to recognize one’s intention partly on the suppo- Memory 515 ory reflected the growth of relatively permanent neuronal deficits showed preservation of the long-term component, links between assemblies of cells. but little or no recency, while amnesiac patients showed the Empirical support for a two-component view began to opposite pattern. emerge in the late 1950s, when Brown (1958) and Peterson Finally the learning characteristics of the two systems and Peterson (1959) observed that even small amounts of appeared to differ. 
The short-term system has a limited information would show rapid forgetting, provided the sub- capacity, but appears to be relatively insensitive to speed of ject was prevented from maintaining it by active rehearsal. presentation, and in the case of verbal material to be sensi- The characteristic forgetting pattern appeared to differ from tive to the sound or phonological characteristics of the mate- that observed in standard long-term memory experiments, rial presented. The long-term system, on the other hand, has leading to the suggestion that performance depended on a a huge capacity but a relatively slow rate of acquisition of separate short-term store. Such a view was vigorously new material, and a tendency to encode verbal material in opposed by Melton (1963), leading to a period of intense terms of its meaning rather than sound (Baddeley 1966a, activity during the early 1960s that was concerned with the 1966b; Waugh and Norman 1965). question of whether memory should be regarded as a unitary The 1960s saw a growing interest in developing mathe- or dichotomous system. matical models of learning and memory, with the most By the late 1960s, the evidence seemed to strongly favor influential of these being that of Atkinson and Shiffrin the dichotomous view. A particularly influential source of (1968) which became known as the modal model. However, evidence was provided by a small number of neuropsycho- problems with a simple dichotomy rapidly emerged, leading logical patients who appeared to have a specific deficit of to the wide-scale abandonment of the field by many of its either the short-term or the long-term system. The clearest investigators. evidence of preserved short-term (STM) and impaired long- One problem stemmed from Atkinson and Shiffrin’s term memory (LTM) comes in the classic amnesiac assumption that the probability of an item being stored in syndrome. Particularly influential was case H.M. 
who LTM was a simple function of how long it was main- underwent bilateral excision of the HIPPOCAMPUS in an tained in the short-term system. A number of studies demonstrated that active and vigorous verbal rehearsal attempt to treat intractable epilepsy. H.M. was left with a might link to very little durable LTM (Craik and Watkins profound amnesia, unable to commit new material to mem- 1973; Bjork and Whitten 1974). This prompted Craik and ory, whether visual or verbal, and showing no capacity to Lockhart (1972) to propose their levels of processing the- learn his way around a new environment, to recognize peo- ory of memory. This proposed that an item to be remem- ple who worked with him regularly, or to remember the con- bered, such as a word, could be processed at a series of tent of anything he read or saw. His STM, on the other hand, encoding levels, beginning with the visual appearance of as evidenced by the capacity to hear and repeat back a string the word on the page, moving on to the sound of the word of digits such as a telephone number, was quite normal when pronounced, and, given further and deeper process- (Milner 1966). ing, to the meaning of that word and its relationship to The opposite pattern of memory deficit was demon- other experiences of the subject. Craik and Lockhart sug- strated by Shallice and Warrington (1970) in a patient, K.F., gested that the deeper the level of encoding, the more who was unable to repeat back more than two digits, but durable the memory trace. There is no doubt that this sim- whose long-term learning capacity and everyday memory ple formulation does capture an important characteristic were well within the normal range. His lesion was in the left of long-term learning, namely, that encoding material hemisphere in an area known to be associated with lan- richly and elaborately in terms of prior experience will guage. 
Subsequent studies have shown that language and lead to a comparatively durable and readily retrievable short-term phonological memory are often impaired in the memory trace. same patient, but that the two areas are separable and the Note however that levels of processing is not an alterna- symptoms dissociable. When tested on the Petersons’ short- tive to a dichotomous view; indeed Craik and Lockhart term forgetting task, patients like K.F. proved to show very themselves postulate a primary memory system as part of rapid forgetting, whereas densely amnesiac patients show their model, although this aspect of their work receives very normal performance, provided their amnesia is pure and much less attention than the concept of encoding levels. unaffected by more general intellectual deficits (Baddeley A second difficulty for the modal model lay in the neu- and Warrington 1970). ropsychological evidence. It may be recalled that patients Evidence from normal subjects paralleled the neuropsy- with an STM deficit performed poorly on tasks such as chological research in suggesting the need for at least two immediate memory span and recency, but were normal in separate memory systems. Many memory tests appeared to their LTM performance. The modal model suggested, how- show two separate components, one that was durable and ever, that the short-term system acts as a crucial antecham- long-term while the other showed rapid dissipation. For ber to long-term learning, hence predicting that such example, if a subject hears a list of twenty unrelated words patients should have impaired learning capacity, and indeed and is asked to recall as many as possible in any order, there should show poor performance on a wide range of tasks that will be a tendency for the last few words to be well recalled, were assumed to be dependent on the limited-capacity the so-called recency effect. However a delay of only a few short-term system. 
They showed no evidence of this, with seconds is sufficient for the effect to disappear, while recall one such patient being an efficient secretary, while another of earlier items remains stable. When this paradigm was ran a shop and raised a family. applied to neuropsychological patients, those with STM 516 Memory This problem formed the focus of work by Baddeley tion between an episodic LTM system (depending on a cir- and Hitch (1974), who attempted to simulate the neuropsy- cuit linking the temporal lobes, the frontal lobes, and chological STM deficit by means of a dual task technique. parahippocampal regions), and a whole range of implicit Subjects were required to hold and rehearse sequences of learning systems, each tending to reflect a different brain digits varying in length while at the same time performing region. a range of other tasks that were assumed to depend upon While these systems are of considerable interest in their the limited-capacity store. It was assumed that longer own right, and as ways of analyzing perceptual and motor sequences of digits would absorb more of the store, until processing, it can be questioned as to whether they should eventually capacity was reached, leaving the main tasks to be referred to as memory systems, as they typically involve be performed without the help of the short-term system. A relatively automatic retrieval processes that are often not range of tasks were studied including long-term learning, under the direct control of the subject. In contrast, episodic reasoning, and comprehension. A clear pattern emerged memory is the system that typifies our experience of recol- suggesting that concurrent digits did impair performance lecting the past. Indeed, Tulving (1985) suggests that its systematically, but by no means obliterated it. 
This led to a crucial and defining feature is the recollective process, reformulation of the STM hypothesis and the postulation accompanied by the feeling of familiarity, a process he of a multicomponent system which was termed working refers to as ecphory. There have in recent years been a grow- memory. It was suggested that this comprised a limited ing number of studies concerned with the phenomenologi- capacity attentional control system, the central executive, cal aspect of memory, often with considerable success (see together with at least two slave systems, one concerned Gardiner 1988). with maintaining visual-spatial information, the sketch- A second proposed distinction within LTM is that pad, while the other was responsible for holding and between semantic and episodic memory (see EPISODIC VS. manipulating speech-based information, the phonological SEMANTIC MEMORY). Semantic memory refers to the stored knowledge of the world that underlies not only our capacity loop. The concept of working memory has proved extremely to understand language but also our ability to take advan- fruitful, not only in accounting for the initial neuropsycho- tage of prior knowledge in perceiving and organizing both logical evidence but also in being applicable to a wide range the physical and social world around us. The need for such a of tasks and subject groups, and more recently, providing a store of information was initially made obvious by attempts very fruitful basis for a range of neuroradiological studies to develop computer-based systems for comprehending text, concerned with the neuroanatomical basis of working mem- such as that of Quillian (1969). These stimulated attempts to ory (see Smith and Jonides 1995). understand semantic memory in human subjects, and As in the case of STM, the concept of LTM has also prompted Tulving (1972) to propose that semantic and epi- undergone a detailed analysis in the last twenty years, again sodic memory are distinct systems. 
At first sight, the evi- resulting in a degree of fractionation. One of the strongest dence appeared persuasive. Densely amnesiac patients may cases for a basic distinction is that between implicit and perform normally on semantic memory tests while showing explicit memory (see IMPLICIT VS. EXPLICIT MEMORY). no evidence of new episodic learning (Wilson and Baddeley Once again this distinction was heavily influenced by neu- 1988). However, semantic memory tests typically involve ropsychological evidence, when it was observed that even accessing old memories, whereas episodic tests are princi- densely amnesiac patients could nevertheless show compar- pally concerned with the laying down of new memory atively normal learning on certain tasks, including the traces. When amnesiac patients are required to extend their acquisition of motor skills, classical conditioning, and a existing semantic memory systems, for example, by learn- whole range of procedures that come under the general term ing about the developing political system within their coun- of priming. The classic demonstration within this area was try, or learning new routes within their town, learning that of Warrington and Weiskrantz (1968), who showed that appears to be catastrophically bad. An alternative way of amnesiac patients who were shown a list of words were conceptualizing semantic memory is to suggest that it repre- totally unable to recall or recognize the words, but were able sents the residue of many episodic memories, with access to demonstrate learning by perceiving the words more rap- being based on generic commonalities, rather than the idly when they were presented in fragmented form. Subse- retrieval of a specific episode. 
The nature of semantic mem- quent work showed that learning was also preserved when ory and its neuroanatomical basis continues to be a very tested by cueing with the first few letters of the word (e.g., active research area, with neuropsychological evidence present CROCODILE, test with CRO——), or with a frag- again being particularly cogent (see Patterson and Hodges ment of the word, (C—O—O—I—E). Equivalent phenom- 1996). ena have been demonstrated in other modalities, and have No survey of memory would be complete without com- shown to be widely demonstrable in normal subjects (see ment on one aspect of memory that has been both active and Roediger 1990 for a review). controversial in recent years, namely, the attempt to apply Over the last decade there has been substantial contro- the lessons learned in the laboratory to everyday function- versy as to how best to explain this pattern of results. There ing. Although the link between the laboratory and the field is still some support for attempts to account for the data has occasionally appeared to be excessively confrontational within a unitary system, but my own view (Baddeley 1998) (e.g., see Neisser 1978; Banaji and Crowder 1989), the is that this is no longer a tenable position. In particular, the interaction has on the whole been a fruitful one. This is par- neuropsychological evidence seems to argue for a distinc- ticularly true of clinical applications of the psychology of Memory, Animal Studies 517 memory, where, as we have seen, the study of memory defi- Roediger, H. L. (1990). Implicit memory: Retention without remembering. American Psychologist 45: 1043–1056. cits in patients has been enormously influential in changing Shallice, T., and E. K. Warrington. (1970). Independent function- our views of the normal functioning of human memory. ing of verbal memory stores: A neuropsychological study. See also ECOLOGICAL PSYCHOLOGY; MEMORY, ANIMAL Quarterly Journal of Experimental Psychology 22: 261–273. 
STUDIES; MEMORY, HUMAN NEUROPSYCHOLOGY Smith, E. E., and J. Jonides. (1995). Working memory in humans: Neuropsychological evidence. In M. Gazzaniga, Ed., The Cog- —Alan Baddeley nitive Neurosciences. Cambridge, MA: MIT Press, pp. 1009– 1020. Tulving, E. (1972). Episodic and semantic memory. In E. Tulving References and W. Donaldson, Eds., Organization of Memory. New York: Atkinson, R. C., and R. M. Shiffrin. (1968). Human memory: A Academic Press, pp. 381–403. proposed system and its control processes. In K. W. Spence, Tulving, E. (1985). How many memory systems are there? Ameri- Ed., The Psychology of Learning and Motivation: Advances in can Psychologist 40: 385–398. Research and Theory. New York: Academic Press, pp. 89–195. Warrington, E. K., and L. Weiskrantz. (1968). New methods of Baddeley, A. D. (1966a). Short-term memory for word sequences testing long-term retention with special reference to amnesic as a function of acoustic, semantic and formal similarity. Quar- patients. Nature 217: 972–974. terly Journal of Experimental Psychology 18: 362–365. Waugh, N. C., and D. A. Norman. (1965). Primary memory. Psy- Baddeley, A. D. (1966b). The influence of acoustic and semantic chological Review 72: 89–104. similarity on long-term memory for word sequences. Quarterly Wilson, B. A., and A. D. Baddeley. (1988). Semantic, episodic and Journal of Experimental Psychology 18: 302–309. autobiographical memory in a post-meningitic amnesic patient. Baddeley, A. D. (1998). Human Memory: Theory and Practice, Brain and Cognition 8: 31–46. Revised ed. Needham Heights, MA: Allyn and Bacon. Baddeley, A. D., and G. Hitch. (1974). Working memory. In G. A. Further Readings Bower, Ed., The Psychology of Learning and Motivation. New Baddeley, A. D. (1999). Essentials of Human Memory. Hove: Psy- York: Academic Press, pp. 47–89. chology Press. Baddeley, A. D., and E. K. Warrington. (1970). Amnesia and the Parkin, A. (1987). Memory and Amnesia: An Introduction. 
Oxford: distinction between long- and short-term memory. Journal of Blackwell. Verbal Learning and Verbal Behavior 9: 176–189. Banaji, M. R., and R. G. Crowder. (1989). The bankruptcy of everyday memory. American Psychologist 44: 1185–1193. Memory, Animal Studies Bjork, R. A., and W. B. Whitten. (1974). Recency-sensitive retrieval processes. Cognitive Psychology 6: 173–189. Brown, J. (1958). Some tests of the decay theory of immediate Information about which structures and connections in the memory. Quarterly Journal of Experimental Psychology 10: brain are important for MEMORY has come from studies of 12–21. amnesiac patients and from systematic experimental work Craik, F. I. M., and R. S. Lockhart. (1972). Levels of processing: A with animals. Work in animals includes studies which framework for memory research. Journal of Verbal Learning assess the effects of selective brain lesions on memory, as and Verbal Behavior 11: 671–684. well as studies using neurophysiological recording and Craik, F. I. M., and M. J. Watkins. (1973). The role of rehearsal in stimulating techniques to investigate neural activity within short-term memory. Journal of Verbal Learning and Verbal particular brain regions (for discussions of the latter two Behavior 12: 599–607. approaches, see OBJECT RECOGNITION, ANIMAL STUDIES; Gardiner, J. M. (1988). Functional aspects of recollective experi- FACE RECOGNITION; SINGLE-NEURON RECORDING). An ence. Memory and Cognition 16: 309–313. Hebb, D. O. (1949). Organization of Behavior. New York: Wiley. important development that has occurred in the area of James, W. (1890). The Principles of Psychology. New York: Holt, memory during the past two decades was the establishment Rinehart and Winston. of an animal model of human amnesia in the monkey Melton, A. W. (1963). Implications of short-term memory for a (Mahut and Moss 1984; Mishkin 1982; Squire and Zola- general theory of memory. Journal of Verbal Learning and Ver- Morgan 1983). 
In the 1950s, Scoville and Milner (1957) bal Behavior 2: 1–21. described the severe amnesia that followed bilateral surgical Milner, B. (1966). Amnesia following operation on the temporal removal of the medial temporal lobe (patient H.M.). This lobes. In C. W. M. Whitty and O. L. Zangwill, Eds., Amnesia. important case demonstrated that memory is a distinct cere- London: Butterworths, pp. 109–133. bral function, dissociable from other perceptual and cogni- Neisser, U. (1978). Memory: What are the important questions? In tive abilities. M. M. Gruneberg, P. E. Morris, and R. N. Sykes, Eds., Practi- cal Aspects of Memory. London: Academic Press. In monkeys, surgical lesions of the medial temporal lobe, Patterson, K. E., and J. R. Hodges. (1996). Disorders of semantic which were intended to approximate the damage sustained memory. In A. D. Baddeley, B. A. Wilson, and F. N. Watts, by patient H.M., reproduced many features of human mem- Eds., Handbook of Memory Disorders. Chichester, England: ory impairment. In particular, both monkeys and humans Wiley, pp. 167–186. were impaired on tasks of declarative memory, but fully Peterson, L. R., and M. J. Peterson. (1959). Short-term retention of intact at skills and habit learning and other tasks of non- individual verbal items. Journal of Experimental Psychology declarative memory. This achievement set the stage for 58: 193–198. additional work in monkeys and for work in rodents that has Quillian, M. R. (1969). The teachable language comprehender: A identified structures in the medial temporal lobe that are simulation program and theory of language. Communication of important for declarative memory. These structures include the ACM 12: 459–476. 
518 Memory, Animal Studies the hippocampal region (i.e., the cell fields of the HIPPO- keys suggest that the amygdala is important for other kinds CAMPUS, the dentate gyrus, and the subiculum), and adja- of memory, including the development of conditioned fear cent cortical areas that are anatomically related to the and other forms of affective memory (see EMOTION AND hippocampal region, namely, the entorhinal, perirhinal, and THE ANIMAL BRAIN). These and other findings (Murray parahippocampal cortices (Zola-Morgan and Squire 1993). 1992) focused attention away from the amygdala toward The midline diencephalon is another brain area important the cortical structures of the medial temporal lobe, that is, for memory, although less is known about which specific the perirhinal, entorhinal, and parahippocampal cortices, in structures in this region contribute to memory function. addition to the hippocampal region itself. Findings from work in animals, including the development Direct evidence for the importance of the cortical regions of an animal model of alcoholic Korsakoff’s syndrome in has come from studies in which circumscribed damage has the rat (Mair et al., 1992), have been consistent with the been done to the perirhinal, entorhinal, or parahippocampal anatomical findings from human amnesia in showing the cortices, either separately or in combination (Moss, Mahut, importance of damage within the medial THALAMUS, espe- and Zola-Morgan 1981; Zola-Morgan et al. 1989; Gaffan cially damage in the internal medullary lamina, for produc- and Murray 1992; Meunier et al. 1993; Suzuki et al. 1993; ing memory loss. Lesions in the internal medullary lamina Leonard et al. 1995). 
would be expected to disconnect or damage several thalamic nuclei, including intralaminar nuclei, the mediodorsal nucleus, and the anterior nucleus (Aggleton and Mishkin 1983; Mair et al. 1991; Zola-Morgan and Squire 1985). However, the separate contributions to memory of the mediodorsal nucleus, the anterior nucleus, and the intralaminar nuclei remain to be explored systematically with well-circumscribed lesions in animals.

A major criterion for demonstrating that an animal has a memory deficit is to show that performance is impaired at long-delay intervals, but is intact at short-delay intervals, that is, no impairment in perception, attention, or general intellectual function. A successful strategy for demonstrating intact short-term memory and impaired long-term memory has involved training normal monkeys and monkeys with medial temporal lobe lesions on the delayed nonmatching-to-sample task, a recognition memory task sensitive to amnesia in humans. In this task, the monkey first sees an object, and then after a prescribed delay the animal is given a choice between the previously seen object and a novel one. The key feature of this experimental approach is the use of very short delay intervals (e.g., 0.5 sec). The absence of an impairment at a delay of 0.5 sec would indicate that the medial temporal lobe lesions do not affect short-term memory. Using this strategy, Alvarez-Royo, Zola-Morgan, and Squire (1992) and Overman, Ormsby, and Mishkin (1990) showed that medial temporal lobe lesions impair memory at long delays, but not at very short delays. Studies in rats using delayed nonmatching-to-sample as well as a variety of other memory tasks have also demonstrated that long-term memory is impaired while short-term memory is spared following lesions that involve the hippocampal region (Kesner and Novak 1982; for recent reviews of work in rats, see Further Readings). These findings underscore the idea that medial temporal lobe lesions reproduce a key feature of human amnesia, that is, the distinction between intact short-term memory and impaired long-term memory.

It was originally supposed that damage to the AMYGDALA directly contributed to the memory impairment associated with large medial temporal lobe lesions (Murray and Mishkin 1984). Subsequent work showed that monkeys with virtually complete lesions of the amygdala performed as well as normal monkeys on four different memory tasks, including the delayed nonmatching-to-sample task (Zola-Morgan et al. 1989). Other experiments with rats and mon-

For example, monkeys with combined lesions of the perirhinal and parahippocampal cortices exhibited severe, multimodal, and long-lasting memory impairment (Zola-Morgan et al. 1989; Suzuki et al. 1993). More limited lesions of the cortical regions also produce memory impairment. For example, several studies found that monkeys with bilateral lesions limited to the perirhinal cortex exhibit long-lasting memory impairment (Meunier et al. 1993; Ramus, Zola-Morgan, and Squire 1994). Additionally, a large number of individual studies in monkeys and in rats with varying extents of damage to the medial temporal lobe, together with work in humans, has led to the idea that the severity of memory impairment increases as more components of the medial temporal lobe memory system are damaged.

A long-standing and controversial issue in work on memory has been whether the hippocampal region is disproportionately involved in spatial memory, or whether spatial memory is simply a good example of a broader category of memory that requires the hippocampal region. One view of the matter comes from earlier work with monkeys (Parkinson, Murray, and Mishkin 1988). Monkeys with lesions that involved the hippocampal formation (hippocampus plus underlying posterior entorhinal cortex and parahippocampal cortex) were severely impaired in acquiring an object-place association task, whereas monkeys with lesions that involved the amygdala plus underlying anterior entorhinal cortex and perirhinal cortex were only mildly impaired. The authors suggested that the hippocampus has an especially important role in spatial memory, an idea developed originally by O’Keefe and Nadel (1978), based mostly on rat work. It was unclear from this monkey study, however, whether the observed spatial deficit was due to hippocampal damage, the adjacent cortical damage, or both. Additional work from both humans and animals suggests another view. In one formal study (Cave and Squire 1991), spatial memory was found to be proportionately impaired in amnesiac patients relative to object recognition memory and object recall memory. The same (nonspatial) view of hippocampal function has also been proposed for the rat, based, for example, on demonstrated deficits in odor memory tasks after ibotenate hippocampal lesions (Bunsey and Eichenbaum 1996). The role of the hippocampus in spatial memory remains unclear. Recent commentaries on the issue of the hippocampus and spatial memory can be found under Further Readings.

Uncertainty about the function of the hippocampus has been due, in part, to the inability until recently to make circumscribed lesions limited to the hippocampal region in experimental animals. Studies in which selective lesions of the hippocampal region could be accomplished became possible only with the development of (a) a technique for producing restricted ibotenate lesions of the hippocampus in the rat and (b) a technique that uses MAGNETIC RESONANCE IMAGING to guide the placement of radiofrequency or ibotenic acid stereotaxic lesions of the hippocampal region in the monkey. Monkeys with bilateral, radiofrequency lesions of the hippocampal region, which spared almost entirely the perirhinal, entorhinal, and parahippocampal cortices, exhibited impaired performance at long delays (ten minutes and forty minutes) on the delayed nonmatching-to-sample task (Alvarez, Zola-Morgan, and Squire 1995).

Ibotenic acid lesions cause cell death but, unlike radiofrequency lesions, spare afferent and efferent white matter fibers within the region of the lesion. If it should turn out, after systematic study, that ibotenic acid lesions of the hippocampal region do not impair performance on the delayed nonmatching task, the interpretation of such studies should not be overstated. The results concern recognition memory, not memory in general, and only the kind of recognition memory measured by the nonmatching-to-sample task itself. The delayed nonmatching task has been extraordinarily useful for evaluating the effects on visual recognition memory of damage to the medial temporal lobe memory system and for measuring the severity of recognition memory impairment. However, in the case of human memory, recognition memory tests are known to be rather easy and not as sensitive to memory impairment as other tests, for instance, tests of recall or cued recall. The issue of task sensitivity is crucially important. Other kinds of recognition memory tasks, for example, the paired comparisons task (a task of spontaneous novelty preference; Bachevalier, Brickson, and Hagger 1993) and tasks that are thought to be more sensitive than tasks of simple recognition memory, for example the transverse patterning, the transitive inference, and naturalistic association tasks, have recently been developed to assess memory in animals.

An important question with respect to the components of the medial temporal lobe memory system is whether these structures all share similar functions as part of a common memory system, or whether they have distinct and dissociable functions. In this regard, one must consider the neuroanatomy of the medial temporal lobe system and its pattern of connectivity with association cortex. An extensive anatomical investigation by Suzuki and Amaral (1994) showed that different areas of neocortex gain access to the medial temporal lobe memory system at different points. Visual information arrives preferentially at perirhinal cortex. Approximately 65 percent of the input reaching the perirhinal cortex is unimodal visual information, mostly from TE and TEO. By contrast, about 40 percent of the input reaching parahippocampal cortex is visual, mostly from area V4. Cortical areas that are believed to be important for processing spatial information project preferentially to parahippocampal cortex. Approximately 8 percent of the input to parahippocampal cortex originates in the parietal cortex, whereas virtually none of the input to perirhinal cortex originates in the parietal cortex. These anatomical considerations lead to the expectation that perirhinal cortical lesions might impair visual memory more than spatial memory and that the reverse might be true for parahippocampal cortex. Furthermore, because both the perirhinal and the parahippocampal cortices project to the hippocampus, one might expect that hippocampal damage will similarly impair visual memory and spatial memory. The establishment of new, more sensitive behavioral tests and the development of new techniques for producing selective brain lesions have now made it possible to address these possibilities and to systematically clarify the separate contributions to memory of structures in the medial temporal lobe and the diencephalon.

See also ATTENTION IN THE ANIMAL BRAIN; EPISODIC VS. SEMANTIC MEMORY; IMPLICIT VS. EXPLICIT MEMORY; MEMORY, HUMAN NEUROPSYCHOLOGY; WORKING MEMORY; WORKING MEMORY, NEURAL BASIS OF

—Stuart Zola

References

Aggleton, J. P., and M. Mishkin. (1983). Memory impairments following restricted medial thalamic lesions. Exp. Brain Res. 52: 199–209.
Alvarez-Royo, P., S. Zola-Morgan, and L. R. Squire. (1992). Impairment of long-term memory and sparing of short-term memory in monkeys with medial temporal lobe lesions: A response to Ringo. Behav. Brain Res. 52: 1–5.
Alvarez, P., S. Zola-Morgan, and L. R. Squire. (1995). Damage limited to the hippocampal region produces long-lasting memory impairment. J. Neurosci. 15: 3796–3807.
Bachevalier, J., M. Brickson, and C. Hagger. (1993). Limbic-dependent recognition memory in monkeys develops early in infancy. Neuroreport 4: 77–80.
Bunsey, M., and H. Eichenbaum. (1996). Conservation of hippocampal memory function in rats and humans. Nature 379: 255–257.
Cave, C. B., and L. R. Squire. (1991). Equivalent impairment of spatial and nonspatial memory following damage to the human hippocampus. Hippocampus 1: 329–340.
Gaffan, D., and E. A. Murray. (1992). Monkeys (Macaca fascicularis) with rhinal cortex ablations succeed in object discrimination learning despite 24-hr intertrial intervals and fail at matching to sample despite double sample presentations. Behav. Neurosci. 106: 30–38.
Kesner, R. P., and J. M. Novak. (1982). Serial position curve in rats: Role of the dorsal hippocampus. Science 218: 173–175.
Leonard, B. W., D. G. Amaral, L. R. Squire, and S. Zola-Morgan. (1995). Transient memory impairment in monkeys with bilateral lesions of the entorhinal cortex. J. Neurosci. 15: 5637–5659.
Mahut, H., and M. Moss. (1984). Consolidation of memory: The hippocampus revisited. In L. R. Squire and N. Butters, Eds., Neuropsychology of Memory, vol 1. New York: Guilford Press, pp. 297–315.
Mair, R. G., R. L. Knoth, S. A. Rabehenuk, and P. J. Langlais. (1991). Impairment of olfactory, auditory, and spatial serial reversal learning in rats recovered from pyrithiamine induced thiamine deficiency. Behav. Neurosci. 105: 360–374.
Mair, R. G., J. K. Robinson, S. M. Koger, G. D. Fox, and Y. P. Zhang. (1992). Delayed non-matching to sample is impaired by extensive, but not by limited lesions of thalamus in rats. Behav. Neurosci. 106: 646–656.
Meunier, M., J. Bachevalier, M. Mishkin, and E. A. Murray. (1993). Effects on visual recognition of combined and separate ablations of the entorhinal and perirhinal cortex in rhesus monkeys. J. Neurosci. 13: 5418–5432.
Mishkin, M. (1982). A memory system in the monkey. Philos. Trans. R. Soc. Lond. B. Biol. Sci. 98: 85–95.
Moss, M., H. Mahut, and S. Zola-Morgan. (1981). Concurrent discrimination learning of monkeys after hippocampal, entorhinal, or fornix lesions. J. Neurosci. 1: 227–240.
Murray, E. A. (1992). Medial temporal lobe structures contributing to recognition memory: The amygdaloid complex versus the rhinal complex. In J. P. Aggleton, Ed., The Amygdala: Neurobiological Aspects of Emotion, Memory, and Mental Dysfunction. New York: Wiley-Liss, pp. 453–470.
Murray, E. A., and M. Mishkin. (1984). Severe tactual as well as visual memory deficits following combined removal of the amygdala and hippocampus in monkeys. J. Neurosci. 4: 2565–2580.
O’Keefe, J., and L. Nadel. (1978). The Hippocampus as a Cognitive Map. Oxford: Clarendon Press.
Overman, W. H., G. Ormsby, and M. Mishkin. (1990). Picture recognition vs. picture discrimination learning in monkeys with medial temporal removals. Exp. Brain Res. 79: 18–24.
Parkinson, J. K., E. A. Murray, and M. Mishkin. (1988). A selective mnemonic role for the hippocampus in monkeys: Memory for the location of objects. J. Neurosci. 8: 4159–4167.
Ramus, S. J., S. Zola-Morgan, and L. R. Squire. (1994). Effects of lesions of perirhinal cortex or parahippocampal cortex on memory in monkeys. Soc. Neurosci. Abst. 20: 10–74.
Scoville, W. B., and B. Milner. (1957). Loss of recent memory after bilateral hippocampal lesions. J. Neurol. Neurosurg. Psychiatry 20: 11–21.
Squire, L. R., and S. Zola-Morgan. (1983). The neurology of memory: The case for correspondence between the findings for human and nonhuman primates. In J. A. Deutsch, Ed., The Physiological Basis of Memory. New York: Academic Press, pp. 199–268.
Suzuki, W. A., and D. G. Amaral. (1994). Perirhinal and parahippocampal cortices of the macaque monkey: Cortical afferents. J. Comp. Neurol. 350: 497–533.
Suzuki, W. A., S. Zola-Morgan, L. R. Squire, and D. G. Amaral. (1993). Lesions of the perirhinal and parahippocampal cortices in the monkey produce long-lasting memory impairment in the visual and tactual modalities. J. Neurosci. 13: 2430–2451.
Zola-Morgan, S. M., and L. R. Squire. (1985). Amnesia in monkeys following lesions of the mediodorsal nucleus of the thalamus. Ann. Neurol. 17: 558–564.
Zola-Morgan, S. M., and L. R. Squire. (1993). Neuroanatomy of memory. Ann. Rev. Neurosci. 16: 547–563.
Zola-Morgan, S., L. R. Squire, D. G. Amaral, and W. A. Suzuki. (1989). Lesions of perirhinal and parahippocampal cortex that spare the amygdala and hippocampal formation produce severe memory impairment. J. Neurosci. 9: 4355–4370.

Further Readings

Alvarado, M. C., and J. W. Rudy. (1995). Rats with damage to the hippocampal-formation are impaired on the transverse-patterning problem but not on elemental discriminations. Behav. Neurosci. 109: 204–211.
Alvarez-Royo, P., R. P. Clower, S. Zola-Morgan, and L. R. Squire. (1991). Stereotaxic lesions of the hippocampus in monkeys: Determination of surgical coordinates and analysis of lesions using magnetic resonance imaging. J. Neurosci. Methods 38: 223–232.
Amaral, D. G., Ed. (1991). Is the hippocampal formation preferentially involved in spatial behavior? Hippocampus (special issue) 1: 221–292.
Bunsey, M., and H. Eichenbaum. (1995). Selective damage to the hippocampal region blocks long-term retention of a natural and nonspatial stimulus-stimulus association. Hippocampus 5: 546–556.
Eichenbaum, H. (1997). Declarative memory: Insights from cognitive neurobiology. Annu. Rev. Psychol. 48: 547–572.
Eichenbaum, H., T. Otto, and N. J. Cohen. (1994). Two functional components of the hippocampal memory system. Behav. and Brain Sci. 17: 449–517.
Horel, J. A., D. E. Pytko-Joiner, M. Voytko, and K. Salsbury. (1987). The performance of visual tasks while segments of the inferotemporal cortex are suppressed by cold. Behav. Brain Res. 23: 29–42.
Jaffard, R., and M. Meunier. (1993). Role of the hippocampal formation in learning and memory. Hippocampus 3: 203–218.
Jarrard, L. E. (1993). On the role of the hippocampus in learning and memory in the rat. Behav. Neural Biol. 60: 9–26.
Jarrard, L. E., and B. S. Meldrum. (1993). Selective excitotoxic pathology in the rat hippocampus. Neuropathol. Appl. Neurobiol. 19: 381–389.
Mair, R. G., C. D. Anderson, P. J. Langlais, and W. J. McEntree. (1988). Behavioral impairments, brain lesions and monoaminergic activity in the rat following a bout of thiamine deficiency. Behav. Brain Res. 27: 223–239.
Mishkin, M. (1978). Memory in monkeys severely impaired by combined but not separate removal of the amygdala and hippocampus. Nature 273: 297–298.
Nadel, L. (1995). The role of the hippocampus in declarative memory: A comment on Zola-Morgan, Squire, and Ramus (1994). Hippocampus 5: 232–234.
Vnek, N., T. C. Gleason, and L. F. Kromer. (1995). Entorhinal-hippocampal connections and object memory in the rat: Acquisition versus retention. J. Neurosci. 15: 3193–3199.
Zola-Morgan, S., L. R. Squire, and S. J. Ramus. (1995). The role of the hippocampus in declarative memory: A reply to Nadel. Hippocampus 5: 235–239.

Memory, Human Neuropsychology

LEARNING is the process by which new knowledge is acquired about the world. MEMORY is the process by which what is learned can be retained in storage with the possibility of drawing on it later. Most of what humans know about the world is not built into the brain at the time of birth but is acquired through experience. It is learned, stored in the brain as memory, and is available later to be retrieved.

Memory is localized in the brain as physical changes produced by experience. Memory is thought to be stored as changes in synaptic connectivity within large ensembles of neurons. New synaptic connections may be formed, and there are changes as well in the strength of existing synapses. What makes a memory is not the manufacture of some chemical code, but rather increases and decreases in the strength of already existing neural connections and formation of new connections. What makes the memory specific (memory of a trip to England instead of memory of a drive to the hardware store) is not the kind of cellular and molecular event that occurs in the brain, but where in the nervous system the changes occur and along which pathways.

The brain is highly specialized and differentiated, organized so that different regions of neocortex simultaneously carry out computations on separate features of the external world (e.g., the analysis of form, color, and movement). Memory of a specific event, or even memory of something so apparently simple as a single object, is thought to be stored in a distributed fashion, essentially in component parts. These components are stored in the same neural systems in neocortex that ordinarily participate in the processing and analysis of what is to be remembered. In one sense, memory is the persistence of perception. It is stored as outcomes of perceptual operations and in the same cortical regions that are ordinarily involved in the processing of the items and events that are to be remembered.

It has long been appreciated that severe memory impairment can occur against a background of otherwise normal intellectual function. This dissociation shows that the brain has to some extent separated its intellectual and perceptual functions from its capacity for laying down in memory the records that ordinarily result from intellectual and perceptual work. Specifically, the medial temporal lobe and the midline diencephalon of the brain have specific memory functions, and bilateral damage to these regions causes an amnesic syndrome. The amnesic syndrome is characterized by profound forgetfulness for new material (anterograde amnesia), regardless of the sensory modality through which the material is presented and regardless of the kind of material that is presented (faces, names, stories, musical passages, or shapes). Immediate memory, as measured by digit span, is intact. However, a memory deficit is easily detected with conventional memory tests that ask subjects to learn and remember an amount of information that exceeds what can be held in immediate memory or with memory tests that ask subjects to learn even a small amount of information and then hold onto it for several minutes in the face of distraction. The impairment appears whether memory is tested by unaided (free) recall, by recognition, or by cued recall. As assessed by these various instruments, the deficit in amnesic patients is proportional to the sensitivity with which these tests measure memory in intact subjects. Recognition is easier than recall for all subjects, amnesic patients and normal subjects alike.

The same brain lesions that cause difficulties in new learning also cause retrograde amnesia, difficulty in recollecting events that occurred prior to the onset of amnesia. Typically, retrograde amnesia is temporally graded such that very old (remote) memory is affected less than recent memory. Retrograde amnesia can cover as much as a decade or two prior to the onset of amnesia. These observations show that the structures damaged in amnesia are not the repositories of long-term memory. Rather, these structures are essential, beginning at the time of learning, and they are thought to drive a gradual process of memory consolidation in neocortex. As the result of this process, memory storage in neocortex comes to be independent of the medial temporal lobe and diencephalic structures that are damaged in amnesia.

Information about what specific structures are important for human memory comes from carefully studied cases of amnesia, which provide both neuropsychological and neurohistological information, and from the study of an animal model of human amnesia in the monkey. The available human cases make several points. First, damage limited bilaterally to the CA1 region of the HIPPOCAMPUS is sufficient to cause a moderately severe anterograde amnesia. Second, when more damage occurs in the hippocampal formation (e.g., damage to the CA fields, dentate gyrus, subicular complex, and some cell loss in entorhinal cortex), the anterograde amnesia becomes more severe. Third, damage limited bilaterally to the hippocampal formation is sufficient to produce temporally limited retrograde amnesia covering more than 10 years.

Systematic and cumulative work in monkeys has further demonstrated that the full medial temporal lobe memory system consists of the hippocampus and adjacent, anatomically related structures: entorhinal cortex, perirhinal cortex, and parahippocampal cortex. The critical regions of the medial diencephalon important for memory appear to be the mediodorsal thalamic nucleus, the anterior nucleus, the mammillary nuclei, and the structures within and interconnected by the internal medullary lamina.

One fundamental distinction in the neuropsychology of memory separates immediate memory from long-term memory. Indeed, this is the distinction that is revealed by the facts of human amnesia. In addition, a number of distinctions can be made within long-term memory. Memory is not a unitary mental faculty but depends on the operation of several separate systems that operate in parallel to record the effects of experience. The major distinction is between the capacity for conscious recollection about facts and events (so-called declarative or explicit memory) and a collection of nonconscious memory abilities (so-called nondeclarative or implicit memory), whereby memory is expressed through performance without any necessary conscious memory content or even the experience that memory is being used.

Declarative memory is the kind of memory that is impaired in amnesia. Declarative memory is a brain-systems construct. It is the kind of memory that depends on the integrity of the medial temporal lobe–diencephalic brain structures damaged in amnesia. Declarative memory is involved in modeling the external world, in storing representations of objects, episodes, and facts. It is fast, specialized for one-trial learning, and for making arbitrary associations or conjunctions between stimuli. The acquired representations are flexible and available to multiple response systems. Nondeclarative memory is not itself a brain-systems construct, but rather an umbrella term for several kinds of memory, each of which has its own brain organization. Nondeclarative memory underlies changes in skilled behavior, the development through repetition of appropriate ways to respond to stimuli, and it underlies the phenomenon of priming, a temporary change in the ability to identify or detect perceptual objects. In these cases, performance changes as the result of experience and therefore deserves the name memory, but like drug tolerance or immunological memory, performance changes without providing a record of the particular episodes that led to the change in performance. What is learned tends to be encapsulated and inflexible, available most readily to the same response systems that were involved in the original learning.

Among the prominent kinds of nondeclarative memory are procedural memory (memory for skills and habits), simple classical CONDITIONING, and the phenomenon of priming. Skill and habit memory depends importantly on the dorsal striatum, even when motor activity is not an important part of the task. Thus, nondemented patients with Parkinson’s disease, who have dorsal striatal damage, are impaired at learning a two-choice discrimination task where the correct answer on each trial is determined probabilistically. In this task, normal subjects learn gradually, not by memorizing the cues and their outcomes, but by gradually developing a disposition to respond differentially to the cues that are presented. Classical conditioning of skeletal musculature (e.g., eyeblink conditioning) depends on cerebellar and brain stem pathways. Emotional learning, including fear conditioning, depends on the amygdaloid complex. In the case of fear conditioning, subjects will often remember the unpleasant, aversive event. This component of memory is declarative and depends on the medial temporal lobe and diencephalon. But subjects may also develop a negative feeling about the stimulus object, perhaps even a phobia, and this component of remembering depends on the amygdala. The AMYGDALA also appears to be an important modulator of both declarative and nondeclarative forms of memory. For example, activity originating in the amygdala appears to underlie the observation that emotional events are typically remembered better than neutral events. Finally, the phenomenon of priming appears to depend on the neocortical pathways that are involved in processing the material that is primed. Neuroimaging studies have described reductions in activity in posterior neocortex in correspondence with perceptual priming.

Information is still accumulating about how memory is organized, what structures and connections are involved, and what jobs they do. The disciplines of both psychology and neuroscience contribute to this enterprise.

See also AGING, MEMORY, AND THE BRAIN; EPISODIC VS. SEMANTIC MEMORY; IMPLICIT VS. EXPLICIT MEMORY; MEMORY, ANIMAL STUDIES; MEMORY STORAGE, MODULATION OF; WORKING MEMORY; WORKING MEMORY, NEURAL BASIS OF

—Larry Squire

Further Readings

Knowlton, B. J., J. Mangels, and L. R. Squire. (1996). A neostriatal habit learning system in humans. Science 273: 1399–1402.
Rempel-Clower, N., S. M. Zola, L. R. Squire, and D. G. Amaral. (1996). Three cases of enduring memory impairment following bilateral damage limited to the hippocampal formation. Journal of Neuroscience 16: 5233–5255.
Squire, L. R., B. J. Knowlton, and G. Musen. (1993). The structure and organization of memory. Annual Review of Psychology 44: 453–495.
Squire, L. R., and S. Zola-Morgan. (1991). The medial temporal lobe memory system. Science 253: 1380–1386.

Memory Storage, Modulation of

The formation of lasting, long-term memory occurs gradually, over time, following learning. A century ago Mueller and Pilzecker (1900) proposed that the neural processes underlying new memories persist in a short-lasting modifiable state and then, with time, become consolidated into a relatively long-lasting state. Later, HEBB (1949) proposed that the first stage of the “dual-trace” memory system is based on reverberating neural circuits and that such neural activity induces lasting changes in synaptic connections that provide the basis for long-term memory.

Clinical and experimental evidence strongly supports the hypothesis that memory storage is time-dependent. Disruption of brain activity shortly after learning impairs long-term memory. In humans, acute brain trauma produces retrograde amnesia, a selective loss of memory for recent experiences (Burnham 1903; Russell and Nathan 1946), and in animals retrograde amnesia is induced by many treatments that impair brain functioning, including electrical brain stimulation and drugs (McGaugh and Herz 1972). Additionally, and more importantly, in humans as well as animals (Soetens et al. 1995), stimulant drugs administered shortly after learning enhance memory. Drugs affecting many neurotransmitter and hormonal systems improve long-term memory when they are administered within a few minutes or hours after training (McGaugh 1973, 1983). Extensive evidence indicates that the drugs enhance the consolidation of long-term memory.

Our memories of experiences vary greatly in strength. Some memories fade quickly and completely, whereas others last a lifetime. Generally, remembrance of experiences varies with their significance; emotionally arousing events are better remembered (Christianson 1992). William JAMES observed that, “An experience may be so exciting emotionally as almost to leave a scar on the cerebral tissue” (James 1890). Studies of retrograde amnesia and memory enhancement provide important clues to the physiological systems underlying variations in memory strength. In particular, the finding that drugs enhance memory consolidation suggests that hormonal systems activated by emotional arousal may influence consolidation (Cahill and McGaugh 1996).

Emotionally exciting experiences induce the release of adrenal hormones, including the adrenal medullary hormone epinephrine (Adrenaline) and the adrenal cortex hormone corticosterone (in humans, cortisol). Experiments with animal and human subjects indicate that these hormones, as well as other hormones released by learning experiences, play an important role in regulating memory storage (Izquierdo and Diaz 1983; McGaugh and Gold 1989). Administration of epinephrine to rats or mice shortly after training enhances their long-term memory of the training (Gold, McCarty, and Sternberg 1982). β-adrenergic antagonists such as propranolol block the memory enhancement induced by epinephrine. Comparable findings have been obtained in studies with human subjects. The finding that β-adrenergic antagonists block the enhancing effects of emotional arousal on long-term memory formation in humans supports the hypothesis that β-adrenergic agonists, including epinephrine, modulate memory storage (Cahill et al. 1994). Additionally, studies of the effects of corticosterone, as well as synthetic glucocorticoid receptor agonists and antagonists, indicate that memory storage is enhanced by glucocorticoid agonists and impaired by antagonists. Furthermore, hormones of the adrenal medulla and adrenal cortex interact in modulating memory storage: metyrapone, a drug that impairs the synthesis and release of corticosterone, blocks the effects of epinephrine on memory consolidation (Sandi and Rose 1994; De Kloet 1991; Roozendaal, Cahill, and McGaugh 1996).

Recent research has revealed brain regions mediating drug and hormone influences on memory storage. Considerable evidence indicates that many drugs and hormones modulate memory through influences involving the amygdaloid complex. It is well established that electrical stimulation of the AMYGDALA modulates memory storage and that the effect is influenced by adrenal hormones (Liang, Bennett, and McGaugh 1985). Lesions of the stria terminalis (a major amygdala pathway that connects the amygdala with many brain regions) block the memory-modulating effects of many drugs and hormones, including those of adrenal hormones. Furthermore, lesions of the amygdala and, more specifically, lesions of the basolateral amygdala nucleus, also block the effects of adrenal hormones on memory storage (McGaugh, Cahill, and Roozendaal 1996).

In humans, amygdala lesions block the effects of emotional arousal on long-term memory (Cahill et al. 1995). In animals, infusions of β-adrenergic and glucocorticoid antagonists into the amygdala impair memory, whereas infusions of β-adrenergic agonists (e.g., norepinephrine) and glucocorticoid receptor agonists into the amygdala after training enhance memory. As was found with lesions, the critical site for infusions is the basolateral amygdala nucleus (Gallagher et al. 1981; Liang, Juler, and McGaugh 1986; McGaugh et al. 1996). Findings such as these indicate that the basolateral amygdala nucleus is an important and perhaps critical brain region mediating arousal-induced neuromodulatory influences on memory storage.

Thus, there is extensive evidence from human and animal studies that the amygdala is critically involved in modulating memory consolidation. However, it is also clear from the findings of many studies that the amygdala is not the neural locus of long-term memory. Lesions of the amygdala induced after training do not block retention of the memory of the training (Parent, West, and McGaugh 1994). Additionally, the amygdala is not the locus of neural changes underlying the enhanced memory induced by infusing drugs into the amygdala immediately after training. Drug infusions administered into the amygdala post training enhance long-term memory for training in many types of tasks, including tasks known to involve the HIPPOCAMPUS or caudate nucleus. Furthermore, inactivation of the amygdala with lidocaine infusions prior to retention testing does not block the enhanced memory (Packard, Cahill, and McGaugh 1994).

Research with humans has provided additional evidence that amygdala activation is involved in modulating the consolidation of long-term memory. In subjects tested several weeks after viewing emotionally arousing film clips, memory of the content of the film clips correlated very highly with activation of the amygdala when viewing the film clips, as indicated by POSITRON EMISSION TOMOGRAPHY (PET) brain scans (Cahill et al. 1996).

The formation of new long-term memory must, of course, involve the formation of lasting neural changes. Additionally, the strength of the induced neural changes, as subsequently reflected in long-term memory, is modulated by the actions of specific hormonal and brain systems activated by learning experiences. Such modulation serves to ensure that the significance of experiences will influence their remembrance. Investigations of the processes and systems underlying the modulation of memory storage are providing new insights into how memories are created and sustained.

See also EPISODIC VS. SEMANTIC MEMORY; IMPLICIT VS. EXPLICIT MEMORY; MEMORY; NEUROTRANSMITTERS

—James L. McGaugh

References

Burnham, W. H. (1903). Retrograde amnesia: Illustrative cases and a tentative explanation. Am. J. Psychol. 14: 382–396.
Cahill, L., and J. L. McGaugh. (1996). Modulation of memory storage. Current Opinion in Neurobiology 6: 237–242.
Cahill, L., B. Prins, M. Weber, and J. L. McGaugh. (1994). β-Adrenergic activation and memory for emotional events. Nature 371: 702–704.
Cahill, L., R. Babinsky, H. J. Markowitsch, and J. L. McGaugh. (1995). The amygdala and emotional memory. Nature 377: 295–296.
Cahill, L., R. J. Haier, J. Fallon, M. Alkire, C. Tang, D. Keator, J. Wu, and J. L. McGaugh. (1996). Proc. Natl. Acad. Sci. U.S.A. 93: 8016–8021.
Christianson, S.-A., Ed. (1992). Handbook of Emotion and Memory: Current Research and Theory. Hillsdale, NJ: Erlbaum.
De Kloet, E. R. (1991). Brain corticosteroid receptor balance and homeostatic control. Neuroendocrinology 12: 95–164.
Gallagher, M., B. S. Kapp, J. P. Pascoe, and P. R. Rapp. (1981). A neuropharmacology of amygdaloid systems which contribute to learning and memory. In Y. Ben-Ari, Ed., The Amygdaloid Complex. Amsterdam: Elsevier.
Gold, P. E., R. McCarty, and D. B. Sternberg. (1982). Peripheral catecholamines and memory modulation. In C. Ajmone Marson and H. Matthies, Eds., Neuronal Plasticity and Memory Formation. New York: Raven Press, pp. 327–338.
Hebb, D. O. (1949). The Organization of Behavior. New York: Wiley.
Izquierdo, I., and R. D. Diaz. (1983). Effect of ACTH, epinephrine, β-endorphin, naloxone and of the combination of naloxone or β-endorphin with ACTH or epinephrine on memory consolidation. Psychoneuroendocrinology 8: 81–87.
James, W. (1890). The Principles of Psychology. New York: Henry Holt.
Liang, K. C., C. Bennett, and J. L. McGaugh. (1985). Peripheral epinephrine modulates the effects of posttraining amygdala stimulation on memory. Behav. Brain Res. 15: 83–91.
Liang, K. C., R. G. Juler, and J. L. McGaugh. (1986). Modulating effects of posttraining epinephrine on memory: Involvement of the amygdala noradrenergic system. Brain Research 368: 125–133.
McGaugh, J. L. (1973). Drug facilitation of learning and memory. Ann. Rev. Pharmacol. 13: 229–241.
McGaugh, J. L. (1983). Hormonal influences on memory. Ann. Rev. Psychol. 34: 297–323.
McGaugh, J. L., and P. E. Gold. (1989). Hormonal modulation of memory. In R. B. Rush and S. Levine, Eds., Psychoendocrinology. New York: Academic Press.
McGaugh, J. L., and M. J. Herz. (1972). Memory Consolidation. San Francisco: Albion.
McGaugh, J. L., L. Cahill, and B. Roozendaal. (1996). Involvement of the amygdala in memory storage: Interaction with other brain systems. Proc. Natl. Acad. Sci. U.S.A. 93: 13508–13514.
Mueller, G. E., and A. Pilzecker. (1900). Experimentelle Beiträge zur Lehre vom Gedächtnis. Z. Psychol. 1: 1–288.
Packard, M. G., L. Cahill, and J. L. McGaugh. (1994). Amygdala modulation of hippocampal-dependent and caudate nucleus-dependent memory processes. Proc. Natl. Acad. Sci. U.S.A. 91: 8477–8481.
Parent, M. B., M. West, and J. L. McGaugh. (1994). Memory of rats with amygdala lesions induced 30 days after footshock-motivated escape training reflects degree of original training. Behavioral Neuroscience 6: 1959–1064.
Roozendaal, B., L. Cahill, and J. L. McGaugh. (1996). Interaction of emotionally activated neuromodulatory systems in regulating memory storage. In K. Ishikawa, J. L. McGaugh, and H. Sakata, Eds., Brain Processes and Memory, Excerpta Medica International Congress Series 1108. Amsterdam: Elsevier, pp. 39–54.
Russell, W. R., and P. W. Nathan. (1946). Traumatic amnesia. Brain 69: 280–300.
Sandi, C., and S. P. R. Rose. (1994).

524 Mental Causation

in silicon-based systems of Alpha Centaurians, or appropriately programmed computing machines. There is, then, no prospect of locating a unique physical property to identify with pain.

When, however, we try to reconcile “multiple realizability” with the idea that the physical realm is causally self-contained, trouble arises. Consider your body, a complex physical system composed of microparticles interacting in accord with fundamental physical laws. The behavior of those particles, hence the behavior of your body, is completely determined (albeit probabilistically) by those laws. Now suppose you step on a tack, experience a pain, and quickly withdraw your foot. Common sense tells us that your pain played a causal role in your foot’s moving. But can this be right? Your experiencing a pain is a matter of your possessing a “higher-level” property, a property thought to be distinct from any of the properties possessed by your “lower-level” physical constituents. It appears, however, that your behavior is entirely determined by “lower-level” nonmental goings-on. In what sense, then, is your experience of pain “causally relevant” to the movement of your foot?

Some philosophers have responded to this difficulty by adopting a deflationary view of CAUSATION. As they see it, causation is just counterfactual dependence: roughly, if E
Corticosterone enhances long- would not have occurred unless C had, then C causes E term retention in one-day old chicks trained in a weak passive (LePore and Loewer 1987). Others, appealing to scientific avoidance paradigm. Brain Research 647: 106–112. practice, have suggested that we replace metaphysically Soetens, E., S. Casear, R. D’Hooge, and J. E. Hueting. (1995). loaded references to causes with talk of causal explanation Effect of amphetamine on long-term retention of verbal mate- (Wilson 1995). Still others argue that the causal relevance of rial. Psychopharmacology 119: 155–162. “higher-level” properties requires reduction: “Higher-level” mental properties must be identified with “lower-level” Mental Causation properties (Kim 1989). This strategy resolves one problem of mental causation, but at a cost few philosophers seem The problem of mental causation is the most recent incarna- willing to pay. tion of the venerable MIND-BODY PROBLEM, Schopen- Lack of enthusiasm for reduction is due in part to the hauer’s “world knot”. DESCARTES held that mind and body widespread belief that mental properties are “multiply real- were distinct kinds of entity that interact causally, but waf- izable,” hence distinct from their physical realizers, and in fled over the question how this was possible. Mind-body part to a no less widespread commitment to externalism. dualism of the Cartesian sort is no longer popular, but vari- Externalists hold that the “contents” of states of mind ants of the Cartesian problem remain, and waffling is still in depend on agents’ contexts. Wayne, for example, believes fashion. Current worries about mental causation stem from that water is wet. Wayne’s belief concerns water, and the two sources: (1) “nonreductive” conceptions of the mental; “content” of his belief is that water is wet. 
Imagine an exact and (2) externalism (or anti-INDIVIDUALISM) about mental duplicate of Wayne, Dwayne, who inhabits a distant planet, “content.” TWIN EARTH, an exact duplicate of Earth with one important Although most theorists have left dualism behind, many difference: the colorless, tasteless, transparent liquid that remain wedded to the Cartesian idea that the mental and the fills rivers and bathtubs on Twin Earth differs in its molecu- physical are fundamentally distinct. Mental properties, lar constitution from water. Water is H2O. The substance on though properties of physical systems, are “higher-level” Twin Earth that Dwayne and his fellows call “water” is properties not reducible to or identifiable with “lower-level” XYZ. Now, so the story goes, although Wayne and Dwayne properties of those systems. Most functionalist accounts of are alike intrinsically, their thoughts differ. Wayne believes the mind embrace this picture, endorsing the slogan that that water is wet; whereas Dwayne’s beliefs concern, not mental properties are “multiply realizable.” Take the prop- water, but what we might call “twin water” (see Putnam erty of being in pain. This property, like the property of 1975). being an eye, is said to be a functional, “second-order” Thought experiments of this sort have convinced many property, one that a creature possesses by virtue of possess- philosophers that the contents of thoughts depend, at least in ing some first-order “realizer”—a particular physical con- part, on thinkers’ surroundings and causal histories (see figuration or process, for instance. Pains are capable of Burge 1986; Davidson 1987; Baker 1987). A contextualism endless “realizations” in the human nervous system, in the of this sort introduces a new twist on the problem of mental very different nervous system of a cephalopod, and perhaps causation, and simultaneously renders REDUCTIONISM even Mental Models 525 less attractive. 
Surely the contents of your thoughts are rele- Davidson, D. (1987). Knowing one’s own mind. Proceedings and Addresses of the American Philosophical Association 60: 441– vant to what those thoughts lead you to do. You flee because 458. you believe the creature on the path in front of you is a Fodor, J. (1987). Psychosemantics. Cambridge: MIT Press. skunk. Had you believed instead that the creature was a cat, Heil, J., and A. Mele, Eds. (1993). Mental Causation. Oxford: you would have behaved differently. If the content of your Clarendon Press. belief—its being a belief about a skunk—depends on your Jackson, F. (1996). Mental causation. Mind 105: 377–413. causal history, however, how could it make a here-and-now Kim, J. (1989). The myth of nonreductive materialism. Proceed- physical difference to the way you move your body? ings and Addresses of the American Philosophical Association A molecule, a billiard ball, a planet, or a brain behaves as 63: 31–47. Reprinted in Supervenience and Mind. Cambridge: it does and reacts to incoming stimuli because of its intrinsic Cambridge University Press, 1993, pp. 265–284. physical makeup. But if everything you do is a function of LePore, E., and B. Loewer. (1987). Mind matters. Journal of Phi- losophy 84: 630–642. your intrinsic physical properties, and if the contents of your Putnam, H. (1975). The meaning of “meaning.” In Mind, Lan- thoughts depend on relations you bear to other things, then guage and Reality. Cambridge: Cambridge University Press, it is hard to see how the contents of your thoughts could pp. 215–271. make any difference at all to what you do. Wilson, R. (1995). Cartesian Psychology and Physical Minds. Again, some philosophers have sought to accommodate Cambridge: Cambridge University Press. externalism and mental causation via deflationary accounts of causation or appeals to explanatory norms. 
Others have Mental Models defended a notion of “narrow content,” mental content that depends only on agents’ intrinsic composition (Fodor 1987). Wayne and Dwayne, for instance, are said to enter- As psychological representations of real, hypothetical, or tain thoughts that have the same “narrow content” but differ imaginary situations, mental models were first postulated by in their “broad content.” Externalists have been unenthusi- the Scottish psychologist Kenneth Craik (1943), who wrote astic about “narrow content.” And, in any case, even if we that the mind constructs “small-scale models” of reality to embrace “narrow content,” so long as we assume that men- anticipate events, to reason, and to underlie EXPLANATION. tal properties are irreducible “higher-level” properties of The models are constructed in working memory as a result physical systems, we are left with our initial worry about the of perception, the comprehension of discourse, or imagina- causal irrelevance of “higher-level” properties. tion (see MARR 1982; Johnson-Laird 1983). A crucial fea- Externalism aside, perhaps we could make progress by ture is that their structure corresponds to the structure of distinguishing predicates and properties. Predicates apply to what they represent. Mental models are accordingly akin to objects by virtue of properties those objects possess, but not architects’ models of buildings and to chemists’ models of every predicate designates a property. The predicate “is a complex molecules. tree” applies to objects by virtue of their properties, but The structure of a mental model contrasts with another there is no property of being a tree common to all trees. Per- sort of MENTAL REPRESENTATION. Consider the assertion haps mental predicates are like this. The predicate “pain,” The triangle is on the right of the circle. 
for instance, might apply to many different kinds of object, not because these objects share some single property, but Its meaning can be encoded in the mind in a propositional because they are similar in important ways: they possess representation, for example: distinct, though similar, first-order physical properties (right-of triangle circle) (which have uncontroversial causal roles). A view of this sort allows that “pain” applies truly to creatures in distress, The structure of this representation is syntactic, depending although it obliges us to abandon the philosopher’s conceit on the conventions governing the LANGUAGE OF THOUGHT: that the predicate “pain” thereby designates a property The predicate “right-of” precedes its subject “triangle” and shared by every creature to whom “pain” is truly predicable. its object “circle.” In contrast, the situation described by the Whether these remarks are on the right track, they sug- assertion can be represented in a mental model: gest that the philosophy of mind would benefit from an infu- sion of good old-fashioned metaphysics. Until we are clear ∆ O on the nature of properties, for instance, or the character of “multiple realizability,” we shall not be in a position to make headway on the problem of mental causation. The structure of this representation is spatial: it is isomor- phic to the actual spatial relation between the two objects. See also EPIPHENOMENALISM; INTENTIONALITY; PHYSI- The model captures what is common to any situation where CALISM; PROPOSITIONAL ATTITUDES; SUPERVENIENCE a triangle is on the right of a circle. Although it represents —John Heil nothing about their distance apart or other such matters, the shape and size of the tokens can be revised to take into References account subsequent information. Mental models appear to underlie visual IMAGERY. Unlike images, however, they can Baker, L. R. (1987). Saving Belief. 
Princeton: Princeton University represent three dimensions (see MENTAL ROTATION), nega- Press. tion, and other abstract notions. The construction of models Burge, T. (1986). Individualism and psychology. Philosophical from propositional representations of discourse is part of the Review 45: 3–45. 526 Mental Models process of comprehension and of establishing that different Nearly everyone responds “yes” (Johnson-Laird and Gold- expressions refer to the same entity. How this process varg 1997). Yet the response is a fallacy. If there were an ace occurs has been investigated in detail (e.g., Garnham and in the hand, then two of the assertions would be true, con- Oakhill 1996). trary to the rubric that only one of them is true. The illusion If mental models are the end result of perception and arises because individuals’ mental models represent what is comprehension, they can underlie reasoning. Individuals true for each premise, but not what is false concomitantly use them to formulate conclusions, and test the strength of for the other two premises. A variety of such illusions occur these conclusions by checking whether other models of the in all the main domains of reasoning. They can be reduced premises refute them (Johnson-Laird and Byrne 1991). This by making what is false more salient. theory is an alternative to the view that DEDUCTIVE REASON- The term mental model is sometimes used to refer to the ING depends on formal rules of inference akin to those of a representation of a body of knowledge in long-term mem- logical calculus. The distinction between the two theories ory, which may have the same sort of structure as the mod- parallels the one in LOGIC between proof-theoretic methods els used in reasoning. Psychologists have investigated based on formal rules and model-theoretic methods based, mental models of such physical systems as handheld cal- say, on truth tables. 
Which psychological theory provides a culators, the solar system, and the flow of electricity (Gen- better account of human reasoning is controversial, but tner and Stevens 1983). They have studied how children mental models have a number of advantages. They provide a develop such models (Halford 1993), how to design arti- unified account of deductive, probabilistic, and modal rea- facts and computer systems for which it is easy to acquire soning. People infer that a conclusion is necessary —it must models (Ehrlich 1996), and how models of one domain be true—if it holds in all of their models of the premises; may serve as an ANALOGY for another domain. Research- that it is probable—it is likely to be true—if it holds in most ers in artificial intelligence have similarly developed qual- of their models of the premises; and that it is possible—it itative models of physical systems that make possible may be true—if it holds in at least one of their models of the “commonsense” inferences (e.g., Kuipers 1994). To under- premises. Thus an assertion such as stand phenomena as a result either of short-term processes such as vision and inference or of long-term experience There is a circle or there is a triangle, or both appears to depend on the construction of mental models. yields three models, each of which corresponds to a true The embedding of one model within another may play a possibility, shown here on separate lines: critical role in METAREPRESENTATION and CONSCIOUS- NESS. O See also CAUSAL REASONING; SCHEMATA ∆ —Philip N. Johnson-Laird ∆ O References The modal conclusion Craik, K. (1943). The Nature of Explanation. Cambridge: Cam- bridge University Press. It is possible that there is both a circle and a triangle Ehrlich, K. (1996). Applied mental models in human-computer interaction. In J. Oakhill and A. Garnham, Eds., Mental Models follows from the assertion, because it is supported by the in Cognitive Science. Mahwah, NJ: Erlbaum. third model. 
Experiments show that the more models Garnham, A., and J. V. Oakhill. (1996). The mental models theory needed for an inference, the longer the inference takes and of language comprehension. In B. K. Britton and A. C. the more likely an error is to occur (Johnson-Laird and Graesser, Eds., Models of Understanding Text. Hillsdale, NJ: Byrne 1991). Models also have the advantage that they can Erlbaum, pp. 313–339. serve as counterexamples to putative conclusions—an Gentner, D., and A. L. Stevens, Eds. (1983). Mental Models. Hills- advantage over formal rules of inference that researchers in dale, NJ: Erlbaum. artificial intelligence exploit in LOGICAL REASONING SYS- Halford, G. S. (1993). Children’s Understanding: The Develop- TEMS (e.g., Halpern and Vardi 1991). ment of Mental Models. Hillsdale, NJ: Erlbaum. Halpern, J. Y., and M. Y. Vardi. (1991). Model checking vs. theo- Mental models represent explicitly what is true, but not rem proving: A manifesto. In J. A. Allen, R. Fikes, and E. what is false (see the models of the disjunction above). An Sandewall, Eds., Principles of Knowledge Representation and unexpected consequence of this principle is the existence of Reasoning: Proceedings of the Second International Confer- “illusory inferences” to which nearly everyone succumbs ence. San Mateo, CA: Kaufmann, pp. 325–334. (Johnson-Laird and Savary 1996). Consider the following Johnson-Laird, P. N. (1983). Mental Models: Towards a Cognitive problem: Science of Language, Inference, and Consciousness. Cam- bridge: Cambridge University Press; Cambridge, MA: Harvard Only one assertion is true about a particular hand of University Press. cards: Johnson-Laird, P. N., and R. M. J. Byrne. (1991). Deduction. Hills- dale, NJ: Erlbaum. There is a king in the hand or there is an ace, or both. Johnson-Laird, P. N., and Y. Goldvarg. (1997). How to make the There is a queen in the hand or there is an ace, or both. impossible seem possible. 
In Proceedings of the Nineteenth There is a jack in the hand or there is a ten, or both. Annual Conference of the Cognitive Science Society, Stanford, Is it possible that there is an ace in the hand? CA. Hillsdale, NJ: Erlbaum, pp. 354–357. Mental Representation 527 device (see COMPUTATIONAL THEORY OF MIND), the mental Johnson-Laird, P. N., and F. Savary. (1996). Illusory inferences about probabilities. Acta Psychologica 93: 69–90. representation bearers will be computational structures or Kuipers, B. (1994). Qualitative Reasoning: Modeling and Simula- states. The specific nature of these structures or states tion with Incomplete Knowledge. Cambridge, MA: MIT Press. depends on what kind of computer the mind/brain is Marr, D. (1982). Vision: A Computational Investigation into the hypothesized to be. To date, cognitive science research has Human Representation and Processing of Visual Information. focused on two kinds: conventional (von Neumann, sym- San Francisco: Freeman. bolic, or rule-based) computers and connectionist (parallel distributed processing) computers (see COGNITIVE MODEL- Further Readings ING, SYMBOLIC and COGNITIVE MODELING, CONNECTION- IST). If the mind/brain is a conventional computer, then the Byrne, R. M. J. (1996). A model theory of imaginary thinking. In J. Oakhill and A. Garnham, Eds., Mental Models in Cogni- mental representation bearers will be data structures. Koss- tive Science. Hove, England: Taylor and Francis. pp. 155– lyn’s (1980) work on mental IMAGERY provides a nice illus- 174. tration. If the mind/brain is a connectionist computer, then Garnham, A. (1987). Mental Models as Representations of Dis- the representation bearers of occurrent mental states will be course and Text. Chichester: Ellis Horwood. activation states of connectionist nodes or sets of nodes. In Glasgow, J. I. (1993). Representation of spatial models for geo- the first case, representation is considered to be “local”; in graphic information systems. In N. 
Pissinou, Ed., Proceedings the second, “distributed” (see DISTRIBUTED VS. LOCAL REP- of the ACM Workshop on Advances in Geographic Information RESENTATION and McClelland, Rumelhart, and Hinton Systems. Arlington, VA: Association for Computing Machin- 1986). There may also be implicit representation (storage of ery, pp. 112–117. information) in the connections themselves, a form of repre- Glenberg, A. M., M. Meyer, and K. Lindem. (1987). Mental mod- els contribute to foregrounding during text comprehension. sentation appropriate for dispositional mental states. Journal of Memory and Language 26: 69–83. While individual claims about what our representations Hegarty, M. (1992). Mental animation: Inferring motion from are about are frequently made in the cognitive science litera- static diagrams of mechanical systems. Journal of Experimen- ture, we do not know enough to theorize about the semantics tal Psychology: Learning, Memory, and Cognition 18: 1084– of our mental representation system in the sense that linguis- 1102. tics provides us with the formal SEMANTICS of natural lan- Johnson-Laird, P. N. (1993). Human and Machine Thinking. Hills- guage (see also POSSIBLE WORLDS SEMANTICS and DYNAMIC dale, NJ: Erlbaum. SEMANTICS). However, if we reflect on what our mental rep- Legrenzi, P., V. Girotto, and P. N. Johnson-Laird. (1993). Focus- resentations are hypothesized to explain—namely, certain sing in reasoning and decision making. Cognition 49: 37–66. features of our cognitive capacities—we can plausibly infer Moray, N. (1990). A lattice theory approach to the structure of mental models. Philosophical Transactions of the Royal Society that the semantics of our mental representation system must of London B 327: 577–583. have certain characteristics. Pretheoretically, human cogni- Polk, T. A., and A. Newell. (1995). Deduction as verbal reasoning. tive capacities have the following three properties: (1) each Psychological Review 102: 533–566. 
capacity is intentional, that is, it involves states that have Rogers, Y., A. Rutherford, and P. A. Bibby, Eds. (1992). Models in content or are “about” something; (2) virtually all of the the Mind: Theory, Perspective and Application. London: Aca- capacities can be pragmatically evaluated, that is, they can be demic Press. exercised with varying degrees of success; and (3) most of Schaeken, W., P. N. Johnson-Laird, and G. d’Ydewalle. (1996). the capacities are productive, that is, once a person has the Mental models and temporal reasoning. Cognition 60: 205– capacity in question, he or she is typically in a position to 234. manifest it in a practically unlimited number of novel ways. Schwartz, D. (1996). Analog imagery in mental model reasoning: Depictive models. Cognitive Psychology 30: 154–219. To account for these features, we must posit mental represen- Stevenson, R. J. (1993). Language, Thought and Representation. tations that can represent specific objects; that can represent New York: Wiley. many different kinds of objects—concrete objects, sets, properties, events, and states of affairs in this world, in possi- ble worlds, and in fictional worlds as well as abstract objects Mental Representation such as universals and numbers; that can represent both an object (in and of itself) and an aspect of that object (or both To understand the nature of mental representation posited extension and intension); and that can represent both cor- by cognitive scientists to account for various aspects of rectly and incorrectly. In addition, if we take the productivity human and animal cognition (see Von Eckardt 1993 for a of our cognitive capacities seriously, we must posit represen- more detailed account), it is useful to first consider repre- tations with constituent structure and a compositional sentation in general. Following Peirce (Hartshorne, Weiss, semantics. 
(Fodor and Pylyshyn 1988 use this fact to argue and Burks 1931–1958), we can say that any representation that our mental representation system cannot be connection- has four essential aspects: (1) it is realized by a representa- ist; see CONNECTIONISM, PHILOSOPHICAL ISSUES.) tion bearer; (2) it has content or represents one or more Cognitive scientists are interested not only in the content objects; (3) its representation relations are somehow of mental representations, but also in where this content “grounded”; and (4) it can be interpreted by (will function comes from, that is, in what makes a mental representation as a representation for) some interpreter. of a tree have the content of being about a tree. Theories of If we take one of the foundational assumptions of cogni- what determines content are often referred to as this-or-that tive science to be that the mind/brain is a computational kind of “semantics.” Note, however, that it is important to 528 Mental Representation distinguish such “theories of content determination” (Von the representation represents can make a difference to the Eckardt 1993) from the kind of semantics that systemati- internal states and behavior of the subject. This aspect of cally describes the content being determined (i.e., the kind mental representation has received little explicit attention; referred to in the previous paragraph). indeed, its importance and even its existence have been dis- There are currently five principal accounts of how mental puted by some. Nevertheless, many cognitive scientists hold representational content is grounded. Two are discussed that the interpretant of a mental representation, for a given elsewhere (see FUNCTIONAL ROLE SEMANTICS and INFOR- subject, consists of all the possible (token) computational MATIONAL SEMANTICS). The remaining three are character- consequences, including both the processes and the results ized below. of these processes, contingent on the subject’s actively 1. 
Structural isomorphism. A representation is under- “entertaining” that representation. stood to be “some sort of model of the thing (or things) it Cognitive scientists engaged in the process of modeling or represents” (Palmer 1978). The representation (or more pre- devising empirical theories of specific cognitive capacities cisely, the representation bearer) represents aspects of the (or specific features of those capacities) often posit particular represented object by means of aspects of itself. Palmer kinds of mental representations. For pedagogical purposes, (1978) treats both the representation bearer and the repre- Thagard (1995) categorizes representations into six main sented object as relational systems, that is, as sets of constit- kinds, each of which is typically associated with certain types uent objects and sets of relations defined over these objects. of computational processes: sentences or well-formed formu- A representation bearer then represents a represented object las of a logical system (see LOGICAL REASONING SYSTEMS); under some aspect if there exists a set G of relations consti- rules (see PRODUCTION SYSTEMS and NEWELL); representa- tuting the representation bearer and a set D of relations con- tions of concepts such as frames; SCHEMATA; scripts (see stituting the object such that G is isomorphic to D. CATEGORIZATION), analogies (see ANALOGY), images; and 2. Causal historical. (Devitt 1981; Sterelny 1990) connectionist representations. Another popular distinction is Intended to apply only to the mental analogues of designa- between symbolic representation (found in “conventional” tional expressions, this account holds that a token designa- computational devices) and subsymbolic representation tional “expression” in the LANGUAGE OF THOUGHT designates (found in connectionist devices). There is unfortunately no an object if there is a certain sort of causal chain connecting conceptually tidy taxonomy of representational kinds. 
the representation bearer with the object. Such causal chains include perceiving the object, designating the object in natural language, and borrowing a designating expression from another person (see REFERENCE, THEORIES OF).

3. Biological function. In this account (Millikan 1984), mental representations, like animal communication signals, are “intentional icons,” a form of representation that is “articulate” (has constituent structure and a compositional semantics) and mediates between producer mechanisms and interpreter mechanisms. The content of any given representation bearer will be determined by two things: the systematic natural associations that exist between the family of intentional icons to which the representation bearer belongs and some set of representational objects, and the biological functions of the interpreter device. More specifically, a representation bearer will represent an object if the existence of a mapping from the representation bearer family to the object family is a condition of the interpreter device successfully performing its biological functions. Take the association between bee dances and the location of nectar relative to the hive. The interpreter device for bee dances consists of the gatherer bees, among whose biological functions are those adapted to specific bee dances, for example, finding nectar 120 feet to the north of the hive in response to, say, bee dance 23. The interpreter function can successfully perform its function, however, only if bee dance 23 is in fact associated with the nectar’s being at that location.

It can be argued that for a mental entity or state to be a representation, it must not only have content, it must also be significant for the subject who has it. According to Peirce, a representation having such significance can produce an “interpretant” state or process in the subject, and this state or process is related to both the representation and the subject in such a way that, by means of the interpretant, what

Sometimes such kinds are distinguished by their computational or formal characteristics, for example, local versus distributed representation in connectionist systems. Sometimes they are distinguished in terms of what they represent, for example, phonological, lexical, syntactic, and semantic representation in linguistics and psycholinguistics. And sometimes both form and content play a role. Paivio’s (1986) dual-coding theory claims that there are two basic modes of representation: imagistic and propositional. According to Eysenck and Keane (1995), imagistic representations are modality-specific, nondiscrete, implicit, and involve loose combination rules, whereas propositional representations are amodal, discrete, explicit, and involve strong combination rules. The first contrast, modality-specific versus amodal, refers to the aspect under which the object is represented, hence to content; the other three contrasts all concern form.

Not all philosophers interested in cognitive science regard the positing of mental representations as being necessary or even unproblematic. Stich (1983) argues that if one compares a “syntactic theory of mind” (STM), which treats mental states as relations to purely syntactic mental sentence tokens and which frames generalizations in purely formal or computational terms, with representational approaches, STM will win. Representational approaches, in his view, necessarily encounter difficulties explaining the cognition of young children, “primitive” folk, and the mentally and neurally impaired. STM does not. Nor is it clear that cognitive science ought to aim at explaining the sorts of intentional phenomena (capacities or behavior) that mental representations are typically posited to explain.

Even more damning critiques of mental representation can be found in Judge (1985) and Horst (1996). Judge accepts the Peirceian tripartite conception of representation according to which a representation involves a representation bearer R, an object represented O, and an interpretant I, but takes the interpretant to require an agent performing an intentional act such as understanding R to represent O, which causes problems for mental representation, in her view. Understanding R to represent O itself necessitates that the agent have nonmediated access to O. But, if we assume that all cognition is mediated by mental representation, this is impossible. (Another problem with this view of the interpretant, not discussed by Judge, is that it leads to an infinite regress of mental representations.)

Horst (1996) also believes that cognitive science’s attempt to explain INTENTIONALITY by positing mental representations is fundamentally confused. Mental representations are usually taken to be symbols. But a symbol, in the standard semantic sense, involves conventions, both with respect to its meaning and with respect to its syntactic type. And because conventions themselves involve intentionality, intentionality cannot be explained by positing mental representations. An alternative is to treat “mental symbol” as a technical term. But Horst argues that, viewed in this technical way, the positing of mental representations also fails to be explanatory. Furthermore, even if such an alternative approach were to work, cognitive science would still be saddled with the conventionality of mental syntax.

See also CONCEPTS; KNOWLEDGE REPRESENTATION; MENTAL MODELS

Barbara Von Eckardt

References

Devitt, M. (1981). Designation. New York: Columbia University Press.
Eysenck, M. W., and M. T. Keane. (1995). Cognitive Psychology: A Student’s Handbook. Hillsdale, NJ: Erlbaum.
Fodor, J. A., and Z. W. Pylyshyn. (1988). Connectionism and cognitive architecture: A critical analysis. In S. Pinker and J. Mehler, Eds., Connections and Symbols. Cambridge, MA: MIT Press, pp. 3–71.
Hartshorne, C., P. Weiss, and A. Burks, Eds. (1931–1958). Collected Papers of Charles Sanders Peirce. Cambridge, MA: Harvard University Press.
Horst, S. W. (1996). Symbols, Computation, and Intentionality: A Critique of the Computational Theory of Mind. Berkeley: University of California Press.
Judge, B. (1985). Thinking About Things: A Philosophical Study of Representation. Edinburgh: Scottish Academic Press.
Kosslyn, S. M. (1980). Image and Mind. Cambridge, MA: Harvard University Press.
McClelland, J. L., D. E. Rumelhart, and G. E. Hinton, Eds. (1986). Parallel Distributed Processing: Explorations in the Microstructures of Cognition. 2 vols. Cambridge, MA: MIT Press.
Millikan, R. (1984). Language, Thought, and Other Biological Categories. Cambridge, MA: MIT Press.
Paivio, A. (1986). Mental Representations: A Dual Coding Approach. Oxford: Oxford University Press.
Palmer, S. E. (1978). Fundamental aspects of cognitive representation. In E. Rosch and B. Lloyd, Eds., Cognition and Categorization. Mahwah, NJ: Erlbaum.
Sterelny, K. (1990). The Representational Theory of Mind: An Introduction. Oxford: Blackwell.
Stich, S. P. (1983). From Folk Psychology to Cognitive Science. Cambridge, MA: MIT Press.
Thagard, P. (1995). Mind: Introduction to Cognitive Science. Cambridge, MA: MIT Press.
Von Eckardt, B. (1993). What is Cognitive Science? Cambridge, MA: MIT Press.

Further Readings

Anderson, J. R. (1983). The Architecture of Cognition. Cambridge, MA: Harvard University Press.
Anderson, J. R. (1993). Rules of the Mind. Hillsdale, NJ: Erlbaum.
Bechtel, W., and A. Abrahamson. (1991). Connectionism and the Mind: An Introduction to Parallel Processing in Networks. Oxford: Blackwell.
Block, N. (1986). Advertisement for a semantics for psychology. In P. A. French, T. E. Uehling, Jr., and H. K. Wettstein, Eds., Studies in the Philosophy of Mind, vol. 10. Minneapolis: University of Minnesota Press, pp. 615–678.
Cummins, R. (1989). Meaning and Mental Representation. Cambridge, MA: MIT Press.
Devitt, M., and K. Sterelny. (1987). Language and Reality. Cambridge, MA: MIT Press.
Dretske, F. (1981). Knowledge and the Flow of Information. Cambridge, MA: MIT Press.
Fodor, J. (1975). The Language of Thought. New York: Crowell.
Fodor, J. (1981). Representations. Cambridge, MA: MIT Press.
Fodor, J. (1987). Psychosemantics. Cambridge, MA: MIT Press.
Fodor, J. (1990). A Theory of Content and Other Essays. Cambridge, MA: MIT Press.
Genesereth, M. R., and N. J. Nilsson. (1987). Logical Foundations of Artificial Intelligence. Los Altos, CA: Kaufmann.
Hall, R. (1989). Computational approaches to analogical reasoning: A comparative analysis. Artificial Intelligence 39: 39–120.
Holyoak, K. J., and P. Thagard. (1995). Mental Leaps: Analogy in Creative Thought. Cambridge, MA: MIT Press.
Johnson-Laird, P. N. (1983). Mental Models. Cambridge, MA: Harvard University Press.
Kosslyn, S. M. (1994). Image and Brain: The Resolution of the Imagery Debate. Cambridge, MA: MIT Press.
Lloyd, D. (1987). Mental representation from the bottom up. Synthèse 70: 23–78.
Loar, B. (1981). Mind and Meaning. Cambridge: Cambridge University Press.
Millikan, R. (1989). Biosemantics. Journal of Philosophy 86: 281–297.
Minsky, M. (1975). A framework for representing knowledge. In P. H. Winston, Ed., The Psychology of Computer Vision. New York: McGraw-Hill, pp. 211–277.
Newell, A. (1990). Unified Theories of Cognition. Cambridge, MA: Harvard University Press.
Rumelhart, D. E. (1980). Schemata: The building blocks of cognition. In R. Spiro, B. Bruce, and W. Brewer, Eds., Theoretical Issues in Reading Comprehension. Hillsdale, NJ: Erlbaum, pp. 33–58.
Schank, R. C., and R. P. Abelson. (1977). Scripts, Plans, Goals, and Understanding: An Inquiry into Human Knowledge Structures. Hillsdale, NJ: Erlbaum.
Searle, J. R. (1983). Intentionality. Cambridge: Cambridge University Press.
Smith, E. E., and D. L. Medin. (1981). Categories and Concepts. Cambridge, MA: Harvard University Press.

Mental Retardation

Definitions of mental retardation characteristically include three criteria: (1) significantly subaverage INTELLIGENCE accompanied by (2) significant limitations in adaptive skills with (3) an onset during the developmental period. Thus, according to the DSM-IV (American Psychiatric Association 1994), individuals are considered to have mental retardation if (1) they have a current IQ, based on an individually administered test, of approximately two or more standard deviations below the mean (e.g., ~70); (2) they have significant limitations (relative to those expected for chronological age and sociocultural background) in two or more of the following domains: communication, social/interpersonal skills, self-care, home living, self-direction, leisure, functional academic skills, use of community resources, work, health, and safety; and (3) these difficulties were first evidenced prior to age 18 years. Each of the other major organizations involved in the treatment of individuals with mental retardation, namely the American Association on Mental Retardation (AAMR), the American Psychological Association (APA), and the World Health Organization (WHO), accepts the same three criteria, but implements them in slightly different ways (see Luckasson et al. 1992; American Psychological Association 1996; World Health Organization 1996). The DSM-IV, APA, and WHO ICD-10 definitions further divide mental retardation into four levels: mild (IQ between 50–55 and approximately 70); moderate (IQ between 35–40 and 50–55); severe (IQ between 20–25 and 35–40); and profound (IQ below 20). The AAMR definition also recognizes four levels of mental retardation, based on the intensity of support needed to enhance independence, productivity, and community integration: intermittent, limited, extensive, and pervasive.

Epidemiological research indicates a prevalence of mental retardation at between 0.8 and 1.2 percent. Per 1,000 individuals, approximately 3–6 have mild mental retardation, 2 have moderate mental retardation, 1.3 have severe mental retardation, and 0.4 have profound mental retardation. The cause of mental retardation is known for approximately 33–52 percent of individuals with IQs between 50 and 69: chromosomal: 4–8 percent; prenatal (multifactorial or environmental): 11–23 percent; perinatal or postnatal: 21 percent. For individuals with IQs below 50, the cause of mental retardation is known for 60–75 percent: chromosomal: 20–40 percent; prenatal: 20–30 percent; perinatal or postnatal: less than 20 percent (Pulsifer 1996). The most common known causes of mental retardation are fetal alcohol syndrome, Down’s syndrome, and fragile X syndrome.

Individuals with mental retardation often have additional disabilities (Batshaw and Shapiro 1997). Seizure disorders and cerebral palsy are present in about 10 percent of individuals with mild mental retardation and more than 20 percent of individuals with severe mental retardation. Individuals with mental retardation are three or four times more likely than the general population to have a psychiatric disorder (Kymissis and Leven 1994), with the likelihood of comorbid psychiatric disorder increasing as a function of severity of mental retardation (mild mental retardation: 25 percent, severe: 50 percent). Sensory impairments are also frequent (mild mental retardation: 24 percent, severe: 55 percent).

Historically, psychologists and educators have focused on level of mental retardation rather than cause (etiology), both for research purposes and for educational intervention (Goodman 1990; Hodapp 1997). Extensive characterizations are provided in the Manual of Diagnosis and Professional Practice in Mental Retardation (American Psychological Association 1996). Individuals with mild mental retardation are expected to attain a mental age of between 8 and 12 years. Many individuals acquire fluent language by adolescence; READING and arithmetic skills are usually between the first and sixth grade levels. Independence in both employment and daily living is typically attained. Individuals with moderate mental retardation are expected to attain a mental age of between 6 and 8 years, to acquire functional language abilities, but not functional reading or arithmetic skills, and to require supervision during adulthood. Individuals with severe mental retardation are expected to attain a mental age of between 4 and 6 years, although language abilities will be at a lower level. During adulthood, assistance is required for self-care skills. Individuals with profound mental retardation attain a mental age of between birth and 4 years. Many of these individuals are medically fragile, with a very high early mortality rate. During adulthood, some individuals will be able to walk and to produce single words. Pervasive supervision is required throughout the life span.

More recently, some researchers have begun to emphasize the importance of etiology. There are more than 500 known genetic causes of mental retardation (Flint and Wilkie 1996), in addition to a wide range of teratogenic causes (e.g., prenatal alcohol exposure). Each of these may be expected to affect brain structure and function. The particular areas and functions impacted will vary due to differences in which genes are affected and the roles the particular genes play in development, or which aspects of the brain were developing most rapidly at the time of exposure to a particular teratogen. It is likely that some aspects of cognition will be more severely impacted than others, and that areas of severe impact will vary from syndrome to syndrome. If so, the overall mental age attributed to a given individual (which is used to indicate level of mental retardation) may not accurately reflect his or her abilities in specific domains. For example, consider Williams syndrome, which is caused by a hemizygous microdeletion of chromosome 7q11.23, encompassing at least fifteen genes. Full-scale IQs range from less than 40 to about 90, with a mean of 55–60. Despite the fact that this mean IQ is in the range of mild mental retardation, individuals with Williams syndrome evidence a wide range of ability levels, as a function of domain. Their auditory rote memory ability is typically within the normal range (greater than second percentile), and about half have vocabulary or grammatical abilities, or both, within the normal range. In contrast, levels of visual-spatial constructive ability typically fall within the moderate to severe range of mental retardation (Mervis et al. forthcoming; see also Bellugi, Wang, and Jernigan 1994). In addition, unlike the typical characterization of individuals with mild mental retardation, most individuals with Williams syndrome evidence additional psychopathology: attention deficit hyperactivity disorder, anxiety disorder, or both (e.g., Dykens and Hodapp 1997). Perhaps because of these problems, individuals with Williams syndrome also differ from the typical characterization of individuals with mild mental retardation, by seldom being able to live independently.

As evidenced by the example of Williams syndrome, summary test scores often do not represent well the level of ability within individual domains. Thus it is crucial to take into account etiology when planning either basic research or intervention. At the same time, it is important to remember that there is within-syndrome variability, both in overall IQ and fit to the behavioral phenotype associated with the syndrome. Explication of both within- and between-syndrome variability depends on coordination of research efforts among researchers studying cognition, personality, brain structure and development, and genetics. Such interdisciplinary efforts should lead to a deeper understanding of basic processes and their relation to intelligence and adaptive functioning, whether the goal is to explain more fully a specific etiology or to elucidate processes relevant to mental retardation as a whole.

See also AUTISM: LANGUAGE AND THOUGHT; LANGUAGE IMPAIRMENT, DEVELOPMENTAL; LURIA

Carolyn B. Mervis and Byron F. Robinson

References

American Psychiatric Association. (1994). Diagnostic and Statistical Manual of Mental Disorders. 4th ed. Washington, DC.
American Psychological Association. (1996). Definition of mental retardation. In J. W. Jacobson and J. D. Mulick, Eds., Manual of Diagnosis and Professional Practice in Mental Retardation. Washington, DC: American Psychological Association, Editorial Board of Division 33, pp. 13–38.
Batshaw, M. L., and B. K. Shapiro. (1997). Mental retardation. In M. L. Batshaw, Ed., Children with Disabilities. 4th ed. Baltimore: Brookes.
Bellugi, U., P. P. Wang, and T. L. Jernigan. (1994). Williams syndrome: An unusual neuropsychological profile. In S. H. Broman and J. Grafman, Eds., Atypical Cognitive Deficits in Developmental Disorders: Implications for Brain Function. Hillsdale, NJ: Erlbaum, pp. 23–56.
Burack, J. A., R. M. Hodapp, and E. Zigler. (1988). Issues in the classification of mental retardation: Differentiating among organic etiologies. Journal of Child Psychology and Psychiatry 29: 765–779.
Dykens, E. M., and R. M. Hodapp. (1997). Treatment issues in genetic mental retardation syndromes. Professional Psychology: Research and Practice 28: 263–270.
Flint, J., and A. O. M. Wilkie. (1996). The genetics of mental retardation. British Medical Bulletin 52: 453–464.
Goodman, J. F. (1990). Technical note: Problems in etiological classifications of mental retardation. Journal of Child Psychology and Psychiatry 31: 465–469.
Hodapp, R. M. (1997). Direct and indirect behavioral effects of different genetic disorders of mental retardation. American Journal on Mental Retardation 102: 67–79.
Kymissis, P., and L. Leven. (1994). Adolescents with mental retardation and psychiatric disorders. In N. Bouras, Ed., Mental Health in Mental Retardation: Recent Advances and Practices. Cambridge: Cambridge University Press, pp. 102–107.
Luckasson, R., D. L. Coulter, E. A. Polloway, S. Reiss, R. L. Schalock, M. E. Snell, D. M. Spitalnik, and J. A. Stark. (1992). Mental Retardation: Definition, Classification, and System of Supports. 9th ed. Washington, DC: American Association on Mental Retardation.
Mervis, C. B., C. A. Morris, J. Bertrand, and B. F. Robinson. (Forthcoming). Williams syndrome: Findings from an integrated program of research. In H. Tager-Flusberg, Ed., Neurodevelopmental Disorders. Cambridge, MA: MIT Press.
Pulsifer, M. B. (1996). The neuropsychology of mental retardation. Journal of the International Neuropsychological Society 2: 159–176.
World Health Organization. (1996). Multiaxial Classification of Child and Adolescent Psychiatric Disorders: The ICD-10 Classification of Mental and Behavioral Disorders in Children and Adolescents. Cambridge: Cambridge University Press.

Further Readings

Adams, J. (1996). Similarities in genetic mental retardation and neuroteratogenic syndromes. Pharmacology Biochemistry and Behavior 55: 683–690.
Batshaw, M. L., Ed. (1997). Children with Disabilities. 4th ed. Baltimore: Brookes.
Bouras, N., Ed. (1994). Mental Health in Mental Retardation: Recent Advances and Practices. Cambridge: Cambridge University Press.
Detterman, D. K. (1987). Theoretical notions of intelligence and mental retardation. American Journal of Mental Deficiency 92: 2–11.
Hodapp, R. M., and E. Zigler. (1995). Past, present, and future issues in the developmental approach to mental retardation and developmental disabilities. In D. Cicchetti and D. J. Cohen, Eds., Developmental Psychopathology, vol. 2, Risk, Disorder, and Adaptation. New York: Wiley.
O’Brien, G., and W. Yule. (1995). Behavioral Phenotypes. Clinics in Developmental Medicine 138. London: MacKeith Press.
Simonoff, E., P. Bolton, and M. Rutter. (1996). Mental retardation: Genetic findings, clinical implications, and research agenda. Journal of Child Psychology and Psychiatry 37: 259–280.
Tager-Flusberg, H., Ed. (Forthcoming). Neurodevelopmental Disorders. Cambridge, MA: MIT Press.

Mental Rotation

In Douglas Adams’s (1988) novel Dirk Gently’s Holistic Detective Agency, a sofa gets stuck on a stairway landing. Throughout the remainder of the novel, Dirk Gently ponders how to get it unstuck by imagining the sofa rotating into various positions (he eventually solves the problem using a time machine). The well-known psychologist Roger Shepard once had a somewhat similar experience, awakening one morning to “a spontaneous kinetic image of three-dimensional structures majestically turning in space” (Shepard and Cooper 1982, 7). That experience inspired Shepard and his student Jacqueline Metzler to run what has become a seminal experiment in cognitive science, one that both defines and operationalizes mental rotation.

Shepard and Metzler (1971) presented subjects with images of novel three-dimensional (3-D) objects at various orientations: on each trial a pair of images appeared side-by-side and subjects decided whether the two images depicted the same (figures 1a and 1b) or different objects (figure 1c) regardless of any difference in orientation. A given 3-D object had two “handedness” versions: its “standard” version and a mirror-reflected version (equivalent to the relationship between left- and right-handed gloves). Different objects were always mirror reflections of one another, so objects could never be discriminated using distinctive local features. Shepard and Metzler measured the time it took subjects to make same/different discriminations as a function of the angular difference between them. What they found was a remarkably consistent pattern across both picture plane and depth rotations: mean response times increased linearly with increasing angular separation. This outcome provides evidence that subjects mentally rotate one or both objects until they are (mentally) aligned with one another. Shepard and Metzler suggest that the mental rotation process is an internal analogue of physical rotation, that is, a continuous shortest path 3-D rotation that would bring the objects into alignment.

Figure 1. Illustrative pairs of perspective views, including a pair differing by an 80-degree rotation in the picture plane (a), a pair differing by an 80-degree rotation in depth (b), and a pair differing by a reflection as well as a rotation (c).

A variation on Shepard and Metzler’s experiment demonstrated that mental rotation is also used when subjects judge the handedness of familiar objects. Another student of Shepard’s, Lynn Cooper, presented subjects with single English letters or digits (Cooper and Shepard 1973). On each trial a standard (e.g., “R”) or a mirror-reflected version (e.g., a reversed “R”) of a letter or digit was shown at some misorientation in the picture plane. Because subjects had to judge whether the misoriented character was of standard or mirror handedness they could not use local distinguishing features to make the discrimination. Moreover, because handedness is only defined relative to the viewer, subjects presumably needed to align each test character with their egocentric reference frame in which left and right are defined. The results, quite similar to those obtained with 3-D objects, confirmed this assumption: mean response times for judging handedness increased monotonically as the test characters were misoriented farther and farther from their canonical upright orientations. Response times turned out to be symmetric around 180 degrees, the point at which the characters were exactly upside down. Thus subjects were apparently mentally rotating the characters in the shortest direction to the upright regardless of whether this rotation was clockwise or counterclockwise.

Why are these findings so important? Much of the theorizing about COGNITIVE ARCHITECTURE during the 1960s assumed a symbolic substrate (e.g., PRODUCTION SYSTEMS) in which all mental representations were thought to have a common amodal format. Shepard’s results demonstrated that at least some cognitive processes were modal, and, in particular, tied to visual perception. The hypothesis that mental rotation is a continuous process akin to a real-world rotation also has implications for the nature of IMAGERY, namely, that humans have the capacity to make judgments using inherently spatial representations and that such representations are sophisticated enough to support PROBLEM SOLVING, spatial reasoning, and SHAPE PERCEPTION. Not surprisingly, this claim evoked a great deal of skepticism.

In response, Shepard and Cooper went on to meticulously demonstrate that mental rotation indeed involved a continuous mental transformation. Three critical results provide converging evidence for this conclusion. First, Cooper and Shepard (1973) ran a variant of their familiar characters experiment in which they preceded each letter or digit with a cue to the test character’s orientation, identity, or both. They found that neither orientation nor identity alone was sufficient to diminish the effect of stimulus orientation on response times. In contrast, providing both, along with sufficient time for the subject to prepare, removed almost all effects of orientation. Thus it appears that mental rotation operates on a representation that depicts a particular shape at a particular position in space, properties associated with an image.

Second, Cooper and Shepard (1973; see also Cooper 1976) ran an experiment in which they controlled an individual subject’s putative rate of rotation across a series of trials. Given this information, for a misoriented test letter or digit, they could predict the instantaneous orientation of the rotating mental image of that character. On each trial, a test character was presented, followed at some point by a probe character. The task was to begin rotating the test character, but then to judge the handedness of the probe. Under these conditions, response times were essentially unrelated to the absolute orientation of the probe, but increased monotonically with increasing angular distance from the presumed orientation of the rotating mental image of the test character. For example, when the predicted orientation of the test character image and the visible probe corresponded, response times were independent of the actual orientation of the probe. Thus the changing image actually passes through all of the intermediate orientations, a property expected for a truly analog transformation mechanism.

Third, Cooper and Shepard ran an experiment in which subjects were given extensive practice judging the handedness of characters mentally rotated in only one direction, for example, 0 degrees to 180 degrees counterclockwise. When these subjects were tested with characters misoriented slightly past 180 degrees, say 190 degrees, the distribution of response times had two peaks: one corresponding to mentally rotating the short way around (clockwise) and one corresponding to mentally rotating the long way around (counterclockwise and consistent with the practice subjects had received). Thus it was the actual angular distance traversed by a rotation that determined the time consumed, again consistent with an analog mechanism.

In summary, there is compelling evidence for the use of a continuous mental rotation process that brings images of two objects into correspondence or the image of a single object into alignment with an internal representation. The existence of such a mechanism suggests that models of cognitive architecture should include modality-specific mechanisms that can support mental imagery. Mental rotation may also play a role in HIGH-LEVEL VISION, for example, in shape perception (Rock and Di Vita 1987), FACE RECOGNITION (Hill, Schyns, and Akamatsu 1997; Troje and Bülthoff 1996), and OBJECT RECOGNITION (Jolicoeur 1985; Tarr 1995; Tarr and Pinker 1989).

See also MENTAL MODELS; MENTAL REPRESENTATION; MODULARITY OF MIND

Michael Tarr

References

Adams, D. (1988). Dirk Gently’s Holistic Detective Agency. New York: Pocket Books.
Cooper, L. A. (1976). Demonstration of a mental analog of an external rotation. Perception and Psychophysics 19: 296–302.
Cooper, L. A., and R. N. Shepard. (1973). Chronometric studies of the rotation of mental images. In W. G. Chase, Ed., Visual Information Processing. New York: Academic Press, pp. 75–176.
Hill, H., P. G. Schyns, and S. Akamatsu. (1997). Information and viewpoint dependence in face recognition. Cognition 62: 201–222.
Jolicoeur, P. (1985). The time to name disoriented natural objects. Memory and Cognition 13: 289–303.
Rock, I., and J. Di Vita. (1987). A case of viewer-centered object perception. Cognitive Psychology 19: 280–293.
Shepard, R. N., and L. A. Cooper. (1982). Mental Images and Their Transformations. Cambridge, MA: MIT Press.
Shepard, R. N., and J. Metzler. (1971). Mental rotation of three-dimensional objects. Science 171: 701–703.
Tarr, M. J. (1995). Rotating objects to recognize them: A case study of the role of viewpoint dependency in the recognition of three-dimensional objects. Psychonomic Bulletin and Review 2: 55–82.
Tarr, M. J., and S. Pinker. (1989). Mental rotation and orientation-dependence in shape recognition. Cognitive Psychology 21: 233–282.
Troje, N., and H. H. Bülthoff. (1996). Face recognition under varying pose: The role of texture and shape. Vision Research 36: 1761–1771.

Further Readings

Carpenter, P. A., and M. A. Just. (1978). Eye fixations during mental rotation. In J. W. Senders, D. F. Fisher, and R. A. Monty, Eds., Eye Movements and the Higher Psychological Functions. Hillsdale, NJ: Erlbaum, pp. 115–133.
Cohen, M. S., S. M. Kosslyn, H. C. Breiter, G. J. Digirolamo, W. L. Thompson, A. K. Anderson, S. Y. Bookheimer, B. R. Rosen, and J. W. Belliveau. (1996). Changes in cortical activity during mental rotation: A mapping study using functional MRI. Brain 119: 89–100.
Cohen, D., and M. Kubovy. (1993). Mental rotation, mental representation and flat slopes. Cognitive Psychology 25: 351–382.
Cooper, L. A., and R. N. Shepard. (1984). Turning something over in the mind. Scientific American 251(6): 106–114.
Georgopoulos, A. P., J. T. Lurito, M. Petrides, A. B. Schwartz, and J. T. Massey. (1989). Mental rotation of the neuronal population vector. Science 243: 234–236.
Jolicoeur, P., S. Regehr, L. B. J. P. Smith, and G. N. Smith. (1985). Mental rotation of representations of two-dimensional and three-dimensional objects. Canadian Journal of Psychology 39: 100–129.
Pinker, S. (1984). Visual cognition: An introduction. Cognition 18: 1–63.
Rock, I. (1973). Orientation and Form. New York: Academic Press.
Rock, I., D. Wheeler, and L. Tudor. (1989). Can we imagine how objects look from other viewpoints? Cognitive Psychology 21: 185–210.
Tarr, M. J., and S. Pinker. (1990). When does human object recognition use a viewer-centered reference frame? Psychological Science 1: 253–256.

Metacognition

Broadly defined, metacognition is any knowledge or cognitive process that refers to, monitors, or controls any aspect of cognition. Although its historical roots are deep (e.g., James 1890), the study of metacognition first achieved widespread prominence in the 1970s through the work of Flavell (1979) and others on developmental changes in children’s cognition about MEMORY (“metamemory”), understanding (“metacomprehension”), and communication (“metacommunication”). Metacognition is now seen as a central contributor to many aspects of cognition, including memory, ATTENTION, communication, PROBLEM SOLVING, and INTELLIGENCE, with important applications to areas like EDUCATION, aging, neuropsychology, and eyewitness testimony (Flavell, Miller, and Miller 1993; Metcalfe and Shimamura 1994). In this sense at least, metacognition is a domain-general facet of cognition.

Although theorists differ in how to characterize some aspects of metacognition (see Schneider and Pressley 1989), most make a rough distinction between metacognitive knowledge and metacognitive regulation. Metacognitive knowledge refers to information that individuals possess about their own cognition or cognition in general. Flavell (1979) further divides metacognitive knowledge into knowledge about persons (e.g., knowing that one has a very good memory), tasks (e.g., knowing that categorizable items are typically easier to recall than noncategorizable items), strategies (e.g., knowledge of mnemonic strategies such as rehearsal or organization), and their interactions (e.g., knowing that organization is usually superior to rehearsal if the task involves categorizable items). Although even preschool children have some metacognitive knowledge, marked developmental progress occurs in all these areas and, indeed, continues to be made in adolescence and beyond (e.g., Brown et al. 1983; Schneider and Pressley 1989).

Metacognitive regulation includes a variety of executive functions such as planning, resource allocation, monitoring, checking, and error detection and correction (Brown et al. 1983). Nelson and Narens (1990) divide metacognitive regulation into monitoring and control processes, defined in terms of whether information is flowing to or from the “meta-level.” In monitoring (e.g., tracking one’s comprehension of material while READING), the meta-level receives information from ongoing, “object-level” cognition, whereas in control (e.g., allocating effort and attention to important rather than trivial material), the meta-level modifies cognition. Again, important developmental advances occur in both these metacognitive processes (e.g., Garner 1987).

Although monitoring may occur without explicit awareness, it often produces, and is in turn affected by, conscious metacognitive experiences (Flavell 1979): for example, the feeling of knowing something but being unable to recall it. Even two-year-olds may have some metacognitive experiences, although older children and adults appear to be much better at interpreting and taking advantage of them (Flavell 1987). An important question concerns whether metacognitive experiences such as feelings of knowing are actually veridical indicators of underlying cognition. This issue has received close attention in recent years in the field of adult cognition (e.g., Metcalfe and Shimamura 1994; Nelson 1992). The findings suggest that the presence or absence of feelings of knowing does predict later recognition memory (see Hart 1965 for the seminal finding). However, although the accuracy of such feelings is typically above chance, it is far from perfect and appears to be somewhat task-dependent. Moreover, the mechanisms underlying feelings of knowing are not yet fully clear: Individuals may have partial access to the unrecalled item, or alternatively, they may simply infer the likelihood of knowing from other related information that is accessible (Nelson 1992).

Metacognitive knowledge and regulation are often closely intertwined. For example, knowing that a task is difficult can lead an individual to monitor cognitive progress very carefully. Conversely, successful cognitive monitoring can lead to knowledge of which tasks are easy and which difficult.

Precisely how metacognitive abilities are acquired is not known, but the process is almost certainly multifaceted. Likely contributors include general advances in self-regulation and reflective thinking, the demands of formal

mind and metacognition are often thought of as separate research domains. Certainly, the two areas of inquiry tend to have different foci: Prototypical theory of mind studies assess younger children’s appreciation of the role of mental states in the prediction and explanation of other people’s behavior, whereas classical metacognitive studies examine older children’s knowledge of mental processes in the self, often in an academic context. Still, no absolute distinction between the two areas should be drawn: Both are centrally concerned with the study of cognition about cognition.

Metacognitive impairments are not limited to very young children. Poor metacognitive skills can also be found in learning disabled and mentally retarded individuals. Conversely, gifted individuals often have excellent metacognitive abilities (Jarman, Vavrik, and Walton 1995) which may be especially evident in domains where they have special expertise (Alexander, Carr, and Schwanenflugel 1995). Some aspects of metacognition may also be deficient in the aged, although whether the impairments are a function of aging or some other factor (e.g., lack of college experience) is not always clear (Nelson 1992). Finally, individuals with frontal lobe damage frequently show metacognitive deficits, whereas those with damage to other parts of the cortex typically do not (Shimamura 1996). For example, frontal lobe patients are often unaware of their cognitive deficits, they lack knowledge of and are impaired in their use of metacognitive strategies, and the accuracy of their feelings of knowing is poor. That the locus of metacognition in the brain might be the frontal lobes should come as no surprise, given the extensive overlap between metacognitive regulation and executive functioning, an aspect of cognition long associated with the prefrontal cortex.

Much of the interest in metacognition derives from the belief that metacognitive skills significantly influence cognitive performance. Of course, many factors are likely to affect behavior in a specific cognitive situation (Flavell, Miller, and Miller 1993), including implicit processes of which the individual is unaware (Reder 1996). Nevertheless, in the case of memory (where the issue has been examined most thoroughly), a moderately high correlation is often found between metamemory and task performance (Schneider and Pressley 1989). The association tends to be stronger for older children, for more difficult tasks, and for certain types of metamemory (e.g., memory monitoring). Not surprisingly, the correlation between metamemory (e.g., strategy knowledge) and strategy use is typically higher than that between metamemory and performance, confirming that the links between metacognition and task success are indeed complex.

Given that metacognitive abilities actually do enhance cognitive performance, their acquisition should have far-reaching educational implications.
In this respect, it is schooling, and the modeling of metacognitive activity by encouraging that metacognitive strategies can sometimes be parents, teachers, and peers (see Flavell 1987). A critical successfully taught (e.g., Brown and Campione 1990). The precursor to the development of metacognition is the acqui- teaching is most effective if individuals are explicitly taught sition of initial knowledge about the existence of the mind how a strategy works and the conditions under which to use and mental states (i.e., the development of a THEORY OF it, and if they attribute performance gains to the strategy (e.g., Schneider and Pressley 1989). Importantly, it is under MIND). Well established by the end of the preschool years, these teaching conditions that individuals are most likely to such knowledge continues to develop in tandem with meta- maintain and generalize their newly acquired metacognitive cognition throughout middle childhood and adolescence skills. (Moses and Chandler 1992). Somewhat curiously, theory of Metaphor 535 See also Brown, A. L., and A. S. Palincsar. (1989). Guided cooperative COGNITIVE DEVELOPMENT; FOLK PSYCHOLOGY; learning and individual knowledge acquisition. In L. B. INTROSPECTION; LITERACY; METAREASONING; METAREPRE- Resnick, Ed., Knowing, Learning, and Instruction: Essays in SENTATION Honor of Robert Glaser. Hillsdale, NJ: Erlbaum, pp. 393–451. Duell, O. K. (1986). Metacognitive skills. In G. D. Phye and T. —Louis J. Moses and Jodie A. Baird Andre, Eds., Cognitive Classroom Learning: Understanding, Thinking, and Problem Solving. Orlando, FL: Academic Press, References pp. 205–242. Flavell, J. H., and H. M. Wellman. (1977). Metamemory. In R. V. Alexander, J. M., M. Carr, and P. J. Schwanenflugel. (1995). Kail, Jr., and J. W. Hagen, Eds., Perspectives on the Development Development of metacognition in gifted children: Directions of Memory and Cognition. Hillsdale, NJ: Erlbaum, pp. 3–33. for future research. Developmental Review 15: 1–37. 
Forrest-Pressley, D. L., G. E. MacKinnon, and T. G. Waller, Eds. Brown, A. L., J. D. Bransford, R. A. Ferrara, and J. C. Campione. (1985). Metacognition, Cognition, and Human Performance, 2 (1983). Learning, remembering, and understanding. In J. H. vols. Orlando, FL: Academic Press. Flavell and E. M. Markman, Eds., Handbook of Child Psychol- Garner, R., and P. A. Alexander. (1989). Metacognition: Answered ogy. Vol. 3, Cognitive Development. 4th ed. New York: Wiley, and unanswered questions. Educational Psychology 24: 143– pp. 77–166. 158. Brown, A. L., and J. C. Campione. (1990). Communities of learn- Kluwe, R. H. (1982). Cognitive knowledge and executive control: ing and thinking, or a context by any other name. In D. Kuhn, Metacognition. In D. R. Griffin, Ed., Animal Mind-Human Ed., Developmental Perspectives on Teaching and Learning Mind. New York: Springer, pp. 201–224. Thinking Skills. Basel: Karger, pp. 108–126. Markman, E. M. (1981). Comprehension monitoring. In W. P. Flavell, J. H. (1979). Metacognition and cognitive monitoring: A Dickson, Ed., Children’s Oral Communication Skills. New new area of cognitive-developmental inquiry. American Psy- York: Academic Press, pp. 61–84. chologist 34: 906–911. McGlynn, S. M., and D. L. Schacter. (1989). Unawareness of defi- Flavell, J. H. (1987). Speculations about the nature and develop- cits in neuropsychological syndromes. Journal of Clinical and ment of metacognition. In F. E. Weinert and R. H. Kluwe, Eds., Experimental Neuropsychology 11: 143–205. Metacognition, Motivation, and Understanding. Hillsdale, NJ: Paris, S. G., and P. Winograd. (1990). How metacognition can pro- Erlbaum, pp. 21–29. mote academic learning and instruction. In B. F. Jones and L. Flavell, J. H., P. H. Miller, and S. A. Miller. (1993). Cognitive Idol, Eds., Dimensions of Thinking and Cognitive Instruction. Development. 3rd ed. Englewood Cliffs, NJ: Prentice Hall. Hillsdale, NJ: Erlbaum, pp. 15–51. Garner, R. (1987). 
Metacognition and Reading Comprehension. Pressley, M., J. J. Borkowski, and W. Schneider. (1987). Cognitive Norwood, NJ: Ablex. strategies: Good strategy users coordinate metacognition and Hart, J. T. (1965). Memory and the feeling-of-knowing experience. knowledge. In R. Vasta and G. Whitehurst, Eds., Annals of Journal of Educational Psychology 56: 208–216. Child Development, vol. 5. Greenwich, CT: JAI Press, pp. 89– James, W. (1890). Principles of Psychology, vol. 1. New York: 129. Holt. Schneider, W., and F. E. Weinert, Eds. (1990). Interactions among Jarman, R. F., J. Vavrik, and P. D. Walton. (1995). Metacognitive Aptitudes, Strategies, and Knowledge in Cognitive Perfor- and frontal lobe processes: At the interface of cognitive psy- mance. New York: Springer-Verlag. chology and neuropsychology. Genetic, Social, and General Schraw, G., and D. Moshman. (1995). Metacognitive theories. Psychology Monographs 121: 153–210. Educational Psychology Review 7: 351–371. Metcalfe, J., and A. P. Shimamura, Eds. (1994). Metacognition: Schwanenflugel, P. J., W. V. Fabricius, and C. R. Noyes. (1996). Knowing about Knowing. Cambridge, MA: MIT Press. Developing organization of mental verbs: Evidence for the Moses, L. J., and M. J. Chandler. (1992). Traveler’s guide to chil- development of a constructivist theory of mind in middle child- dren’s theories of mind. Psychological Inquiry 3: 286–301. hood. Cognitive Development 11: 265–294. Nelson, T. O., Ed. (1992). Metacognition: Core Readings. Boston: Weinert, F. E., and R. H. Kluwe, Eds. (1987). Metacognition, Moti- Allyn and Bacon. vation, and Understanding. Hillsdale, NJ: Erlbaum. Nelson, T. O., and L. Narens. (1990). Metamemory: A theoretical Wellman, H. M. (1983). Metamemory revisited. In M. T. H. Chi, framework and new findings. In G. Bower, Ed., The Psychology Ed., Trends in Memory Development Research. Basel: Karger, of Learning and Motivation, vol. 26. New York: Academic pp. 31–51. Press, pp. 125–141. Yussen, S. R., Ed. 
(1985). The Growth of Reflection in Children. Reder, L. M., Ed. (1996). Implicit Memory and Metacognition. Orlando, FL: Academic Press. Mahwah, NJ: Erlbaum. Schneider, W., and M. Pressley. (1989). Memory Development Metaphor between 2 and 20. New York: Springer. Shimamura, A. P., (1996). The role of the prefrontal cortex in con- trolling and monitoring memory processes. In L. M. Reder, Ed., Metaphor, from the Greek for “transference,” is the use of Implicit Memory and Metacognition. Mahwah, NJ: Erlbaum, language that designates one thing to designate another in pp. 259–274. order to characterize the latter in terms of the former. Nomi- nal metaphors use nouns in this way, as in “My daughter is Further Readings an angel.” Predicative metaphors use verbs, as in “The dog flew across the back yard.” In addition to single words Borkowski, J. G., M. Carr, E. Rellinger, and M. Pressley. (1990). being used metaphorically, phrases, sentences, and more Self-regulated cognition: Interdependence of metacognition, extended texts can also function as metaphors, as in the attributions, and self-esteem. In B. F. Jones and L. Idol, Eds., assertion “Bravely the troops carried on” to refer to tele- Dimensions of Thinking and Cognitive Instruction. Hillsdale, phone operators who continued to work during a natural NJ: Erlbaum, pp. 53–92. 536 Metaphor Thus, while metaphors can suggest a comparison, they disaster. Sometimes a metaphor can be recognized because are primarily attributive assertions, not merely comparisons. it is literally false. When a proud father says, “My daughter To say that someone’s job is a jail is to attribute (i.e., trans- is an angel,” no one believes that she has wings. But a met- fer, in the original Greek sense) salient properties of the cat- aphor need not be literally false. The opposite assertion— egory jail to a particular job (Ortony 1979). 
That particular that one’s daughter is no angel—is literally true; she does job is now included in the general, abstract category of jail, not have wings. Yet this is not likely to be the speaker’s and as a consequence of that categorization is now similar in intended meaning, nor is it likely to be a hearer’s interpreta- relevant respects to literal jails (Glucksberg, McGlone, and tion. In each of these two cases, hearers must go beyond the Manfredi 1997). Predicative metaphors, in which verbs are literal meaning to arrive at the speaker’s intention—what used figuratively, function similarly. The verb to fly literally the hearer is intended to understand (see PSYCHOLINGUIS- entails movement in air. Because flying through the air epit- TICS and PRAGMATICS). omizes speed, expressions such as “He hopped on his bike Does the need to go beyond literal meanings imply that and flew home” are readily understood, just as nominal met- literal meanings have unconditional priority? The standard aphors, such as “His bike was an arrow,” are readily under- pragmatic theory of metaphor assumes that literal meanings stood. Arrows are prototypical members of the category of are always computed first, and only when a literal meaning speeding things; flying is a prototypical member of the cate- makes no sense in context are alternative, metaphorical gory of fast travel. For both nominal and predicative meta- meanings derived (Searle 1979). If this is so, then metaphor- phors, prototypical members of categories can be used as ical meaning should be ignored whenever a literal meaning metaphors to attribute properties to topics of interest. makes sense. However, people cannot ignore metaphors. Why are metaphors used instead of comparable literal Whenever metaphorical meanings are available, they are expressions? 
Often there are no comparable literal expres- automatically processed, even when there is no apparent sions (Black 1962), particularly when metaphor is used sys- need to do so (Glucksberg, Gildea, and Bookin 1982). Fur- tematically to describe one domain in terms of another. thermore, metaphors are no more difficult to understand Perceptual metaphors enable us to describe experiences in than comparable literal expressions (Ortony et al. 1978), one sense modality in terms of another, as in bright sound. suggesting that literal meanings do not have priority. Theories can be described in terms of structures, with corre- Metaphors have traditionally been viewed as implicit spondences between the blueprints and foundations of a comparisons. According to this view, metaphors of the form structure on the one hand, and those of a theory on the other. X is a Y are understood by converting them into simile form, Once a target domain (e.g., theories) has been described in X is like a Y. The simile is then understood by comparing terms of a source domain (e.g., buildings), then new corre- the properties of X and Y. This view has been challenged on spondences can be introduced, as in “The theory’s super- both theoretical and empirical grounds. One finding is par- structure is collapsing of its own weight.” Whether such ticularly telling. Metaphors in class inclusion form, such as systematic correspondences constitute conceptual knowl- “My lawyer is a shark” take less time to understand than edge per se or are primarily a means of describing and trans- when in simile form, such as “My lawyer is like a shark” mitting such knowledge remains an unresolved issue (see (Johnson 1996). That metaphors can be understood more easily than COGNITIVE LINGUISTICS and METAPHOR AND CULTURE; similes argues that metaphors are exactly what they seem Lakoff and Johnson 1980; McGlone 1996; Murphy 1996; to be, namely, class inclusion assertions (Glucksberg and Quinn 1991). Keysar 1990). 
In such assertions, the metaphor vehicle Domains for which metaphors seem particularly apt (e.g., shark) is used to refer to the category of predatory include science, emotions, personality characteristics, and creatures in general, not to the marine creature that is also politics (Cacciari forthcoming). Indeed, any domain can be named “shark.” This dual reference function of metaphor effectively framed by choice of metaphor. Immigration, for vehicles is clear in metaphors such as “Cambodia was example, can be viewed either as an invigorating process Vietnam’s Vietnam.” Here, the first mention of Vietnam (“New blood has been pumped into the city’s economy”) or refers to the nation of Vietnam. In contrast, the second as a threat (“The tide of refugees will soon drown us”). Sim- mention of Vietnam does not refer to that nation, but ilarly, different interpretations of feelings and interpersonal instead to the American involvement in Vietnam, which relations can be effectively revealed and communicated via has come to epitomize the category of disastrous military metaphor in clinical settings (Rothenberg 1984). interventions. That intervention has become a metaphor for Given the importance and ubiquity of metaphor, it is not such disasters, and so the word Vietnam can be used as a surprising that the beginnings of metaphorical thought and metaphor vehicle to characterize other ill-fated military language appear early in children’s cognitive and linguistic actions, such as Vietnam’s invasion of Cambodia. More development. Infants as young as two months can detect generally, metaphor vehicles such as Vietnam can be used intermodal correspondences (Starkey, Spelke, and Gelman as names for categories that have no names of their own 1983). Such correspondences represent a rudimentary form (Brown 1958). With continued use, once-novel metaphors of metaphorical conceptualization (Marks 1982). 
Children become frozen, their original metaphorical meanings as young as two years use and understand more abstract become literal, and their senses become dictionary entries. metaphorical correspondences, such as between the shoul- The word “butcher” is a case in point: It can be taken to ders of a person and those of a mountain, although sophisti- mean a meat purveyor, a bungler, or a vicious murderer, cated use of metaphors comes only with complex depending on the context. knowledge of relations among CONCEPTS and facility in Metaphor and Culture 537 analogical reasoning (Gentner and Markman 1977). As chil- Further Readings dren learn to distinguish between figurative and literal lan- Black, M. (1962). Models and Metaphors. Ithaca, NY: Cornell guage, they use the same “psychological mechanisms” to University Press. understand the one as they do the other (Miller 1979; 248). Cacciari, C., and S. Glucksberg. (1994). Understanding figurative Literal and nonliteral understanding develop hand in hand. language. In M. A. Gernsbacher, Ed., Handbook of Psycholin- See also ANALOGY; DISCOURSE; FIGURATIVE LANGUAGE; guistics. New York: Academic Press, pp. 447–477. MEANING Fernandez, J. W., Ed. (1991). Beyond Metaphor: The Theory of Tropes in Anthropology. Stanford, CA: Stanford University —Sam Glucksberg Press. Gentner, D. (1983). Structure-mapping: A theoretical framework References for analogy. Cognitive Science 7: 155–170. Gibbs, R. W., Jr. (1994). The Poetics of Mind: Figurative Thought, Black, M. (1962). Models and Metaphors. New York: Cornell Uni- Language and Understanding. Cambridge: Cambridge Univer- versity Press. sity Press. Black, M. (1979). More about metaphor. In A. Ortony, Ed., Meta- Keysar, B. (1989). On the functional equivalence of literal and phor and Thought. Cambridge: Cambridge University Press, metaphorical interpretations in discourse. Journal of Memory pp. 19–43. and Language 28: 375–385. Brown, R. (1958). Words and Things. New York: Free Press. 
Kittay, E. (1987). Metaphor: Its Cognitive Force and Linguistic Cacciari, C. (Forthcoming). Why do we speak metaphorically? Structure. Oxford: Clarendon Press. Reflections on the functions of metaphor in discourse and rea- Ortony, A. (1979). Beyond literal similarity. Psychological Review soning. In A. Katz, Ed., Figurative Language and Thought. 86: 151–180 New York: Oxford University Press. Ortony, A. (1993). Metaphor and Thought. 2nd ed. Cambridge: Gentner, D., and A. B. Markman. (1977). Structure-mapping in Cambridge University Press. analogy and similarity. American Psychologist 52: 45–56. Way, E. C. (1991). Knowledge Representation and Metaphor. Dor- Glucksberg, S., P. Gildea, and H. A. Bookin. (1982). On under- drecht: Kluwer. standing nonliteral speech: Can people ignore metaphors? Winner, E. (1988). The Point of Words: Children’s Understanding Journal of Verbal Learning and Verbal Behavior 21: 85–98. of Metaphor and Irony. Cambridge, MA: Harvard University Glucksberg, S., and B. Keysar. (1990). Understanding metaphoric Press. comparisons: Beyond similarity. Psychological Review 97: 3–18. Glucksberg, S., M. S. McGlone, and D. A. Manfredi. (1997). Prop- erty attribution in metaphor comprehension. Journal of Mem- Metaphor and Culture ory and Language 36: 50–67. Johnson, A. T. (1996). Comprehension of metaphor and similes: A reaction time study. Journal of Psycholinguistic Research 11: How culture might figure in the conceptual domain-to- 145–160. domain mappings that characterize METAPHOR has gone Lakoff, G., and M. Johnson. (1980). Metaphors We Live By. Chi- largely unaddressed. On the one hand, this is because cago: University of Chicago Press. anthropologists who study metaphor, and who belong to the Marks, L. E. (1982). Bright sneezes and dark coughs, loud sunlight interpretivist school and its offshoots, take the position that and soft moonlight. 
Journal of Experimental Psychology: culture resides in metaphors, as it does in other symbols— Human Perception and Performance 8: 77–193. and not in the use and sense people make of these. These McGlone, M. S. (1966). Conceptual metaphors and figurative lan- scholars draw on literary criticism, semiotics, structuralism, guage interpretation: Food for thought? Journal of Memory and and the like to interpret metaphors and other tropes (Linger Language 35: 544–565. 1994). Miller, G. A. (1979). Images and models: Similes and metaphors. In A. Ortony, Ed., Metaphor and Thought. Cambridge: Cam- On the other hand, the role of culture in the production bridge University Press, pp. 202–250. and comprehension of metaphor tends to be crowded out of Murphy, G. L. (1996). On metaphoric representation. Cognition systematic consideration by linguists, many of whom, per- 60: 173–204. haps understandably, have treated the metaphors occurring in Ortony, A. (1979). Beyond literal similarity. Psychological Review language as direct reflections of deeper conceptual struc- 86: 161–180. tures. On grounds of the ubiquity and automaticity of meta- Ortony, A., D. Schallert, R. Reynolds, and S. Antos. (1978). Inter- phor in speech, Lakoff and his colleagues (e.g., Lakoff and preting metaphors and idioms: Some effects of context on com- Johnson 1980) have made broad claims for the indispensable prehension. Journal of Verbal Learning and Verbal Behavior role of what they call “conceptual metaphors” in comprehen- 17: 465–478. sion. In a characteristic assertion of this position, Lakoff and Quinn, N. (1991). The cultural basis of metaphor. In J. W. Fernan- dez, Ed., Beyond Metaphor: The Theory of Tropes in Anthro- Turner (1989, xi) propose that “metaphor allows us to under- pology. Stanford, CA: Stanford University Press, pp. 56–93. stand our selves and our world in ways that no other modes Rothenberg, A. (1984). Creativity and psychotherapy. 
Psychoanal- of thought can.” One challenge to this view, from COGNITIVE ysis and Contemporary Thought 7: 233–268. ANTHROPOLOGY (Quinn 1991; 1997), holds that the meta- Searle, J. (1979). Metaphor. In A. Ortony, Ed., Metaphor and phors expressed in language are underlain by cultural under- Thought. Cambridge: Cambridge University Press, pp. 92–123. standings, which cannot be read directly from linguistic Starkey, P., E. Spelke, and R. Gelman. (1988). Detection of inter- metaphors but must be investigated independently. modal correspondences by human infants. Science 222: 179–181. Cultural understandings govern metaphor use in two Vosniadou, S., and A. Ortony. (1983). The emergence of the literal- ways. Sometimes a given domain of experience is under- metaphorical-anomalous distinction in young children. Child stood by analogy to another domain. Such an analogy and Development 54: 154–161. 538 Metaphor and Culture the extensive metaphorical language it provides may be cul- it!”). Each metaphor exemplifies a different kind of lasting turally and historically quite distinctive. Yet the analogy thing, and all convey the expectation that marriage is last- may be so well established that it is naturalized in thinking, ing—a key piece of Americans’ model of it. That different and the metaphors it provides have become standard parts of metaphors are used to capture the same shared understand- language, making it, not impossible, but difficult, for those ing is strong evidence that a speaker must have had this who have learned to conceptualize the world in this way to point already in mind and selected the metaphor to match it. think and talk in any other terms (Reddy 1979). Perhaps the Indeed, speakers will occasionally concatenate two or three most famous case is that of the “conduit” metaphor (Reddy different metaphors to emphasize a point and also readily 1979) for talking in English about meanings as transmitted convey the same understanding nonmetaphorically. 
Far in words—as in “Did I get my point across?” Various from following the entailments of a chosen metaphor, rea- authors have pointed to the force of the conduit model and soning in discourse on marriage commonly follows the ide- its metaphorical language, arguing that it has seriously con- alized cultural model of marriage, and employs different strained mathematical information theory (Reddy 1979); led metaphors or no metaphor at all, or at times switches from the aforementioned interpretive anthropologists to mistak- one metaphor to another in midstream to reach the conclu- enly locate culture in symbols (Linger 1994); and bedeviled sion being reasoned to (Quinn 1991). linguists themselves (Langacker 1991), possibly including Cultural exemplars such as being married in the Brett those who study metaphor. quote, or ongoing journeys, durable materials, and firmly Cultural understandings enter the use of metaphors in a held possessions in the examples from discourse about mar- second way, one that depends on their intentional selection. riage, can usefully be viewed as COGNITIVE ARTIFACTS, Metaphors are commonly employed in ordinary speech to though wholly internalized ones. That is to say, they medi- clarify to their audiences points that speakers are trying to ate performance of a commonplace cognitive task—in this convey. This communication task depends on knowledge case, the task of communicating accurately and efficiently. that the audience can be counted on to share intersubjec- The psychological processing required for this task lends tively with the speaker. Cultural knowledge is reliably so itself to a straightforward connectionist interpretation. Con- shared. 
A common misconception has been that metaphoric nections built up from experience between properties of the target domains are less well understood, perhaps because world and their known exemplars permit rapid, automatic they are abstract or intangible or unseen or unfamiliar, and identification of apposite metaphors. Of course, for meta- that metaphoric source domains are better understood (e.g., phors to do their work of clarification, members of a speech Lakoff and Turner 1989), perhaps because they are physical community must, and do, share a large stock of such cul- in nature or otherwise concretely experienced. Rather, meta- tural exemplars. Knowledge of these is accumulated from a phors intended for clarification are typically selected from variety of experience, both first- and secondhand. Crucial is among cultural exemplars of that feature of the target the ongoing experience of hearing and using metaphors in domain under discussion (Glucksberg 1991; Quinn 1997). speech, not only because it presents individuals with many Indeed, this is how metaphors do their work of clarifying, more exemplars than could possibly be encountered other- by introducing an outstanding and unambiguous instance of wise, but also because it weeds out more idiosyncratic the point being made. choices that would be ill understood by audiences, in favor Thus marriage, in the following example, is no more of more widely agreed upon cultural exemplars that com- concrete, tangible, knowable, familiar, or well understood to municate well. Through their repeated use as metaphors, the speaker than baseball. In a newspaper story on his retire- these more readily understood examplars gain even wider ment, Kansas City Royals third baseman George Brett was acceptance, sometimes becoming wholly conventional. quoted as saying, “I compare it to a marriage. We’ve had our See also ANALOGY; COGNITIVE LINGUISTICS; CULTURAL problems, but overall, we have had a good relationship. 
I never, ever want to put on another uniform” (USA Today, Wednesday, May 5, 1993). What is the case is that marriage is exemplary, for Brett and his American audience, of a relationship that is meant to endure and that does so (when it does so) because it is rewarding despite its difficulties. This is why his metaphor gives readers a surer sense of the complex idea Brett wants to convey about his relationship with the Royals.

When metaphors serve in this way to clarify what we mean to say, the cultural understandings that underlie what we mean may lend considerable regularity to the metaphors chosen. Thus, for example, in Americans’ DISCOURSE, metaphors for marriage all fall into eight classes that reflect an underlying cultural model of marriage. For instance, marriage is seen as an ongoing journey (“Once the marriage was formalized it was an unalterable course”), a durable material (“You really have to start out with something strong if it’s going to last”), and a firmly held possession (“I think we got

SYMBOLISM; CULTURAL PSYCHOLOGY; FIGURATIVE LANGUAGE; LANGUAGE AND CULTURE; LANGUAGE AND GENDER

—Naomi Quinn

References

Glucksberg, S. (1991). Beyond literal meanings: The psychology of allusion. Psychological Science 2: 146–152.
Lakoff, G., and M. Johnson. (1980). Metaphors We Live By. Chicago: University of Chicago Press.
Lakoff, G., and M. Turner. (1989). More Than Cool Reason: A Field Guide to Poetic Metaphor. Chicago: University of Chicago Press.
Langacker, R. W. (1991). Foundations of Cognitive Grammar. Vol. 2, Descriptive Application. Stanford: Stanford University Press.
Linger, D. T. (1994). Has culture theory lost its minds? Ethos 22: 284–315.
Quinn, N. (1991). The cultural basis of metaphor. In J. W. Fernandez, Ed., Beyond Metaphor: The Theory of Tropes in Anthropology. Stanford: Stanford University Press.
Quinn, N. (1997). Research on shared task solutions. In C. Strauss and N. Quinn, Eds., A Cognitive Theory of Cultural Meaning. Cambridge: Cambridge University Press.
Reddy, M. J. (1979). The conduit metaphor: A case of frame conflict in our language about language. In A. Ortony, Ed., Metaphor and Thought. Cambridge: Cambridge University Press.

Metareasoning

Metareasoning is reasoning about reasoning: in its broadest sense, any computational process concerned with the operation of some other computational process within the same entity. The term relies on a conceptual distinction between object-level deliberation about external entities, for example, considering the merits of various opening moves one might make in a game of chess, and metalevel deliberation about internal entities (computations, beliefs, and so on), for example, deciding that it is not worth spending much time deliberating about which opening move to make. Genesereth and Nilsson (1987) provide formal definitions along these lines. Smith (1986) makes a further distinction between INTROSPECTION about purely internal entities and reflection relating internal and external entities. In this view, a proposition such as “If I open the window I will know if the birds are singing” is reflective, because it relates a physical action to a future state of knowledge.

The capacity for metareasoning serves several purposes in an intelligent agent. First, it allows the agent to control its object-level deliberations: to decide which ones to undertake and when to stop deliberating and act. This is essential, given the pervasive problem of COMPUTATIONAL COMPLEXITY in decision making, and the consequent need for BOUNDED RATIONALITY. In GAME-PLAYING SYSTEMS, for example, the alpha-beta ALGORITHM makes a simple metalevel decision to avoid certain lines of deliberation about future moves, taking advantage of a metalevel theorem to the effect that these lines cannot affect the ultimate object-level decision. Second, metareasoning allows the agent to generate computational and physical behaviors, such as planning to obtain information, that require introspective or reflective reasoning. Third, it allows the agent to recover from errors or impasses in its object-level deliberations.

Most early work on metareasoning focused on designing an INTELLIGENT AGENT ARCHITECTURE (see also COGNITIVE ARCHITECTURE) that could support introspection and reflection. The use of metareasoning to control deduction seems to have been proposed first by Hayes (1973), although the first implementation was in the TEIRESIAS system (Davis 1980), which used metarules to control deliberation within a rule-based expert system. The Metalevel Representation System, or MRS, (Genesereth and Smith 1981) used LOGIC PROGRAMMING for both object and metalevel inference and provided a very flexible interface between the two. Because MRS allowed reasoning about which procedure to use for each object-level inference, and about which representation to use for each object-level fact, it enabled many different representations and reasoning methods to operate together seamlessly. By far the most ambitious metalevel architecture is Soar (Laird, Newell, and Rosenbloom 1987), whose fundamental mode of computation is based on PROBLEM SOLVING. Whenever Soar does not have an unambiguous rule telling it which problem-solving step to take next, it invokes universal subgoaling to set up a metalevel problem space that will resolve the issue. As might be imagined from these examples, designers of such systems must take care to avoid an infinite regress of metameta . . . reasoning.

Does metareasoning differ from “ordinary” reasoning? In all metalevel architectures, the metalevel is given direct access to object-level data structures. Thus metareasoning (at least in computers) can assume a completely and perfectly observable object-level state, which is seldom the case with ordinary reasoning about the external world. Furthermore, it is possible to represent fully and exactly the nature of the available object-level computations. Thus it is possible for the metalevel to simulate completely the object-level computations under consideration (as is done in Soar). This would seem counterproductive, however, as a way of selecting among object-level computations because simulating a computation (and hence knowing its outcome) is just a very slow way of doing the computation itself; knowledge of the outcome of a computation is the outcome. For this reason, Soar always compiles the results of subgoaling into a new rule, thereby avoiding deliberation in similar cases in future. Compilation of metareasoning into more efficient forms is perhaps the principal way an agent’s computational performance can improve over time.

In the research outlined thus far, the metareasoning consisted mostly of applying simple “IF-THEN” rules encoding the system designer’s computational EXPERTISE; no standard of rationality for metareasoning was provided. The concept of rational metareasoning (Horvitz 1989; Russell and Wefald 1989) had its roots in early work by I. J. Good (1971) on “Type II rationality” and in information value theory (Howard 1966), which places a value on acquiring a piece of information based on the expected improvement in decision quality that results from its acquisition. A COMPUTATION can be viewed as the process of making explicit some information that was previously implicit, and therefore value can be placed on computations in the same way. That is, a computation can be viewed as an action whose benefit is that it may result in better external decisions, and whose cost is the delay it incurs. Thus, given a model of the effects of computations and information about object-level utility, the metalevel can infer the value of computations. It can decide which computations to do and when computation should give way to action.

The simplest applications of rational metareasoning arise in the context of anytime algorithms (Horvitz 1987; Dean and Boddy 1988), that is, algorithms that can be interrupted at any time and whose output quality improves continuously with time. Each such algorithm has an associated performance profile describing its output quality as a function of time. The availability of the profile makes the metalevel decision problem (which algorithm to run and when to terminate) fairly trivial. The use of anytime algorithms devised for a wide variety of computational tasks has resulted in a widely applicable methodology for building complex, real-time decision-making systems (Zilberstein and Russell 1996).

A finer-grained approach to metareasoning can be obtained by evaluating individual computation steps within an algorithm. Consider the decision-making situation shown in figure 1a. An agent has two possible actions, A and B. Based on a quick assessment, the outcome of A appears to be worth 10 with a standard deviation of 1, whereas the outcome of B seems to be worth 8 with a standard deviation of 4. The agent can choose A immediately, or it can refine its estimates by looking further into the future. For example (figure 1b), it can consider the actions B1 and B2, with the outcomes shown. At this point, action B (followed by B1) seems to lead to a state with value 12; thus the lookahead computation has changed the agent’s decision, with an apparent benefit of 2. Obviously, this is a post hoc analysis, but, as shown by Russell and Wefald (1991), an expected value of computation can be computed efficiently, prior to performing the lookahead. In figure 1a, this value is 0.3 for lookahead from A and 0.82 for lookahead from B. If the initial estimated outcome of A were 12, however, these values would drop to 0.002 and 0.06, respectively. Hence, as one would expect, the value of computation depends strongly on whether a clear choice of action has already emerged. If, however, the initial estimates for A and B were both 10, with standard deviations of 0.1, then the value of computation becomes 0.03. Computation is worthless when it does not matter which action one eventually chooses.

Figure 1. The consequence of computation: Lookahead reveals that B may in fact be better than A.

Rational metareasoning can be applied to control deliberations in a wide variety of object-level algorithms including HEURISTIC SEARCH and game playing (Russell and Wefald 1991), LOGICAL REASONING SYSTEMS (Smith 1989), and MACHINE LEARNING (Rivest and Sloan 1988). An important insight to emerge from this work is that a metareasoning capability can, in principle, be domain independent (Russell and Wefald 1991; Ginsberg and Geddis 1991) because the necessary domain-specific information (such as the utility function) can be extracted from the object level. One can therefore view successful computational behavior as emerging not from carefully crafted, domain-specific algorithms but from the interaction of a general capacity for rational metareasoning with object-level domain knowledge. More efficient, domain-specific computational behaviors might then result from processes of compilation and metalevel REINFORCEMENT LEARNING.

See also LOGIC; METACOGNITION; METAREPRESENTATION; MODAL LOGIC; RATIONAL AGENCY

—Stuart J. Russell

References

Davis, R. (1980). Meta-rules: Reasoning about control. Artificial Intelligence 15(3): 179–222.
Dean, T., and M. Boddy. (1988). An analysis of time-dependent planning. In Proceedings of the Seventh National Conference on Artificial Intelligence (AAAI-88). St. Paul, MN: Kaufmann, pp. 49–54.
Genesereth, M. R., and D. Smith. (1981). Meta-level architecture. Memo HPP-81-6. Computer Science Department, Stanford University.
Genesereth, M. R., and N. J. Nilsson. (1987). Logical Foundations of Artificial Intelligence. San Mateo, CA: Kaufmann.
Ginsberg, M. L., and D. F. Geddis. (1991). Is there any need for domain-dependent control information? In Proceedings of the Ninth National Conference on Artificial Intelligence, vol. 1. (AAAI-91). Anaheim, California: AAAI Press, pp. 452–457.
Good, I. J. (1971). Twenty-seven principles of rationality. In V. P. Godambe and D. A. Sprott, Eds., Foundations of Statistical Inference. Toronto: Holt, Rinehart and Winston, pp. 108–141.
Hayes, P. J. (1973). Computation and deduction. In Proceedings of the Second Symposium on Mathematical Foundations of Computer Science Czechoslovakia. Czechoslovakian Academy of Science.
Horvitz, E. J. (1987). Problem-solving design: Reasoning about computational value, trade-offs, and resources. In Proceedings of the Second Annual NASA Research Forum. Moffett Field, CA: NASA Ames Research Center, pp. 26–43.
Horvitz, E. J. (1989). Reasoning about beliefs and actions under computational resource constraints. In L. N. Kanal, T. S. Levitt, and J. F. Lemmer, Eds., Uncertainty in Artificial Intelligence, vol. 3. Amsterdam: Elsevier, pp. 301–324.
Howard, R. A. (1966). Information value theory. IEEE Transactions on Systems Science and Cybernetics SSC-2: 22–26.
Laird, J. E., A. Newell, and P. S. Rosenbloom. (1987). SOAR: An architecture for general intelligence. Artificial Intelligence 33(1): 1–64.
Rivest, R. L., and R. Sloan. (1988). A new model for inductive inference. In M.

flicted with experimental studies of primates’ metarepresentational abilities that have started with Premack and Woodruff’s pioneering article “Does the Chimpanzee Have a Theory of Mind?” (1978). Though the level of metarepresentational sophistication of other primates is still disputed, that of human beings is not. The human lineage may be the only one in which a true escalation of metarepresentational abilities has taken place.
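The idea in the Metareasoning entry of placing a value on a computation, as in its two-action lookahead example, can be illustrated with a small sketch. It rests on assumptions that are not in the entry: each outcome estimate is treated as normally distributed, a lookahead is assumed to reveal the exact value of the action it examines, and all function names are hypothetical. The values quoted in the entry (0.3, 0.82, and so on) come from the model of Russell and Wefald (1991), which this sketch does not reproduce, so its numbers differ from them; only the qualitative behavior carries over.

```python
import math


def normal_pdf(z):
    """Density of the standard normal distribution at z."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Cumulative distribution of the standard normal at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def value_of_lookahead(mean_examined, sd_examined, mean_other):
    """Expected gain from learning the examined action's true value.

    Assumption (not from the entry): the examined action's value is
    Normal(mean_examined, sd_examined) and the other action's estimate
    stays fixed. The lookahead pays off only when it reverses the
    current preference, which gives sd * (pdf(z) - z * (1 - cdf(z)))
    with z the gap between the estimates in standard deviations.
    """
    if sd_examined <= 0.0:
        return 0.0  # nothing can be learned about a certain outcome
    z = abs(mean_examined - mean_other) / sd_examined
    return sd_examined * (normal_pdf(z) - z * (1.0 - normal_cdf(z)))


def worth_computing(benefit, time_cost):
    """Metalevel decision: compute only if the benefit exceeds the delay cost."""
    return benefit > time_cost


if __name__ == "__main__":
    # Initial estimates from the entry's example: A = 10 (sd 1), B = 8 (sd 4).
    from_a = value_of_lookahead(10.0, 1.0, 8.0)
    from_b = value_of_lookahead(8.0, 4.0, 10.0)
    print("lookahead from A:", round(from_a, 4))
    print("lookahead from B:", round(from_b, 4))
    # A clearer lead for A (estimate 12) makes the lookahead less valuable.
    print("from B, if A were 12:", round(value_of_lookahead(8.0, 4.0, 12.0), 4))
    # The delay cost of 0.1 is an arbitrary illustrative number.
    print("worth looking ahead from B?", worth_computing(from_b, time_cost=0.1))
```

Under these assumptions the more uncertain action B is the more valuable one to examine, and the value shrinks as A's lead grows, which agrees with the qualitative point made in the entry.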
Vardi, Ed., Proceedings of the Second Conference on Theoretical Aspects of Reasoning about Knowledge. San Mateo, CA: Kaufmann, pp. 13–27.
Russell, S. J., and E. H. Wefald. (1989). On optimal game-tree search using rational metareasoning. Proceedings of the Eleventh International Joint Conference on Artificial Intelligence (IJCAI-89). Detroit: Kaufmann, pp. 334–340.
Russell, S. J., and E. H. Wefald. (1991). Do the Right Thing: Studies in Limited Rationality. Cambridge, MA: MIT Press.
Smith, B. C. (1986). Varieties of self-reference. In J. Halpern, Ed., Theoretical Aspects of Reasoning about Knowledge. San Mateo, CA: Kaufmann.
Smith, D. E. (1989). Controlling backward inference. Artificial Intelligence 39(2): 145–208.
Zilberstein, S., and S. Russell. (1996). Optimal composition of real-time systems. Artificial Intelligence 83.

Metarepresentation

Cognitive systems are characterized by their ability to construct and process representations of objects and states of affairs. MENTAL REPRESENTATIONS and public representations such as linguistic utterances are themselves objects in the world, and therefore potential objects of second-order representations, or “metarepresentations.” Under this or another name, (e.g., “higher-order representations”), metarepresentations are evoked in evolutionary approaches to INTELLIGENCE, in philosophical and developmental approaches to commonsense psychology, in pragmatic approaches to communication, in theories of consciousness, and in the study of reasoning.

It is assumed that the members of most animal species are incapable of recognizing in themselves or attributing to conspecifics mental representations such as beliefs or desires: they utterly lack metarepresentational abilities. Highly intelligent social animals such as primates, on the other hand, are believed to have evolved an ability to interpret and predict the behavior of others by recognizing their mental states. Indeed, Dennett (1987) has described some primates as “second-order intentional systems,” capable of having “beliefs and desires about beliefs and desires.” Second-order intentional systems are, for instance, capable of deliberate deception. In a population of second-order intentional systems, a third-order intentional system would be at a real advantage, if only because it would be able to see through deception. Similarly, in a population of third-order intentional systems, a fourth-order intentional system would be a greater advantage still, with greater abilities to deceive others and avoid being deceived itself, and so on. Hence the hypothesis, supported by some ethological evidence, that primates have developed a kind of strategic MACHIAVELLIAN INTELLIGENCE (Byrne and Whiten 1988) involving higher-order metarepresentational abilities. These evolutionary and ethological arguments have in part converged, in part con-

Humans are all spontaneous psychologists. They have some understanding of cognitive functions such as perception and MEMORY (see METACOGNITION). They also attribute to one another PROPOSITIONAL ATTITUDES such as beliefs and desires, and do so as a matter of course. While philosophers have described the basic tenets of this commonsense or FOLK PSYCHOLOGY and discussed its empirical adequacy, psychologists have focused on the development of this cognitive ability, often described as a THEORY OF MIND. Philosophers and psychologists have been jointly involved in discussing the mechanism through which humans succeed in metarepresenting other people’s thoughts and their own. This investigation has taken the form of a debate between those who believe attribution of mental states to others is done by simulation (e.g., Goldman 1993; Gordon 1986; Harris 1989), and those who believe it is done by inference from principles and evidence (e.g., Gopnik 1993; Leslie 1987; Perner 1991; Wellman 1990). In this debate (see SIMULATION VS. THEORY-THEORY), much attention has been paid to different degrees of metarepresentational competence that may be involved in attributing mental states to others. In particular, the ability to attribute false beliefs is seen as a sufficient, if not necessary, proof of basic metarepresentational competence. This metarepresentational competence can be impaired (the basis of a new, cognitive approach to AUTISM). Conversely, the study of autism has contributed to the development of a finer-grained understanding of metarepresentations (see Baron-Cohen 1995; Frith 1989).

Cognitive approaches have stressed the metarepresentational complexity of human communication. The very act of communicating involves, on the part of the communicator and addressee, mutual metarepresentations of each other’s mental states. In ordinary circumstances, the addressee of a speech act is interested in the linguistic MEANING of the utterance only as a means of discovering the speaker’s meaning. Speaker’s meaning has been analyzed by the philosopher Paul GRICE (1989) in terms of several layers of metarepresentational intentions, in particular the basic metarepresentational intention to cause in the addressee a certain mental state (e.g., a belief), and the higher-order metarepresentational intention to have that basic intention recognized by the addressee. Grice’s analysis of metarepresentational intentions involved in communication has been discussed and developed by numerous philosophers and linguists (e.g., Bach and Harnish 1979; Bennett 1976; Recanati 1986; Schiffer 1972; Searle 1969; Sperber and Wilson 1986).

It has long been observed that human languages have the semantic and syntactic resources to serve as metalanguages. In direct and indirect quotations, utterances and meanings are metarepresented. The study of such metalinguistic devices has been developed in semiotics (see SEMIOTICS AND COGNITION), in philosophy of language, and in PRAGMATICS. In particular, in the study of FIGURATIVE LANGUAGE, irony has been described as a means of distancing oneself from some propositional attitude by metarepresenting it (see Gibbs 1994; Sperber and Wilson 1986).

The ability to metarepresent one’s own mental states plays an important role in CONSCIOUSNESS, and may even be seen as defining it. For David Rosenthal (1986; 1997) in particular, a mental state is conscious if it is represented in a higher-order thought. When a thought itself is conscious, then the higher-order thought that represents it is a straightforward metarepresentation. These higher-order thoughts may themselves be the object of yet higher-order thoughts: The reflexive character of consciousness (i.e., that one can be conscious of being conscious) is then explained in terms of a hierarchy of metarepresentations. While many philosophers do not accept this “higher-order thought” theory of consciousness, the role of metarepresentations at least in aspects of consciousness and in related phenomena such as INTROSPECTION is hardly controversial.

Much of spontaneous human reasoning is about states of affairs and how they relate to one another. But some reasoning, especially deliberate reasoning as occurs in science or philosophy, is about hypotheses, theories, or claims (representations) and only indirectly about the state of affairs represented in these representations. In the psychology of DEDUCTIVE REASONING, growing attention has been paid to such metarepresentational reasoning, in particular by experimenting with liar and truth teller problems, either from the point of view of “mental logic” (Rips 1994) or from that of MENTAL MODELS (Johnson-Laird and Byrne 1991). In artificial intelligence, too, there is a growing interest in modeling such METAREASONING.

This rapid overview does not exhaust the areas of cognitive science where metarepresentation (whether so named or not) plays an important role. In a great variety of cognitive activities, humans exhibit unique metarepresentational virtuosity. This, together with the possession of language, may be their most distinctive cognitive trait.

See also INTENTIONAL STANCE; PRIMATE COGNITION; RELEVANCE AND RELEVANCE THEORY

—Dan Sperber

References

Bach, K. and R. Harnish. (1979). Linguistic Communication and Speech Acts. Cambridge, MA: MIT Press.
Baron-Cohen, S. (1995). Mindblindness: An Essay on Autism and Theory of Mind. Cambridge, MA: MIT Press.
Bennett, J. (1976). Linguistic Behaviour. Cambridge: Cambridge University Press.
Byrne, R. W., and A. Whiten. (1988). Machiavellian Intelligence: Social Expertise and the Evolution of Intellect in Monkeys, Apes and Humans. Oxford: Oxford University Press.
Dennett, D. (1987). The Intentional Stance. Cambridge, MA: MIT Press.
Frith, U. (1989). Autism: Explaining the Enigma. Oxford: Blackwell.
Gibbs, R. (1994). The Poetics of Mind: Figurative Thought, Language and Understanding. Cambridge: Cambridge University Press.
Goldman, A. (1993). The psychology of folk psychology. Behavioral and Brain Sciences 16: 15–28.
Gopnik, A. (1993). How we know our minds: The illusion of first-person knowledge of intentionality. Behavioral and Brain Sciences 16: 1–14.
Gordon, R. M. (1986). Folk psychology as simulation. Mind and Language 1: 158–171.
Grice, H. P. (1989). Studies in the Way of Words. Cambridge, MA: Harvard University Press.
Harris, P. L. (1989). Children and Emotion: The Development of Psychological Understanding. Oxford: Blackwell.
Johnson-Laird, P. N., and R. M. J. Byrne. (1991). Deduction. Hove: Erlbaum.
Leslie, A. M. (1987). Pretence and representation: The origins of “theory of mind.” Psychological Review 94: 412–426.
Perner, J. (1991). Understanding the Representational Mind. Cambridge, MA: MIT Press.
Premack, D., and G. Woodruff. (1978). Does the chimpanzee have a theory of mind? Behavioral and Brain Sciences 1: 515–526.
Recanati, F. (1986). On defining communicative intentions. Mind and Language 1(3): 213–242.
Rips, L. (1994). The Psychology of Proof. Cambridge, MA: MIT Press.
Rosenthal, D. M. (1986). Two concepts of consciousness. Philosophical Studies 49(3): 329–359.
Rosenthal, D. M. (1997). A theory of consciousness. In N. Block, O. Flanagan, and G. Güzeldere, Eds., The Nature of Consciousness: Philosophical Debates. Cambridge, MA: MIT Press, pp. 729–753.
Schiffer, S. (1972). Meaning. Oxford: Clarendon Press.
Searle, J. (1969). Speech Acts. Cambridge: Cambridge University Press.
Sperber, D., and D. Wilson. (1986). Relevance: Communication and Cognition. Oxford: Blackwell.
Wellman, H. M. (1990). The Child’s Theory of Mind. Cambridge, MA: MIT Press.
Whiten, A. (1991). Natural Theories of Mind: Evolution, Development and Simulation of Everyday Mindreading. Oxford: Blackwell.

Further Readings

Baron-Cohen, S., A. Leslie, and U. Frith. (1985). Does the autistic child have a “theory of mind”? Cognition 21: 37–46.
Baron-Cohen, S., H. Tager-Flusberg, and D. J. Cohen, Eds. (1993). Understanding Other Minds: Perspectives from Autism. Oxford: Oxford University Press.
Bogdan, R. J. (1997). Interpreting Minds: The Evolution of a Practice. Cambridge, MA: MIT Press.
Carruthers, P. (1996). Language, Thought and Consciousness. Cambridge: Cambridge University Press.
Carruthers, P., and P. Smith, Eds. (1996). Theories of Theories of Mind. Cambridge: Cambridge University Press.
Clark, H., and Gerrig, R. (1990). Quotations as demonstrations. Language 66: 764–805.
Davies, M., and T. Stone, Eds. (1995). Folk Psychology: The Theory of Mind Debate. Oxford: Blackwell.
Davies, M., and Humphreys, G., Eds. (1993). Consciousness. Oxford: Blackwell.
Davies, M., and T. Stone, Eds. (1995). Mental Simulation: Evaluations and Applications. Oxford: Blackwell.
Frith, U., and F. Happé. (1994b). Autism: Beyond “theory of mind.” Cognition 50: 115–132.
Happé, F. (1994). Autism: An Introduction to Psychological Theory. London: UCL Press.
Lehrer, K. (1990). Metamind. Oxford: Oxford University Press.
Mithen, S. (1996). The Prehistory of the Mind. London: Thames and Hudson.
Sperber, D. (1994). Understanding verbal understanding. In J. Khalfa, Ed., What is Intelligence? Cambridge: Cambridge University Press.
Sperber, D., Ed. (Forthcoming). Metarepresentations. Oxford: Oxford University Press.
Sperber, D., and D. Wilson. (1981). Irony and the use-mention distinction. In P. Cole, Ed., Radical Pragmatics. New York: Academic Press, pp. 295–318.
Whiten, A., and R. W. Byrne. (1997). Machiavellian Intelligence. Vol. 2, Evaluations and Extensions. Cambridge: Cambridge University Press.

Meter and Poetry

The fundamental formal distinction between poetry and all other forms of literary art is this: poems are made up of lines. But how long is a line? Lines of metrical verse are subject to measurement just as surely as if they were made of cloth and both poet and reader has a yardstick. In the case of cloth, you measure physical distance by counting in yards. What do you use to measure the length of a line of poetry? The simplest kinds of meters are measured in syllables. There are only so many in a line.

Such meters occur in the poetry of languages all over the world. Much of the poetry in the Romance languages counts syllables (Halle and Keyser 1980). Hebrew poetry of the Old Testament does as well (Halle 1997). So, too, do the Japanese verse forms known as tanka and haiku (Halle 1970). In the five-line tanka, lines are 5-7-5-7-7 syllables long. In the three-line haiku, the lines are 5-7-5 syllables long. As shown by the poems in (1) and (2), syllables with double vowels (i.e., long syllables) are counted as two units; all other syllables count as one.

(1) Haru tateba        5   When spring comes
    Kiyuru koori no    7   the ice melts away
    Nokori naku        5   without trace
    Kimi ga kokoro mo  7   your heart
    Ware ni tokenamu   7   melts into me
    (Kokinshuu)

(2) Kaki kueba         5   Eating persimmons
    Kanega narunari    7   the bell rings
    Hooryuu-ji         5   Hooryuuji
    (Masaoka)

Japanese poets and their readers both know what a syllable is. They also know the difference between a long syllable and a short one. This shared knowledge is what Japanese meter depends upon. Segmenting a word into syllables requires a great deal of sophisticated knowledge about such things as the difference between a vowel, a consonant, a liquid, and a glide. No modern computer can reliably segment a stretch of speech into syllables. Yet speakers of all languages do this constantly and unconsciously.

Much more complex linguistic machinery is involved in verse where line length is measured by counting feet rather than syllables. Foot-counting verse is encountered in a wide range of poetic traditions, among them Homer and much of the poetry of classical Greek and Latin antiquity, the Old Norse bards, English poetry from Chaucer to Frost, German poetry from Hans Sachs to Rilke, and Russian poetry from the eighteenth century to the present.

To illustrate the more complex machinery of foot-counting meters, we look at the so-called syllabo-tonic meters of English and, in particular, iambic pentameter. This meter is made up, as its name suggests, of five feet. An iambic foot is made up of two syllables followed by a boundary. There is a specific procedure that divides the line into iambic feet. Consider the opening line of Gray’s “Elegy Written in a Country Churchyard”:

(3) The curfew tolls the knell of parting day.
    * * * * * * * * * *

We represent each syllable in the verse by an asterisk beneath the line. The procedure is: insert right parenthesis from left to right, starting at the left edge so as to group asterisks into pairs:

(4) The curfew tolls the knell of parting day.
    ) * * ) * * ) * * ) * * ) * * )

Although there are six parentheses in this line, there are only five feet because an iambic foot is defined as a sequence of two syllables followed by a parenthesis. The rule that inserts parentheses is:

(5) Insert a right parenthesis “)” at the beginning of the line and, proceeding rightward, after every other syllable (thus generating binary feet).

Readers familiar with iambic verse know that many lines of iambic pentameter verse are often longer than 10 syllables. These exhibit what the textbooks call “feminine rhymes.” The following couplet from Byron’s “Don Juan” (Canto 65) is illustrative:

(6) Yet he was jealous, though he did not show it,
    For jealousy dislikes the world to know it.

The line-final rhyming pair is show it: know it. Each member of the pair has a so-called extrametrical syllable, namely, it. Rule (5) automatically accounts for the possibility of such syllables in iambic verse:

(7) Yet he was jealous, though he did not show it.
    ) * * ) * * ) * * ) * * ) * * ) *
    For jealousy dislikes the world to know it.
    ) * * ) * * ) * * ) * * ) * * ) *

As before, right parentheses are inserted, beginning at the left. Notice, however, that a right parenthesis cannot be inserted to the right of the final * in (7) because (5) requires that two syllables are skipped before a right parenthesis can be inserted. Here only a single * follows the last parenthesis. Therefore, no parenthesis is inserted line finally and the line is correctly scanned as containing only five iambic feet.

Just as readers familiar with poetry written in iambic pentameter recognize that some lines are longer than 10 syllables, they also know certain lines are not possible iambic pentameter lines, even though they may be composed of 10 syllables. For example, a line like (8) is not a possible iambic pentameter line.

(8) On cutting Lucretia Borgia’s bright hair

This is not a well-formed configuration.
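Rule (5) as stated above is mechanical enough to sketch in a few lines. The sketch below is only an illustration: the function names are hypothetical, and the syllable counts are supplied by hand, since the entry itself notes that syllable segmentation cannot be automated reliably.

```python
def rule5_boundaries(n_syllables):
    """Positions of the right parentheses inserted by rule (5).

    Position 0 is the left edge of the line; position k means
    'after the k-th syllable'. A parenthesis goes at the left edge
    and then after every second syllable.
    """
    return list(range(0, n_syllables + 1, 2))


def count_feet(boundaries):
    """A foot is two syllables followed by a parenthesis, so the
    left-edge parenthesis closes no foot."""
    return sum(1 for b in boundaries if b >= 2)


def render(n_syllables, boundaries):
    """Draw the scansion row with '*' for syllables and ')' for boundaries."""
    marks = set(boundaries)
    row = [")"] if 0 in marks else []
    for k in range(1, n_syllables + 1):
        row.append("*")
        if k in marks:
            row.append(")")
    return " ".join(row)


if __name__ == "__main__":
    # Hand-counted syllables for three lines quoted in the entry.
    lines = {
        "(4) The curfew tolls the knell of parting day.": 10,
        "(7) Yet he was jealous, though he did not show it.": 11,
        "(8) On cutting Lucretia Borgia's bright hair": 10,
    }
    for text, n in lines.items():
        b = rule5_boundaries(n)
        print(text)
        print("   ", render(n, b), "->", count_feet(b), "feet")
```

Run on these counts, the sketch gives five feet for the ten-syllable line (4) and for the eleven-syllable line (7), whose last syllable is left over. It also gives five feet for line (8), which shows that counting syllables in pairs cannot by itself rule that line out.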
This causes a problem for the account given so far because (5) scans the line without difficulty:

(9) On cutting Lucretia Borgia’s bright hair
    ) * * ) * * ) * * ) * * ) * * )

(5) must be modified in such a way that it will continue to scan lines like those in (4) and (7) while ruling out lines like (9).

It is well known that in foot-based meters not only does line length play a role but so does the placement of certain marked syllables. In the English iambic meters, these are the syllables that bear the main stress of the word. We mark such metrically important syllables by inserting a bracket before or after them. The marked syllables in stress-based verse are called stress maxima. A stress maximum is the main stressed syllable in a polysyllabic word. In the history of English, poets made use of slightly different definitions, extending the definition in some cases and restricting it in others. For present purposes, the stress maximum is defined as:

(10) The stressed syllable of a polysyllabic word that is preceded by a lesser stressed syllable is a stress maximum.

The stressed syllables in the following words are instances of stress maxima:

(11) Lucrétia         meándering
     autobiográphic   pellúcid

Rule (12) accounts for the placement of such syllables in an iambic pentameter line.

(12) In an iambic line, insert a right bracket after (to the right of) a stress maximum.

Let us see how rules (5) and (12) construct feet within a line like (13).

(13) The curfew tolls the knell of parting day.

The square brackets called for by stress maxima are inserted first. They are no different from parentheses, but we use them to help the reader keep track of which rule is responsible for a given boundary:

(14) The cúrfew tolls the knell of párting day.
     * * ] * * * * * * ] * *

Next come the parentheses inserted by (5):

(15) The curfew tolls the knell of parting day.
     * * ]) * * ) * * ) * * ]) * * )

The line is correctly divided into five feet and, significantly, where right parentheses and right brackets occur in the line, they coincide. Now consider how (5) and (12) assign boundaries to the unmetrical (8):

(16) On cutting Lucretia Borgia’s bright hair
     * * ]) * * ) * ]) * * ]) * * ) *

Unlike the metrical lines discussed thus far, this line contains two feet that end on consecutive syllables, namely:

(17) -ing Lucrétia
     * * ) * ] *

We exclude it by means of prohibition (18):

(18) Feet may not end on consecutive syllables.

Rule (18) applies after the insertion of parentheses by (5) and (12). All scansions that conform to it are well-formed. Scansions that fail to do so are not.

The theory of meter proposed contains two rules, one output constraint, and a definition of stress maximum. The rules insert right foot boundaries, either to the right of stress maxima or else iteratively from left to right. The constraint rules out scansions with unary feet. The grammar is summarized in (19):

(19) Rule: a. Insert a right bracket “]” to the right of a stress maximum.
           b. Insert a right parenthesis “)” from left to right, skipping two consecutive syllables.
     Constraint: In iambic verse feet may not end on consecutive syllables.
     Stress Maximum: The stressed syllable of a polysyllabic word preceded by a lesser stressed syllable.

Just as two speakers of a language share a body of linguistic knowledge called a grammar, which enables them to speak to one another, so, too, do poets and their readers share a body of knowledge that enables the one to write metrically and the other to scan what has been put into meter. The rules in (19) are an attempt at illustrating what this body of shared metrical knowledge looks like.

Up to now, nothing has been said about the machinery used to scan iambic pentameter lines. That machinery is identical to that needed to account for the way ordinary speakers assign stress to the words of English (see STRESS, LINGUISTIC), and is part of the natural endowment of human beings that enables them to speak a language, what linguists have come to call Universal Grammar (UG). In other words, poets who write metrically do so using the same theoretical apparatus provided by UG that speakers use to assign stress to the words of their language. This convergence of the machinery of meter with the machinery of stress assignment is the essence of PROSODY.

See also LINGUISTIC UNIVERSALS AND UNIVERSAL GRAMMAR; FIGURATIVE LANGUAGE; PHONOLOGICAL RULES AND PROCESSES; PHONOLOGY; PROSODY AND INTONATION, PROCESSING ISSUES

—Samuel Jay Keyser

References

Halle, M. (1970). What is meter in poetry? In R. Jakobson and S. Hattori, Eds., Sciences of Language: The Journal of the Tokyo Institute for Advanced Studies in Languages, vol. 2. Japan: Tokyo Institute for Advanced Studies of Language, pp. 124–138.
Halle, M. (1997). Metrical verse in the Psalms. In V. van der Meij, Ed., India and Beyond: Aspects of Literature, Meaning, Ritual and Thought. Essays in Honour of Frits Staal. London: Kegan Paul International, in association with the International Institute for Asian Studies, Leiden, pp. 207–225.
Halle, M., and S. J. Keyser. (1980). Metrica. Enciclopedia IX. Torino, Italy: Einaudi.

Further Readings

Halle, M., and S. J. Keyser. (1998). Robert Frost’s loose and strict iambics. In E. Iwamoto, M. Muraki, M. Tokunaga, and N. Hasegawa, Eds., Festschrift in Honor of Kazuko Inoue. Tokyo, Japan: Kanda University of International Studies.

Mid-Level Vision

Mid-level vision refers to a putative level of visual processing, situated between the analysis of the image (lower-level vision) and the recognition of specific objects and events (HIGH-LEVEL VISION). It is largely a viewer-centered process, seemingly concerned explicitly with real-world scenes, not simply images (see Nakayama, He, and Shimojo 1995). Yet, in distinction to high-level vision, mid-level vision represents the world only in a most general way, dealing primarily with surfaces and objects and the fact that they can appear at different orientations, can be variously illuminated, and can be partially occluded.

Vision as we understand it today is far more complicated than had been recognized even thirty to forty years ago. Despite the seeming unity of our visual experience, there is mounting evidence that vision is not a single function but is likely to be a conglomerate of functions, each acting with considerable autonomy (Goodale 1995; Ungerleider and Mishkin 1982). Along with this new appreciation of vision’s complexity comes the striking fact that from a purely anatomical point of view, the portion of the brain devoted to vision is also much greater than previously supposed (Allman and Kaas 1975; Felleman and van Essen 1991). For example, about 50 percent of the CEREBRAL CORTEX of primates is devoted exclusively to visual processing, and the estimated territory for humans is nearly comparable. So vision by itself looms very large even when stacked up against all other conceivable functions of the brain. As such, subdivisions in vision, particularly principled ones that delineate qualitatively different processes, are sorely needed, and Marr’s (1982) seminal argument for three levels provides the broad base for what we outline here.

Let us consider what processes might constitute mid-level vision, and then contrast them with low-level and high-level vision. Good examples of mid-level visual processing can be seen in the work of Kanizsa (1979). Compare figure 1a where we see many isolated fragments with figure 1b where the same fragments are accompanied by additional diagonal line segments. In figure 1b there is a dramatic shift in what is perceived. The isolated pieces seen in figure 1a now form a single larger figure, the familiar Necker cube.

Figure 1.

The phenomenon just described is characterized by sev-

Figure 2.

These characteristics, while not delineating mid-level vision in its entirety, provide sufficient basis for characterizing it as qualitatively different from low- and high-level vision. Consider the “aperture” problem for motion and its solution, something that until recently has been considered as within the province of low-level vision. Since Wallach’s work (1935/1997), it has been recognized that there is an inherent ambiguity of perception if motion is analyzed locally, as would be the case for directionally selective receptive fields (see circles in figure 2). Thus in the case of a rightward-moving diamond (figure 2a), the local motions of the edges are very different from the motion of the whole figure. Yet, we are unaware of these local motions and see unified motion to the right. Computational models based on local motion measurements alone can recover the horizontal motion of the single figure on the left, but they cannot account for the perceived motion of one figure moving differently from another on the right (figure 2b). Although the local motions here are essentially identical, our visual system sees the motion in each case to be very different. It sees rightward motion of a single object versus opposing vertical motion of two objects. Only by the explicit parsing of the moving scene into separate surfaces can the correct motion be recovered. Thus, directionally selective neurons by themselves cannot supply reliable information regarding the motion of objects. Mid-level vision, with its explicit encoding of distinct surfaces, is required.
eral things, which all appear to be related to objects and sur- How might we distinguish mid-level from high-level faces and their boundaries. Furthermore, they are examples vision? Consider figure 3. Most obvious is the reversal of of occlusion, the partial covering of one surface by another. the duck and the rabbit. From the above discussion, it There is also the indication of inferences being made, should be clear that this reversal cannot be happening at the enabling us to represent something that has been made level of mid-level vision, which concerns itself more gener- invisible. We are thus aware of something continuing ally with surfaces and objects, but at higher levels where behind, which in turn enables us to see a single figure, not specific objects, like rabbits and ducks, are represented. For isolated fragments. mid-level vision there is no reversal. Here mid-level vision’s 546 Mind-Body Problem Figure 3. Figure 5. of surface representation, is required for a range of pro- cesses more traditionally associated with early vision, including motion perception (see MOTION, PERCEPTION OF), forms of stereopsis, TEXTURE segregation and saliency cod- Figure 4. ing. More speculatively, there has been a proposal that mid- level vision is the first level of processing, the results of job is to make sure we see a single thing or surface, despite which are available to conscious awareness (Jackendoff its division into four separate image fragments by the over- 1987; Nakayama, He, and Shimojo 1995), thus implying lying occluder and despite the change in its identity (the that mid-level vision is the earliest level to which ATTEN- rabbit vs. the duck). TION can be deployed. Another job of mid-level vision is to cope effectively See also CONSCIOUSNESS; GESTALT PERCEPTION; ILLU- with the characteristics of reflected light as it plays across SIONS; SHAPE PERCEPTION; SURFACE PERCEPTION; VISUAL surfaces in natural scenes. 
Surfaces can appear in various PROCESSING STREAMS guises in the image, the result of being illuminated from —Ken Nakayama various angles, being shaded by themselves or other sur- faces, and by being viewed through transparent media. It References would thus seem natural that various visual mechanisms would have developed or evolved to deal with these issues Allman, J. M., and J. H. Kaas. (1975). The dorsomedial cortical of illumination just as they have for cases of occlusion. visual area: a third tier area in the occipital lobe of the owl This view is strengthened by the existence of perceptual monkey (Aotus trivirgatus). Brain Research 100: 473–487. phenomena that provide at least some hint as to how such Felleman, D. J., and D. C. van Essen. (1991). Distributed hierarchi- processes may be occurring, also demonstrating the exist- cal processing in the primate cerebral cortex. Cerebral Cortex ence of processing that cannot be explained by low-level 1: 1–47. vision, say by lateral interactions of neurons with various Goodale, M. A. (1995). The cortical organization of visual percep- tion and visuomotor control. In S. M. Kosslyn and D. N. Osher- types of receptive fields. Consider White’s illusion shown son, Eds., Visual Cognition. Cambridge, MA: MIT Press. in figure 4 where the apparent difference in brightness of Jackendoff, R. (1987). Consciousness and the Computational the gray squares (top vs. bottom row) is very large despite Mind. Cambridge, MA: MIT Press. being of equal luminance. Each identical gray patch is Kanizsa, G. (1979). Organization in Vision: Essays on Gestalt Per- bounded by identical amounts of black and white areas, ception. New York: Praeger. thus ruling out any explanation based on simultaneous Marr, D. (1982). Vision. San Francisco, CA: Freeman. contrast or lateral inhibition. The major difference is the Nakayama, K., Z. J. He, and S. Shimojo. (1995). 
Visual surface nature of the junction structure bounding the areas, prop- representation: a critical link between lower-level and higher- erties very important in mid-level vision processing. Fig- level vision. In S. M. Kosslyn and D. N. Osherson, Eds., Visual ure 5 suggests that mid-level vision’s role is the Cognition. Cambridge, MA: MIT Press, pp. 1–70. Ungerleider, L. G., and M. Mishkin. (1982). Two cortical visual processing of shadows, showing how specific are the systems. In D. J. Ingle, M. A. Goodale, and R. J. W. Mansfield, requirements for a dark region to be categorized as Eds., Analysis of Visual Behavior. Cambridge, MA: MIT Press. shadow and how consequential this categorization is for Wallach, H. (1935). Über visuell wahrgenommene Bewegung- higher-level recognition. On the left we see a 3-D figure, a srichtung. Psychol. forschung 20: 325–380. Translated (1997) face. On the right, it looks more 2-D, where the outline by S. Wuenger, R. Shapley, and N. Rubin. On the visually per- around the dark region diminishes the impression that the ceived direction of motion. Perception 25: 1317–1368. figure contains shadows. Although phenomena related to mid-level vision have Mind-Body Problem been well-known, starting with GESTALT PSYCHOLOGY and more recently with work by Kanizsa (1979), the scope and positioning of mid-level vision in the larger scheme of The mind-body problem is the problem of explaining how visual processing has been unclear. Recently, Nakayama et our mental states, events, and processes are related to the al. (1995) have suggested that mid-level vision, in the form physical states, events, and processes in our bodies. A ques- Mind-Body Problem 547 tion of the form “How is A related to B?” does not by itself properties are identical with physical states or properties. pose a philosophical problem. 
To pose such a problem, there Sometimes called the “type-identity theory,” this view is has to be something about A and B that makes the relation considered an empirical hypothesis, awaiting confirmation between them seem problematic. Many features of mind by science. The model for such an identity theory is the and body have been cited as responsible for our sense of the identification of properties such as the heat of a gas with the problem. Here I will concentrate on two: the apparent causal mean kinetic energy of its constituent molecules. Because interaction of mind and body, and the distinctive features of such an identification is often described as part of the reduc- tion of thermodynamics to statistical mechanics, the parallel CONSCIOUSNESS. A long tradition in philosophy has held, with René DES- claim about the mental is often called a “reductive” theory CARTES, that the mind must be a nonbodily entity: a soul or of mind, or “reductive physicalism” (see Lewis 1994). mental substance. This thesis is called “substance dualism” Because it seems committed to the implausible claim that or “Cartesian dualism” because it says that there are two all creatures who believe that grass is green have one physi- kinds of substance in the world, mental and physical or cal property in common—the property identical to the belief material. Belief in such dualism is based on belief that the that grass is green—many philosophers find reductive phys- soul is immortal, and that we have free will, which seems to icalism an excessively bold empirical speculation. For this require that the mind be a nonphysical thing because all reason (and others), some physicalists adopt a weaker ver- physical things are subject to the laws of nature. 
sion of physicalism, which holds that all particular objects To say that the mind (or soul) is a mental substance is not and events are physical, but that there are mental properties to say that the mind is made up of nonphysical “stuff” or not identical to physical properties. (Davidson 1970 is one material. Rather, the term substance is used in the tradi- inspiration for such views; see ANOMALOUS MONISM.) Such tional philosophical sense: a substance is an entity that has “non-reductive physicalism” is a kind of dualism because it properties and that persists through change in its properties. holds there are two kinds of properties, mental and physical, A tiger, for instance, is a substance, whereas a hurricane is but it is not substance dualism because it holds that all sub- not. To say there are mental substances—individual minds stances are physical substances. or souls—is to say there are objects that are nonmaterial or Nonreductive physicalism is also sometimes called a nonphysical, and these objects can exist independently of “token identity theory” because it identifies mental and physical objects, such as a person’s body. These objects, if physical particulars or tokens, and it is invariably supple- they exist, are not made of nonphysical “stuff”—they are mented by the claim that mental properties supervene on not made of “stuff” at all. physical properties. Though the notion can be refined in But if there are such objects, then how do they interact many ways, SUPERVENIENCE is essentially a claim about the with physical objects? Our thoughts and other mental states dependence of the mental on the physical: There can be no often seem to be caused by events in the world external to difference in mental facts without a difference in some our minds, and our thoughts and intentions seem to make physical facts (see Kim 1993; Horgan 1993). our bodies move. 
A perception of a glass of wine can be If the problem of psychophysical causation was the caused by the presence of a glass of wine in front of me, and whole of the mind-body problem, then it might seem that my desire for some wine plus the belief that there is a glass physicalism is a straightforward solution to that problem. If of wine in front of me can cause me to reach toward the the only question is, How do mental states have effects in glass. But many think that all physical effects are brought the physical world?, then it seems that the physicalist can about by purely physical causes: The physical states of my answer this by saying that mental states are identical with brain are enough to cause the physical event of my reaching physical states. toward the glass. So how can my mental states play any But there is a complication here. For it seems that physi- causal role in bringing about my actions? calists can only propose this solution to the problem of psy- Some dualists react to this by denying that such psycho- chophysical causation if mental causes are identical with physical causation really exists (this view is called EPIPHE- physical causes. Yet if properties or states are causes, as NOMENALISM). Some philosophers have thought that mental many reductive physicalists assume, then nonreductive states are causally related only to other mental states, and physicalists are not entitled to this solution because they do physical states are causally related only to other physical not identify mental and physical properties. This is the prob- states: The mental and physical realms operate indepen- lem of MENTAL CAUSATION for nonreductive physicalists dently. This “parallelist” view has been unpopular in the (see Davidson 1993; Crane 1995; Jackson 1996). twentieth century, as have most dualist views. 
For if we find On the other hand, even if the physicalist can solve this dualism unsatisfactory, there is another way to answer the problem of mental causation, there is a deeper reason why question of psychophysical causation. We can say that men- there is more to the mind-body problem than the problem of tal states have effects in the physical world precisely psychophysical interaction. The reason is that, according to because they are, contrary to appearances, physical states many philosophers, physicalism is not the “solution” to the (see Lewis 1966). This is a monist view because it holds that mind-body problem, but something that gives rise to a par- there is only one kind of substance, physical or material ticular version of that problem. They reason as follows: substance. Therefore it is also known as PHYSICALISM or Because we know that the world is completely physical, if “materialism.” the mind exists, it too must be physical. However, it seems Physicalism comes in many forms. The strongest form is hard to understand how certain aspects of mind—notably, the form just mentioned, which holds that mental states or consciousness—could just be physical features of the brain. 548 Mind Design How can the complex subjectivity of a conscious experience Horgan, T. (1993). From supervenience to superdupervenience: Meeting the demands of a material world. Mind 102: 555–586. be produced by the gray matter of the brain? As McGinn Jackson, F. (1982). Epiphenomenal qualia. Philosophical Quar- (1989) puts it, neurons and synapses seem “the wrong kind” terly 32: 127–136. of material to produce consciousness. The problem here is Jackson, F. (1996). Mental causation. Mind 105: 377–413. one of intelligibility: Because we know that the mental is Kim, J. (1993). Supervenience and Mind. Cambridge: Cambridge physical, consciousness must have its origins in the brain. University Press. But how can we make sense of this mysterious fact? Lewis, D. (1966). 
An argument for the identity theory. Journal of Thomas Nagel (1974) dramatized this in a famous paper, Philosophy 63: 17–25. saying that when a creature is conscious, there is something Lewis, D. (1983). Mad pain and Martian pain. In D. Lewis, Philo- it is like to be that creature: There is something it is like to sophical Papers, vol. 1. Oxford: Oxford University Press, pp. be a bat, but there is nothing it is like to be a stone. The heart 122–132. Lewis, D. (1990). What experience teaches. In W. Lycan, Ed., of the mind-body problem for Nagel was the apparent fact Mind and Cognition. Oxford: Blackwell, pp. 499–519. that we cannot understand how consciousness can just be a Lewis, D. (1994). Reduction of mind. In S. Guttenplan, Ed., A physical property of the brain, even though we know that in Companion to the Philosophy of Mind. Oxford: Blackwell, pp. some sense physicalism is true (see also Chalmers 1996). 412–431. Some physicalists respond by saying that this problem is McGinn, C. (1989). Can we solve the mind-body problem? Mind illusory: if physicalism is true, then consciousness is just a 98: 349–366. physical property, and it simply begs the question against Nagel, T. (1974). What is it like to be a bat? Philosophical Review physicalism to wonder whether this can be true (see Lewis 4: 435–450. 1983). But Nagel’s criticism can be sharpened, as it has been Robinson, H. (1982). Matter and Sense. Cambridge: Cambridge by what Frank Jackson calls the “knowledge argument” University Press. (Jackson 1982; see also Robinson 1982). Jackson argues that even if we knew all the physical facts about, say, pain, we Further Readings would not ipso facto know what it is like to be in pain. Some- Armstrong, D. M. (1968). A Materialist Theory of the Mind. Lon- one omniscient about the physical facts about pain would don: Routledge and Kegan Paul. learn something new when they learn what it is like to be in Campbell, K. (1970). Body and Mind. New York: Doubleday. pain. 
Therefore there is some knowledge— knowledge of Churchland, P. M. (1986). Matter and Consciousness. Cambridge, WHAT-IT’S-LIKE—that is not knowledge of any physical fact. MA: MIT Press. Hence not all facts are physical facts. (For physicalist Flanagan, O. (1992). Consciousness Reconsidered. Cambridge, responses to Jackson’s argument, see Lewis 1990; Dennett MA: MIT Press. 1991; Churchland 1985.) Foster, J. (1991). The Immaterial Self: A Defence of the Cartesian In late-twentieth-century philosophy of mind, discus- Dualist Conception of the Mind. London: Routledge. sions of the mind-body problem revolve around the twin Levine, J. (1983). Materialism and qualia: The explanatory gap. Pacific Philosophical Quarterly 64: 354–361. poles of the problem of psychophysical causation and the Szubka, T., and R. Warner, Eds. (1994). The Mind-Body Problem: problem of consciousness. And while it is possible to see A Guide to the Current Debate. Oxford: Blackwell. these as independent problems, there is nonetheless a link between them, which can be expressed as a dilemma: if the mental is not physical, then how can we make sense of its Mind Design causal interaction with the physical? But if it is physical, how can we make sense of the phenomena of conscious- See INTRODUCTION: COMPUTATIONAL INTELLIGENCE; IN- ness? These two questions, in effect, define the contempo- TRODUCTION: PHILOSOPHY; COGNITIVE ARCHITECTURE rary debate on the mind-body problem. See also CAUSATION; CONSCIOUSNESS, NEUROBIOLOGY OF; EXPLANATORY GAP; QUALIA Minimalism —Tim Crane Minimalism is the latest (though still programmatic) devel- References opment of an approach to SYNTAX—transformational GEN- ERATIVE GRAMMAR—first developed by Noam Chomsky in Chalmers, D. (1996). The Conscious Mind: In Search of a Funda- the 1950s and successively modified in the four decades mental Theory. Oxford: Oxford University Press. Churchland, P. M. (1985). Reduction, qualia and the direct intro- since. 
The fundamental idea was and continues to be that a spection of brain states. Journal of Philosophy 82: 8–28. sentence is the result of some sort of computation producing Crane, T. (1995). The mental causation debate. Proceedings of the a derivation, beginning with an abstract structural represen- Aristotelian Society (supp. vol.) 69: 211–236. tation, sequentially altered by structure-dependent transfor- Davidson, D. (1970). Mental events. In L. Foster and J. Swanson, mations. The Minimalist Program maintains that these Eds., Experience and Theory. London: Duckworth, pp. 79–101. derivations and representations conform to an “economy” Davidson, D. (1993). Thinking causes. In J. Heil and A. Mele, criterion demanding that they be minimal in a sense deter- Eds., Mental Causation. Oxford: Oxford University Press, pp. mined by the language faculty: no extra steps in derivations, 1–17. no extra symbols in representations, and no representations Dennett, D. C. (1991). Consciousness Explained. Harmondsworth: beyond those that are conceptually necessary. Allen Lane. Minimalism 549 As articulated by Chomsky (1995), minimalism can best proceeds bottom-up: the most deeply embedded structural be understood in juxtaposition to its predecessor, the “Gov- unit is created first, then combined with the head of which ernment-Binding” (GB) model of Chomsky (1981, 1982). it is the complement to create a larger unit, and so on. Con- (It should be pointed out that Chomsky prefers the name sider first the following simplified example (assuming the “principles and parameters” for the model, reasoning that widely accepted “VP-internal subject hypothesis,” under government and binding are just two among many techni- which the subject is initially introduced into the structure cal devices in the theory, and not necessarily the most inside VP, then moves to an external position): important ones. 
For some discussion, see Chomsky and Las- (2) [IP The woman [I' will [VP t [ V' see [DP the man]]]]] nik 1993, a work that can be regarded as the culmination of the GB framework or the beginnings of minimalism.) In In the derivation of (2), first the noun (N) man is combined that model, there are four significant levels of representa- with the determiner (D) the to form the determiner phrase tion, related by derivation as in following diagram: (DP) the man. This DP then combines with the verb see to produce an intermediate projection V'. (Phrase labels of the (1) D(eep)-Structure X' type, while convenient for exposition, are largely hold- overs from earlier models, with no particular significance S(urface)-Structure within the minimalist approach.) The DP the woman is cre- ated in the same fashion as the man, and is combined with PF LF the V' to produce the VP. Next, this VP merges with the (Phonetic Form) (Logical Form) tense/inflectional element will producing I'. The DP the Items are taken from the LEXICON and inserted into the D- woman finally moves to the specifier position of I', yielding Structure in accord with their thematic (θ) relations the full clausal projection IP. In a more complicated deriva- (roughly, subject of . . . object of . . . etc.). Transformations tion, such as that yielding (3), the derivation of the embed- alter this D-Structure representation, the movement transfor- ded clause proceeds exactly as in the case of (2): mations leaving traces that mark the positions from which (3) I think the woman will see the man movement took place, eventually producing an S-Structure. Transformations of the same character (and arguably the (2) has combined with the verb think to produce a V', and so same transformations) continue the derivation to LF, the on. Notice that the movement of the woman to the embed- SYNTAX-SEMANTICS INTERFACE with the conceptual- ded subject position precedes the merger of the embedded intentional system of the mind (cf. 
LOGICAL FORM). Rules of sentence into the larger V', so that there is no one represen- the phonological component continue the derivation from S- tation following all lexical insertion and preceding all trans- Structure to PF, the interface with the articulatory-perceptual formations. That is, there is no D-Structure. system. The portion of the derivation between D-Structure On the other hand, S-Structure, persists in one trivial and S-Structure is often called “overt syntax”; that between sense: it is the point where the derivation divides, branching S-Structure and LF is called “covert syntax” because opera- toward LF on one path and toward PF on the other. The tions in that portion of the derivation have no phonetic more significant question is whether it has any of the further effects, given the organization in (1). Under the traditional properties it has in the GB framework. One of the primary view that a human language is a way of relating sound (more technical goals of the minimalist research program is to generally, gesture, as in SIGN LANGUAGES) and meaning, the establish that these further properties (involving Case and interface levels PF and LF are assumed to be ineliminable. binding, for instance) are actually properties of LF, contrary Minimalism seeks to establish that these necessary levels of to previous arguments (as suggested in Chomsky 1986, con- representation are the only levels. tra Chomsky 1981). The attempts to attain this goal gener- Introduced into syntactic theory by Chomsky (1965), D- ally involve more operations attributed to covert syntax than Structure was stipulated to be the locus of all lexical inser- in previous models. tion, the input to the transformational component, and, most Another technical goal is to reduce all constraints on rep- importantly, the representation determining thematic rela- resentation to bare output conditions, determined by the tions, as indicated above. 
Given traces, already a central part properties of the external systems that PF and LF must inter- of the GB theory, the role of D-Structure in determining the- face with. Internal to the computational system, the desider- matic relations becomes insignificant. Being theory internal, atum is that constraints on transformational derivations will the other arguments for its existence disappear under more be reduced to general principles of economy. Derivations recent developments of the theory. S-Structure, the terminus beginning from the same lexical choices (the numeration, in of overt syntax within the GB framework, has a number of Chomsky’s term) are compared in terms of number of steps, central properties, particularly concerning abstract case length of movements, and so on, with the less economical (dubbed “Case” by Chomsky 1980) and binding (structural ones being rejected. An example is the minimalist deduction constraints on anaphoric relations; cf. BINDING THEORY and of the Chomsky (1973) Superiority Condition, which demands that when multiple items are available for WH- ANAPHORA). In a partial return to the technical apparatus of pre-1965 MOVEMENT in a language, such as English, allowing only transformational theory (as in Chomsky 1955), minimal- one item to move, it is the “highest” item that will be chosen: ism has lexical items inserted “on-line” in the course of the (4) Who t will read what syntactic derivation, roughly in accord with the fundamen- tal notions of X-BAR THEORY and θ-theory. The derivation (5) *What will who read t. 550 Minimum Description Length Economy, in the form of “Shortest Move,” selects (4) over Boskovi c, Z. (1997). The Syntax of Nonfinite Complementation: é An Economy Approach. Cambridge, MA: MIT Press. (5) because the sentence-initial interrogative position Collins, C. (1997). Local Economy. Cambridge, MA: MIT Press. “needs” a Wh-expression, and is closer to the subject than it Freidin, R. (1997). Review of Noam Chomsky. 
The Minimalist is to the object. Many of the movement constraints falling Program, Language 73:571–582. under the Relativized Minimality Constraint of Rizzi (1990) Kayne, R. S. (1994). The Antisymmetry of Syntax. Cambridge, are susceptible to a parallel analysis. This constraint, which MA: MIT Press. had an important impact on the developing Minimalist Pro- Kitahara, H. (1997). Elementary Operations and Optimal Deriva- gram, forbids movement to a position of a certain type: head tions. Cambridge, MA: MIT Press. position, A(rgument type)-position, A' (non-A)-position Lasnik, H. (1993). Lectures on minimalist syntax. University of across an intervening position of the same type. Within the Connecticut Occasional Papers in Linguistics, 1. Storrs: Uni- minimalist approach, the effects of this constraint are taken versity of Connecticut. Lasnik, H., and M. Saito. (1991). On the subject of infinitives. In to fall under general economy constraints on derivation. L. M. Dobrin, L. Nichols, and R. M. Rodriguez, Eds., Papers Theoretical developments in the minimalist direction, from the Twenty-seventh Regional Meeting of the Chicago Lin- many well before minimalism was formulated as a program, guistic Society. Pt. 1, The General Session. Chicago: Chicago have generally led to greater breadth and depth of under- Linguistic Society, University of Chicago, pp. 324–343. standing. Thus there is reason to expect that the Minimalist Lasnik, H., and J. Uriagereka. (1988). A Course in GB Syntax: Program may eventually give rise to an articulated theory of Lectures on Binding and Empty Categories. Cambridge, MA: linguistic structure, one that can resolve the traditional ten- MIT Press. sion in linguistic theory between descriptive adequacy (the Uriagereka, J. (1998). Rhyme and Reason: An Introduction to Min- need to account for the phenomena of particular languages) imalist Syntax. Cambridge, MA: MIT Press. and explanatory adequacy (the goal of explaining how lin- Watanabe, A. (1996). 
Case Absorption and WH-Agreement. Dor- drecht: Kluwer. guistic knowledge arises in the mind so quickly and on the Zwart, C. J.-W. (1997). Morphosyntax of Verb Movement. A Mini- basis of such limited evidence). malist Approach to the Syntax of Dutch. Dordrecht: Kluwer. See also HEAD MOVEMENT; LINGUISTIC UNIVERSALS AND UNIVERSAL GRAMMAR; PARAMETER-SETTING APPROACHES TO ACQUISITION, CREOLIZATION, AND DIACHRONY Minimum Description Length —Howard Lasnik Minimum message length (MML) is a criterion for compar- ing competing theories about, or inductive inferences from, References a given body of data. A very similar criterion, also to be Chomsky, N. (1955). The logical structure of linguistic theory. described, is minimum description length (MDL). The basic Harvard University and Massachusetts Institute of Technology. concept behind both criteria is an operational form of Revised 1956 version published in part by Plenum Press, New Occam’s razor (see PARSIMONY AND SIMPLICITY). A “good” York (1975) and by University of Chicago Press, Chicago, theory induced from some data should enable the data to be (1985). encoded briefly. In human terms, a good theory introduces Chomsky, N. (1965) Aspects of the Theory of Syntax. Cambridge, concepts, assumptions, and inference rules or “laws” that, if MA: MIT Press. taken as true, allow much of the data to be deduced. Chomsky, N. (1973). Conditions on transformations. In S. Ander- For example, suppose the data to be measurements of the son and P. Kiparsky, Eds., A Festschrift for Morris Halle. New York: Holt, Rinehart and Winston, pp. 232–286. forces applied to several physical bodies, and their resulting Chomsky, N. (1980). On binding. Linguistic Inquiry 11: 1–46. accelerations, where each body is subjected to a number of Chomsky, N. (1981). Lectures on Government and Binding. Dor- experiments using different forces. If we propose the con- drecht: Foris. cept that each body has a “mass,” and the law that accelera- Chomsky, N. (1982). 
Some Concepts and Consequences of the tion is given by force divided by mass, then the given data Theory of Government and Binding. Cambridge, MA: MIT may be restated more briefly. If for each body, we state an Press. assumed value for its mass, then for each experiment on that Chomsky, N. (1986). Knowledge of Language. New York: Praeger. body we need only state the applied force because its accel- Chomsky, N. (1995). The Minimalist Program. Cambridge, MA: eration can then be deduced. In practice, matters are a little MIT Press. more complicated: we cannot expect the measured accelera- Chomsky, N., and H. Lasnik. (1993). The theory of principles and parameters. In J. Jacobs, A. von Stechow, W. Sternefeld, and T. tion in each case to equal the deduced value exactly because Vennemann, Eds., Syntax: An International Handbook of Con- of inaccuracies of measurement (and because our proposed temporary Research, vol. 1. Berlin: Walter de Gruyter, pp. 506– “law” is not quite right). Thus, for each experiment, the 569. Reprinted in Chomsky (1995). restatement of the data must include a statement of the small Rizzi, L. (1990). Relativized Minimality. Cambridge, MA: MIT amount by which the measured acceleration differs from the Press. deduced value, but if these corrections are sufficiently small, writing them out will need much less space than writ- Further Readings ing out the original data values. Note that the restated data are unintelligible to a reader Abraham, W., S. D. Epstein, H. Thráinsson, and C. Jan-Wouter who does not know the “law” we have induced and the body Zwart, Eds. (1996). Minimal Ideas: Syntactic Studies in the masses we have estimated from the original data. Thus we Minimalist Framework. Amsterdam: Benjamins. 
insist that the restated data be preceded by a statement of whatever theory, laws, and quantities we have inferred and assumed in the restating.

This leads to the idea of a special form of message for encoding data, termed an EXPLANATION. An explanation is a message that first states a theory (or hypothesis) about the data, and perhaps also some values estimated for unobserved concepts in the theory (e.g., the masses of the bodies in our example), and that then states the data in a short encoding that assumes the theory and values are correct. Equivalently, the second part of the message states all those details of the data which cannot be deduced from the first part of the message.

The MML principle is to prefer that theory which leads to the shortest explanation of the data. As the explanation contains a statement of the theory as well as the data encoded by assuming the theory, overly complex theories will not give the shortest explanations. A complex theory, by implying much about the data, gives a short second part, but a long first part. An overly simple theory needs only a short first part for its assertion, but has few or imprecise implications for the data, needing a long second part.

Message length can be quantified using INFORMATION THEORY: An event of probability P can be encoded in –log P binary digits, or bits, using base 2 logs. Hence the length of the second part of an explanation is just –log (probability of data given the theory). Also, if a Bayesian “prior probability” over theories is assumed, the length of the first part is –log (prior probability of theory). The shortest explanation then yields the theory of highest posterior probability, as do some other Bayesian methods. However, MML actually achieves a shorter message by choosing a code for theories that, in general, does not provide for all possible theories. Theories so similar that the available data cannot be expected to reliably distinguish between them are amalgamated and represented in the code by a single theory, credited with the sum of their prior probabilities. In particular, when a theory includes a real-valued parameter, only a subset of the possible values for the parameter is allowed for in the code. Although the prior over the parameter must be a density, this amalgamation gives a nonzero prior probability to each of the subset of parameter values. This makes MML invariant under transformations of the parameter space, unlike Bayesian maximum a posteriori density (MAP) methods.

MDL differs from MML in replacing Bayesian coding of theories with coding deemed to be efficient and “natural.” In practice, there is usually little difference in the two approaches, but in some cases MML gives better parameter estimates than MDL, which usually relies on maximum likelihood estimates.

General theoretical results show MML optimally separates information about general patterns in the data, which appears in the first part, from patternless “noise,” which appears in the second part. The method is statistically consistent: if the “true” model of the data is in the set of theories considered, enough data will reveal it. It also is not misleading: If there is no pattern in the data, no theory-based explanation will be shorter than the original statement of the data, thus no theory will be inferred. Successful applications of MML in MACHINE LEARNING include clustering, DECISION TREES, factor analysis, ARMA processes, function estimation, and BAYESIAN NETWORKS.

Message length can also be quantified via Kolmogorov complexity: The information in a given binary string is the length of the shortest input to a TURING machine causing output of the given string. This approach is equivalent to MML, with the set of possible “theories” including all computable functions. Practical applications, while very general, are limited by the undecidability of the “halting problem,” although such quantification may provide a basis for a descriptive account of scientific process, which seems to evolve and retain those theories best able to give short explanations.

See also COMPUTATIONAL COMPLEXITY; FORMAL SYSTEMS, PROPERTIES OF; FOUNDATIONS OF PROBABILITY

—Chris Wallace

Further Readings

Barron, A. R., and T. M. Cover. (1991). Minimum complexity density estimation. IEEE Transactions on Information Theory 37: 1034–1054.
Dowe, D. L., K. B. Korb, and J. J. Oliver, Eds. (1996). ISIS: Information, Statistics and Induction in Science. Singapore: World Scientific.
Kolmogorov, A. N. (1965). Three approaches to the quantitative definition of information. Problems of Information Transmission 1: 4–7.
Li, M., and P. M. B. Vitanyi. (1997). An Introduction to Kolmogorov Complexity and its Applications. 2nd ed. New York: Springer.
Rissanen, J. (1978). Modeling by shortest data description. Automatica 14: 465–471.
Rissanen, J., C. S. Wallace, and P. R. Freeman. (1987). Stochastic complexity: Estimation and inference by compact coding. J. Statist. Soc. B49(3): 223–265.
Rissanen, J. (1989). Stochastic Complexity in Statistical Inquiry. Singapore: World Scientific.
Solomonoff, R. J. (1964). A formal theory of inductive inference, parts 1 and 2. Information and Control 7: 1–22, 224–254.
Wallace, C. S., and D. M. Boulton. (1968). An information measure for classification. Computer Journal 11: 185–194.

Mobile Robots

Mobile robots have long held a fascination, from science fiction (Star Wars’ R2D2 and C3PO, Forbidden Planet’s Robby) to science fact. For artificial intelligence researchers, the lure has been to enable a machine to emulate the behavioral, perceptual, and cognitive skills of humans, and to investigate how an artifact can successfully interact, in real time, with an uncertain, dynamic environment. While most research has focused on autonomous navigation and on software architectures for autonomous robots, mobile robots have also been used for investigating planning, MANIPULATION AND GRASPING, learning, perception, human-robot interaction (such as gesture recognition), and robot-robot (multiagent) interaction. Representative applications for mobile robots include service robots (mail delivery, hospital delivery, cleaning), security, agriculture, mining, waste remediation and disposal, underwater exploration, and planetary exploration, such as NASA’s Sojourner Rover on Mars.

At the most basic level, mobile robots must perceive the world, decide what to do based on their goals and perceptions, and then act. The first two issues are the most challenging: how to reliably perceive the world with uncertain, unreliable sensors, and how to make rational decisions about what to do in the face of UNCERTAINTY about the perceived world and the effects of actions (both the robot’s and those of other agents). For example, a mobile robot for the home must perceive furniture, stairs, children, pets, toys, and so on, and decide how to act correctly in a myriad of different situations.

Early work in indoor mobile robots demonstrated some basic capabilities of robots to perceive their environment (e.g., Stanford’s CART; Moravec 1983), plan how to accomplish relatively simple tasks (e.g., SRI’s Shakey; Fikes, Hart, and Nilsson 1972), and successfully move about in (relatively) unstructured environments. Early outdoor mobile robots, such as Carnegie Mellon’s Navlab (Thorpe 1990), demonstrated similar capabilities for on-road vehicles. For these robot systems, speed was not much of an issue, nor was reacting to a dynamically changing environment—it was enough just to operate successfully using a sequential perceive/decide/act cycle in a previously unknown, albeit static, world. These and similar efforts resulted in new planning and plan execution algorithms (e.g., STRIPS; Fikes, Hart, and Nilsson 1972; see also PLANNING), new techniques to perceive the environment (Moravec 1983, 1988), and new software architectures for integrating and controlling complex robot systems (Thorpe 1990).

Starting in the mid-1980s, research focused more heavily on improving capabilities for navigating in more unstructured, dynamic environments, which typically involved handling greater levels of uncertainty. Notable advances were made in perceiving obstacles, avoiding obstacles, and in map-based navigation (following a map to get from location to location, without getting lost).

A key component to successful navigation is reliable perception of objects (obstacles) that may impede motion. For indoor mobile robots, a ring of ultrasonic sensors is often used for obstacle detection. These sensors give reasonably good range estimates, and the data can be improved by integrating measurements over time to reduce sensor noise (e.g., the occupancy grids of Moravec 1988). Outdoor robots often use stereo vision (see also MACHINE VISION and STEREO AND MOTION PERCEPTION). While computationally expensive, stereo has the advantages of being able to detect objects at greater distances and with good resolution. It also gives three-dimensional data, which is important for outdoor, rough-terrain navigation. Color vision has also been successfully employed, especially to detect boundaries, such as between walls and floors. Other researchers have investigated using lasers (e.g., Sojourner), radar, and infrared sensors to detect objects in the environment.

Deciding how to move is often divided into local navigation (obstacle avoidance), for reacting quickly, and global, map-based navigation, for planning good routes. Many techniques have been developed for avoiding obstacles while continuing to move toward some desired goal. The potential field approach (Khatib 1985; Arkin 1989) treats obstacles as repulsive forces and goals as attractors. The vector sum of all forces is used to determine the desired robot heading. This calculation can be done very quickly, enabling robots to avoid moving obstacles, as well. The vector field histogram approach (Borenstein and Koren 1991) essentially looks for wide openings through which to head. It overcomes some of the problems the potential field approach has in dealing with crowded environments. More recently, real-time obstacle avoidance algorithms have been developed that take the robot’s dynamics into account, enabling much higher travel speeds (Fox, Burgard, and Thrun 1997). Also, approaches using machine learning techniques to learn how to avoid obstacles have proven effective, especially for high-speed highway travel (Thorpe 1990; see also ROBOTICS AND LEARNING and REINFORCEMENT LEARNING).

Early approaches to global, map-based navigation were based on metric maps—geometrically accurate, usually grid-based, representations. To estimate position with respect to the map, robots would typically use dead reckoning (also called “internal odometry”—measuring position and orientation by counting wheel rotations). Such approaches were not very reliable: Position, and especially orientation, error would gradually increase until the robot was hopelessly lost. (On the other hand, widespread use of the Global Positioning System or GPS and inertial navigation units or INUs has made position estimation a nonissue for many outdoor robots.)

To avoid this problem of “dead reckoning error,” and to avoid the need for geometrically accurate maps, researchers developed landmark-based navigation schemes (Kuipers and Byun 1993; Kortenkamp and Weymouth 1994). In these approaches, the map is represented as a topological graph, with nodes representing landmarks (“important places,” such as corridor junctions or doorways) and with arcs representing methods for traveling from landmark to landmark (e.g., “Turn right and travel forward, using local obstacle avoidance”). Each landmark is associated with a set of features that can be used by the robot to determine when it has arrived at that place. Researchers have used both sonar (e.g., to detect corridor junctions) and vision as the basis for defining landmarks (Kortenkamp and Weymouth 1994). Landmark-based navigation can be fairly reliable and is readily amenable to learning a map of the environment. It has difficulties, however, in situations where landmarks can be easily confused (e.g., two junctions very near one another) or where the robot misses seeing particular landmarks.

To combine the best of the metric and landmark-based schemes, some researchers have investigated probabilistic approaches that explicitly represent map, actuator, and sensor uncertainty. One approach uses partially observable Markov decision process (POMDP) models to model the robot’s state—its position and orientation (Nourbakhsh, Powers, and Birchfield 1995; Simmons and Koenig 1995). The robot maintains a probability distribution over what state it is in, and updates the probability distribution based on Bayes’s rule as it moves and observes features (see also HIDDEN MARKOV MODELS). The robot associates actions with either states or probability distributions, and uses its belief state to decide how to best move. Such probabilistic navigation approaches tend to be very reliable: While the robot may never know exactly where it is, neither does it ever (or rarely) get lost. The approach has been used in an office delivery robot, Carnegie Mellon’s Xavier (Simmons et al. 1997; see figure 1), which has traveled over 150 kilometers with a 95 percent success rate, and in a museum tour guide robot, Bonn University’s Rhino, which has a greater than 98 percent success rate in achieving its intended tasks.

Figure 1.

A sequential perceive/decide/act cycle is often inadequate to ensure real-time response. Mobile robots must be able to do all this concurrently. To this end, much research in mobile robots has focused on execution architectures that support concurrent perception, action, and planning (see also INTELLIGENT AGENT ARCHITECTURE). Behavior-based architectures consist of concurrently operating sets of behaviors that process sensor information and locally determine the best action to take (Brooks 1986; see also BEHAVIOR-BASED ROBOTICS). The decisions of all applicable behaviors are arbitrated using different types of voting mechanisms. Behavior-based systems tend to be very reactive to change in the environment and can be very robust, but it is often difficult to ensure that unintended interactions among behaviors do not occur. An executive (or sequencer) is often used to manage the flow of control in robot systems. The executive is typically responsible for decomposing tasks into subtasks, sequencing tasks and dispatching them at the right times, and providing constructs for monitoring execution and handling exceptions (Firby 1987; Gat 1996; Simmons 1994). Tiered (or layered) architectures integrate planners, an executive, and behaviors in a hierarchical fashion, to enable very complex and reliable behavior (Bonasso et al. 1997). While such layered architectures are relatively new, they have proven to be quite flexible and are rapidly gaining popularity. Other architectural approaches include layers of more and more abstract feedback loops (Albus 1991) and architectures where the role of the planner is to provide schedules that can be guaranteed to run on a real-time system (Musliner, Durfee, and Shin 1993).

See also ANIMAL NAVIGATION; FOUNDATIONS OF PROBABILITY; WALKING AND RUNNING MACHINES

—Reid G. Simmons

References

Albus, J. (1991). Outline for a theory of intelligence. IEEE Transactions on Systems, Man and Cybernetics 21(3): 473–509.
Arkin, R. (1989). Motor schema-based mobile robot navigation. International Journal of Robotics Research 8(4): 92–112.
Bonasso, R. P., R. J. Firby, E. Gat, D. Kortenkamp, D. Miller, and M. Slack. (1997). Experiences with an architecture for intelligent, reactive agents. Journal of Experimental and Theoretical Artificial Intelligence 9(2).
Borenstein, J., and Y. Koren. (1991). The vector field histogram: Fast obstacle avoidance for mobile robots. IEEE Transactions on Robotics and Automation 7(3): 278–288.
Brooks, R. (1986). A robust layered control system for a mobile robot. IEEE Journal of Robotics and Automation RA-2(1).
Fikes, R., P. Hart, and N. Nilsson. (1972). Learning and executing generalized robot plans. Artificial Intelligence 3: 1–4.
Firby, R. J. (1987). An investigation into reactive planning in complex domains. In Proceedings of the Sixth National Conference on Artificial Intelligence, San Francisco: Morgan Kaufmann, pp. 202–206.
Fox, D., W. Burgard, and S. Thrun. (1997). The dynamic window approach to collision avoidance. IEEE Robotics and Automation 4(1): 23–33.
Gat, E. (1996). ESL: A language for supporting robust plan execution in embedded autonomous agents. In Proceedings AAAI Fall Symposium on Plan Execution: Problems and Issues. Boston, MA: AAAI Press, Technical Report FS-96-01.
Khatib, O. (1985). Real-time obstacle avoidance for manipulators and mobile robots. In Proceedings IEEE International Conference on Robotics and Automation, St. Louis, pp. 500–505.
Kortenkamp, D., and T. Weymouth. (1994). Topological mapping for mobile robots using a combination of sonar and vision sensing. In Proceedings of the Twelfth National Conference on Artificial Intelligence, Seattle.
Kuipers, B., and Y. Byun. (1993). A robot exploration and mapping strategy based on a semantic hierarchy of spatial representations. Journal of Robotics and Autonomous Systems 8: 47–63.
Moravec, H. (1983). The Stanford Cart and the CMU Rover. Proceedings of the IEEE 71: 872–884.
Moravec, H. (1988). Sensor fusion in certainty grids for mobile robots. AI Magazine 9(2): 61–74.
Musliner, D., E. Durfee, and K. Shin. (1993). CIRCA: A cooperative intelligent real-time control architecture. IEEE Transactions on Systems, Man, and Cybernetics 23(6).
Nourbakhsh, I., R. Powers, and R. Birchfield. (1995). DERVISH: An office-navigating robot. AI Magazine 16: 53–60.
Simmons, R. (1994). Structured control for autonomous robots. IEEE Transactions on Robotics and Automation 10(1): 34–43.
Simmons, R., and S. Koenig. (1995). Probabilistic navigation in partially observable environments. In Fourteenth International Joint Conference on Artificial Intelligence, San Francisco: Morgan Kaufmann, pp. 1080–1087.
Simmons, R., R. Goodwin, K. Z. Haigh, S. Koenig, and J. O’Sullivan. (1997). A layered architecture for office delivery robots. In First International Conference on Autonomous Agents. Marina del Rey, CA.
Thorpe, C., Ed. (1990). Vision and Navigation: The Carnegie Mellon Navlab. Norwell: Kluwer.

Further Readings

Arkin, R. (1998). Behavior-Based Robotics. Cambridge, MA: MIT Press.
Balabanovic, M., et al. (1994). The winning robots from the 1993 Robot Competition. AI Magazine.
Brooks, R. (1991). Intelligence without representation. Artificial Intelligence 47(1–3): 139–160.
Hinkle, D., D. Kortenkamp, and D. Miller. (1995). The 1995 Robot Competition and Exhibition (plus related articles in same issue). AI Magazine 17(1).
Jones, J., and A. Flynn. (1993). Mobile Robots: Inspiration to Implementation. Wellesley: Peters.
Kortenkamp, D., I. Nourbakhsh, and D. Hinkle. (1996). The 1996 AAAI Mobile Robot Competition and Exhibition (plus related articles in same issue). AI Magazine 18(1).
Kortenkamp, D., P. Bonasso, and R. Murphy, Eds. (1998). Artificial Intelligence and Mobile Robots. Cambridge, MA: MIT Press.
Raibert, M. (1986). Legged Robots That Balance. Cambridge, MA: MIT Press.
Simmons, R. (1994). The 1994 AAAI Robot Competition and Exhibition (plus related articles in same issue). AI Magazine 16(2).
Song, S., and K. Waldron. (1989). Machines That Walk: The Adaptive Suspension Vehicle. Cambridge, MA: MIT Press.

Modal Logic

In classical propositional LOGIC all the operators are truth-functional. That is to say, the truth or falsity of a complex formula depends only on the truth or falsity of its simpler propositional constituents. Modal logic is concerned to understand propositions about what must or might be the case. We might, for example, have two propositions alike in truth value, both true say, where one is true and could not possibly be false, while the other is true but might easily have been false. Thus it must be that 2 + 2 = 4, but while it is true that I am writing this entry, it might easily not have been. Modal logic extends the well-formed formulas (wff) of classical logic by the addition of a one-place sentential operator L (or □), interpreted as meaning “It is necessary that.” Using this operator, a one-place operator M (or ◊) meaning “It is possible that” may be defined as ~L~, where ~ is a (classical) negation operator, and a two-place operator ⥽ meaning “entails” may be defined as α ⥽ β =df L(α ⊃ β), where ⊃ is classical material implication. In fact, any one of L, M, or ⥽ can be taken as primitive, and the others defined in terms of it.

Although Lp is usually read as “Necessarily p,” it need not be restricted to a single narrowly conceived sense of “necessarily.” Very often, for example, when we say that something must be so, we can be taken to be claiming that it is so; and if we take L to express “must be” in this sense, we shall want to have it as a principle that whenever Lp is true, so is p itself. A system of logic that expresses this idea will have Lp ⊃ p as one of its valid formulas. On the other hand, there are uses of words such as “must” and “necessary” that express not what necessarily is so but rather what ought to be so, and if we interpret L in accordance with these uses, we shall want to allow the possibility that Lp may be true but p itself false because people do not always do what they ought to do. And fruitful systems of logic have been inspired by the idea of taking the necessity operator to mean, for example, “It will always be the case that,” “It is known that,” or “It is provable that.” In fact, one of the important features of modal logic is that out of the same basic material can be constructed a variety of systems that reflect a variety of interpretations of L, within a range that can be indicated, somewhat loosely, by calling L a “necessity operator.”

In the early days of modal logic, disputes centered round the question of whether a given principle of modal logic was correct. Typically, these disputes involved formulas in which one modal operator occurs within the scope of another—formulas such as Lp ⊃ LLp. Is a necessary proposition necessarily necessary? A number of different modal systems were produced that reflected different views about which principles were correct. Until the early sixties, however, modal logics were discussed almost exclusively as axiomatic systems without access to a notion of validity of the kind used, for example, in the truth table method for determining the validity of wff of the classical propositional calculus. The semantical breakthrough came by using the idea that a necessary proposition is one true in all possible worlds. But whether another world counts as possible may be held to be relative to the world of origin. Thus an interpretation or model for a modal system would consist of a set W of possible worlds and a relation R of accessibility between them. For any wff α and world w, Lα will be true at w iff α itself is true at every w′ such that wRw′. It can then happen that whether a principle of modal logic holds can depend on properties of the accessibility relation. Suppose that R is required to be transitive, that is, suppose that, for any worlds w1, w2, and w3, if w1Rw2 and w2Rw3, then w1Rw3. If so, then Lp ⊃ LLp will be valid, but if nontransitive models are permitted, it need not be. If R is reflexive, that is, if wRw for every world w, then Lp ⊃ p is valid. Thus different systems of modal logic can represent different ways of restricting necessity.

It is possible to extend modal logic by having logics that involve more than one modal operator. One particularly important class of multimodal systems is the class of tense logics. A tense logic has two operators, L1 and L2, where L1 means “it always will be the case that” and L2 means “it always has been the case that.” (In a tense logic L1 and L2 are often written G and H, with their possibility versions as P for ~H~, and F for ~G~.) More elaborate families of modal operators are suggested by possible interpretations of modal logic in computer science. In these interpretations the “worlds” are states in the running of a program. If π is a computer program, then [π]α means that after program π has been run, α will be true. If w is any “world,” then wRπw′ means that state w′ results from the running of program π. This extension of modal logic is called “dynamic logic.”

First-order predicate logic can also be extended by the addition of modal operators. The most interesting consequences of such extensions are those which affect “mixed” principles, principles that relate quantifiers and modal operators and that cannot be stated at the level of modal propositional logic or nonmodal predicate logic. Thus where α is any wff, ∃xLα ⊃ L∃xα is valid, but for some wff L∃xα ⊃ ∃xLα need not be. (Even if a game must have a winner, there need be no one who must win.) In some cases the principles of the extended system will depend on the propositional logic on which it is based. An example is the schema ∀xLα ⊃ L∀xα (often known as the “Barcan formula”), which is provable in some modal systems but not in others. If both directions are assumed, so that we have ∀xLα ≡ L∀xα, then this formula expresses the principle that the domain of individuals is held constant as we move from one world to another accessible world.

When identity is added, even more questions arise. The usual axioms for identity easily allow the derivation of (x = y) ⊃ L(x = y), but should we really say that all identities are necessary? Questions like this bring us to the boundary between modal logic and metaphysics and remind us of the rich potential that the theory of possible worlds has for illuminating such issues. POSSIBLE WORLDS SEMANTICS can be generalized to deal with any operators whose meanings are operations on propositions as sets of possible worlds, and form a congenial tool for those who think that the meaning of a sentence is its truth conditions, and that these should be taken literally as a set of possible worlds—the worlds in which the sentence is true. Such generalizations give rise to fruitful tools in providing a framework for semantical theories for natural languages.

See also LOGICAL FORM; NONMONOTONIC LOGICS; QUANTIFIERS

—Max Cresswell

Further Readings

Chellas, B. F. (1980). Modal Logic: An Introduction. Cambridge: Cambridge University Press.
Hughes, G. E., and M. J. Cresswell. (1996). A New Introduction to Modal Logic. London: Routledge.

Modeling Neuropsychological Deficits

Patterns of cognitive breakdown after brain damage in humans can often be interpreted in terms of damage to particular components of theories of normal cognition developed within cognitive science. Along with the new methods of functional neuroimaging, neurological impairments of cognition provide us with prime evidence about the organization of cognitive systems in the human brain. Yet neuropsychologists have long been aware that the relation between a behaviorally manifest cognitive deficit and an underlying cognitive lesion may be complex. As early as the nineteenth century, authors such as John Hughlings-Jackson (1873) cautioned that the brain is a distributed and highly interactive system, such that local damage to one part can unleash new modes of functioning in the remaining parts of the system. As a result, one cannot assume that a patient’s behavior following brain damage is the direct result of a simple subtraction of one or more components of the mind, with those that remain functioning normally. More likely, it results from a combination of the subtraction of some components, and changes in the functioning of other components that had previously been influenced by the missing components. At stake in deciding between these two types of account is not only our understanding of cognition in neurological patients but also the inferences we draw from such patients about the organization of the normal cognitive system.

Computational modeling provides a conceptual framework, and concrete tools, for reasoning about the effects of local lesions in distributed, interactive systems such as the brain (Farah 1994). It has proved helpful in understanding a number of different neuropsychological disorders. In the second part of this article, three examples will be presented of computational models that provide alternative interpretations of a neuropsychological disorder, with correspondingly different implications for theories of normal cognition.

Many of the computational models used in neuropsychology are parallel distributed processing (PDP) models (see COGNITIVE MODELING, CONNECTIONIST and NEURAL NETWORKS), which share certain features with what is known of brain function. These brain-like features include the use of distributed representations, the large number of inputs to and outputs from each unit, the modifiable connections between units, the existence of both inhibitory and excitatory connections, summation rules, bounded activations, and thresholds. Of course, there are many important differences between the computation of PDP models and real brains; for example, even the biggest PDP networks are tiny compared to the brain, PDP models have just one kind of “unit,” compared to a variety of types of neurons, and just one kind of activation (which can act excitatorily or inhibitorily) rather than a multitude of different neurotransmitters, and so on. Computational architectures other than PDP, which have fewer patent correspondences to real neural computation, have also been used to mediate inferences between the behavioral impairments of brain-damaged patients and theories of normal cognition. The final example to be summarized here is a production system model (see also PRODUCTION SYSTEMS), which sacrifices some explicit resemblances to brain function in the service of making explicit other key aspects of the theory used to explain patient behavior.

Computational models in neuropsychology, like all models in science, are simplifications of reality, with some theory-relevant features and some theory-irrelevant ones. Our models allow us to find out what aspects of behavior, normal and pathological, can be explained by the theory-relevant attributes, that is, those that are shared with real brain function. Of course, some behavior may be explainable only with the incorporation of other features of neuroanatomy and neurophysiology not used in current computational models. But this is not a problem for models that already account well for patient data. In such cases, the only worry is that the model’s success might depend on some theory-irrelevant simplification. We must be on the lookout for such cases, but also recognize that it is unlikely that the success of most models will happen to depend critically on their unrealistic features.

In closing, I provide pointers to three concrete examples of computational modeling in neuropsychology. Only the barest outlines can be given here of the questions to which the models are addressed, and the mechanisms by which the models provide answers.

Deep Dyslexia: Interpreting Error Types

Patients with a READING disorder known as “deep DYSLEXIA” make two very different types of reading errors, which have been interpreted as indicating that two functionally distinct lesions are needed to account for the reading errors of these patients. Deep dyslexic patients make semantic errors, that is, errors that bear a semantic similarity to the correct word, such as reading cat as “dog.” They also make visual errors, that is, errors that bear a visual (graphemic) similarity to the correct word, such as reading cat as “cot.” The fact that both semantic and visual errors are common in deep dyslexia has been taken to imply that deep dyslexic patients have multiple lesions, with one affecting the visual system and another affecting semantic knowledge.

about people, and names, but without any part of the network dedicated to awareness (Farah, O’Reilly, and Vecera 1993). The dissociations between overt and covert recognition observed in three different tasks were simulated by lesioning the visual face representations of the network. Our conclusion was that it is unnecessary to hypothesize separate cognitive components for recognition and awareness of recognition; covert recognition tasks are simply those that can tap the residual knowledge of a damaged visual system.

Frontal Lobe Impairments: Loss of an Executive System, or Working Memory?

Studies of frontal lobe function in nonhuman primates have overwhelmingly focused on WORKING MEMORY, the capacity to hold information “on-line” for an interval of seconds or minutes. By contrast, studies of frontal lobe function in humans have documented a broad array of abilities, including PLANNING, PROBLEM SOLVING, sequencing, and inhibiting impulsive responses (Kimberg, D’Esposito, and Farah 1997). The diversity of abilities affected, and their “high-level” nature, has led many to infer that the cognitive system contains a supervisory “executive,” residing in the frontal lobes.

With the animal literature in mind, Dan Kimberg and I wondered whether damage to working memory might produce the varied and apparently high-level behavioral impairments associated with frontal lobe damage (Kimberg and Farah 1993). We used a production system architecture because it makes very explicit the process of weighing different sources of information to select an action. We found that damaging working memory resulted in the system failing a variety of frontal-sensitive tasks, and indeed commit-
How- ting the same types of errors as frontal-damaged patients. ever, Hinton and Shallice (1991) showed that a single lesion This could be understood in terms of the decreased influ- (removal of units) in an attractor network that has been ence of working memory on action selection, and the conse- trained to associate visual patterns with semantic patterns is quently greater contribution of other influences, including sufficient to account for these patients’ errors. Indeed, they priming of recently executed actions and habit. We con- showed that mixtures of error types will be the rule, rather cluded that the behavior of frontal-damaged patients does than the exception, when a system normally functions to not imply the existence of an executive. transform the stimulus representation from one form that See also COGNITIVE ARCHITECTURE; COGNITIVE MODEL- has one set of similarity relations (e.g., visual, in which cot ING, SYMBOLIC; LEXICON, NEURAL BASIS OF; VISUAL and cat are similar) to another form with different similarity NEGLECT; WORKING MEMORY, NEURAL BASIS OF relations (e.g., semantic, in which cot and bed are similar). —Martha J. Farah Covert Face Recognition: Dissociation Without Separate Systems References Prosopagnosia is an impairment of FACE RECOGNITION that De Haan, E. H., R. M. Bauer, and K. W. Greve. (1992). Behavioral can occur relatively independently of impairments in object and physiological evidence for covert face recognition in a recognition (Farah, Klein, and Levinson 1995; see OBJECT prosopagnosic patient. Cortex 28: 77–95. RECOGNITION, HUMAN NEUROPSYCHOLOGY). Recently it Farah, M. J. (1994). Neuropsychological inference with an interac- tive brain: A critique of the “locality assumption.” Behavioral has been observed that some prosopagnosic patients retain a and Brain Sciences 17: 43–61. high degree of face recognition ability when tested in cer- Farah, M. J., K. L. Klein, and K. L. Levinson. (1995). 
Face percep- tain ways (“covert recognition”), while performing poorly tion and within-category discrimination in prosopagnosia. Neu- on more conventional tasks (“overt recognition”) and pro- ropsychologia 33: 661–674. fessing no conscious awareness of face recognition. This Farah, M. J., R. C. O’Reilly, and S. P. Vecera. (1993). Dissociated has been taken to imply that recognition and awareness overt and covert recognition as an emergent property of a depend on dissociable and distinct brain systems (De Haan, lesioned neural network. Pychological Review 100: 571–588. Bauer, and Greve 1992). My colleagues and I were able to Hinton, G. E., and T. Shallice, (1991). Lesioning an attractor net- account for covert recognition with a network consisting of work: Investigations of acquired dyslexia. Psychological units representing facial appearance, general information Review 98: 74–95. Modularity and Language 557 to use that same structure again even under circumstances Hughlings-Jackson, J. (1873). On the anatomical and physiologi- cal localization of movements in the brain. Lancet 1: 84–85, where there is no semantic relation between the two sen- 162–164, 232–234. tences. This result is expected if syntax is a specialized Kimberg, D.Y., M. D’Esposito, and M. J. Farah. (1997). Frontal component of a modular language processor in the narrow lobes: Cognitive neuropsychological aspects. In T. E. Feinberg sense. Similarly in comprehension, Clifton (1993) has and M. J. Farah, Eds., Behavioral Neurology and Neuropsy- shown that a phrase following an optionally transitive verb chology. New York: McGraw-Hill. is first analyzed as an object of that verb even if semantic Kimberg, D. Y., and M. J. Farah. (1993). A unified account of cog- properties of the clause dictate that ultimately it must be the nitive impairments following frontal lobe damage: The role of subject of a following clause. Both studies provide evidence working memory in complex, organized behavior. 
Journal of for the existence of autonomous syntactic structures that Experimental Psychology: General 112: 411–428. participate in language processing. Laying out a very specific modularity thesis, Fodor Modularity and Language (1983) hypothesizes that perceptual, or “input,” systems share several important characteristics. They apply only to a Is language a separate mental faculty? Or is it just part of a limited domain, visual inputs, for example. Only domain- monolithic general-purpose cognitive system? The idea that specific information (e.g., visual information) is applied in the human mind is composed of distinct faculties was hotly the input system. The operation of the input system is fast debated in the nineteenth century, and the debate continues and reflexive (automatic and mandatory). There is limited today. Genetic disorders such as Williams syndrome show access to the intermediate representations computed— that individuals who cannot count to three or solve simple essentially only the final output of the system is available to spatial tasks nevertheless develop remarkable language other systems. Each input system is biologically determined skills that resemble those of a fully fluent and proficient sec- in the sense that its development exhibits a characteristic ond-language learner (Bellugi et al. 1993; Karmiloff-Smith pace and sequence, it has an associated neural basis, and et al. 1997). The striking disparity in the levels of attainment when that neural basis is damaged, characteristic deficits of Williams syndrome individuals in different cognitive result. Fodor claims that language is an input system—it domains clearly argues for differentiated mental capacities. exhibits all of the properties of a perceptual system. Studies of the brain lead to the same conclusion. 
The left To argue that the language system, or more accurately hemisphere of the brain is the language-dominant hemi- the grammatical system, is domain-specific, Fodor notes sphere for right-handed individuals. In 1861, Paul BROCA that perceivers integrate auditory and visual information about a speech input, perceiving an “average” when the two identified the third frontal gyrus of the language-dominant conflict. Presented with a videotaped speaker forming a hemisphere as an important language area. Performing consonant in the back of the mouth (”ga”) and a synchro- autopsies on brain-damaged individuals with expressive dif- nized auditory input “ba,” perceivers will integrate the for- ficulties characterized by slow, effortful “telegraphic” ward (labial) articulation of the “ba” with the evidence speech, he found that their lesions involved the third frontal about the place of articulation of “ga” from the video and gyrus, now known as “Broca’s area.” Using modern neu- perceive “da”—a sound produced farther back in the mouth roimaging techniques, Smith and Jonides (1997) have impli- than labials such as “ba” but farther forward than sounds cated Broca’s area specifically in the rehearsal of material in such as “ga.” This is known as the “McGurk effect” verbal WORKING MEMORY in normal adults, showing an (McGurk and MacDonald 1976). The point about the increase in activation with increasing memory load. Spatial McGurk effect is that it is not simply a guess on the part of tasks requiring active maintenance of spatial information in confused perceivers about how they can resolve conflicting working memory do not activate Broca’s area in the left perceptual inputs. It is an actual perceptual illusion, as hemisphere but rather the premotor cortex of the right hemi- expected if SPEECH PERCEPTION is part of an input system sphere. 
Though the overall picture of language representa- tion in the brain is far from clear, the debates today mostly specialized for speech inputs. concern, not whether areas specialized for language (or The most controversial aspect of Fodor’s thesis is the object recognition or spatial relations) exist, but how these claim that language processing is “informationally encapsu- areas are distributed in the brain and organized. lated,” that only domain-specific information is consulted In studies of normal adult language processing, modular- within a module. The question is whether the massive ity is discussed in broad terms where the questions concern effects of nonlinguistic world knowledge actually occur the separability of grammatical processing from general after an initial hypothesis has already been identified within cognitive processing and in narrow terms where the ques- the language module on the basis of purely linguistic knowl- tions concern the isolability of distinct components of the edge or whether world knowledge can direct the grammati- grammar. One central question is whether a syntactic parser cal processing of the input. Word recognition is one area exists that is concerned only with the construction of syntac- where this debate has been played out. tic structure in production or the identification of syntactic Word recognition studies demonstrate that both mean- structure in comprehension. In studies of sentence produc- ings of an ambiguous word are activated (Swinney 1979), at tion, Bock (1989) presents evidence for purely syntactic least when the word occurs in a semantically neutral sen- priming not dependent on semantic content or the particular tence. In a semantically biased sentence favoring the more words in a sentence. 
In other words, having just produced a frequent meaning of the ambiguous word, only the domi- sentence with a particular syntactic structure, speakers tend nant (frequent) meaning of the word is activated, suggesting 558 Modularity of Mind that perhaps word recognition is not informationally encap- Reinhart, T. (1983). Anaphora and Semantic Interpretation. Lon- don: Croom Helm. sulated. On the other hand, in a sentence biased toward the Smith, E. E., and J. Jonides. (1997). Working memory: A view less frequent meaning of the ambiguous word, both the con- from neuroimaging. Cognitive Psychology 33: 5–42. textually appropriate and the contextually inappropriate Swinney, D. (1979). Lexical access during sentence comprehen- meaning of the ambiguous words are activated, as we would sion: (Re-)consideration of context effects. Journal of Verbal expect if word recognition were informationally encapsu- Learning and Verbal Behavior 18: 645–659. lated. Proponents of modularity take heart in this latter find- ing and explain the former finding in terms of the frequent Modularity of Mind meaning being activated and accepted so quickly that it can inhibit the activation of the less frequent meaning (Rayner Two influential theoretical positions have permeated cogni- and Frazier 1989). Opponents of modularity focus on the tive science: (1) that the mind/brain is a general-purpose former finding and note that it indicates that context can problem solver (NEWELL and Simon 1972; PIAGET 1971); influence word recognition under at least some circum- and (2) that it is made up of special-purpose modules stances (Duffy, Morris, and Rayner 1988, for example). (Chomsky 1980; Fodor 1983; Gardner 1985). The concept Ultimately, the survival of Fodor’s modularity thesis may of modular organization dates back to KANT (1781/1953) depend on its explanatory value. Precisely because a module and to Gall’s faculty theory (see Hollander 1920). 
But it was operates mandatorily and consults only restricted informa- the publication of Fodor’s Modularity of Mind (1983) that tion (identifiable in advance of any particular input), the set the stage for recent modularity theorizing and which identification, access, or computation of information can be provided a precise set of criteria about what constitutes a fast. The grammatical processor’s job is a structured and module. limited one. Fodor holds that the mind is made up of genetically spec- If the grammar or grammatical subsystems act as mod- ified, independently functioning modules. Information from ules, it also becomes less surprising that grammars have the the external environment passes first through a system of eccentric properties that they do, relying on strict module- sensory transducers that transform the data into formats internal notions of prominence such as “c-command” (Rein- each special-purpose module can process. Each module, in hart 1983), rather than on generally available notions based turn, outputs data in a common format suitable for central, on, say, precedence, loudness, or the importance of the domain-general processing. The modules are deemed to be information conveyed. For many linguists, it is the reappear- hardwired (not assembled from more primitive processes), ance within and across languages of the same peculiar of fixed neural architecture (specified genetically), domain- notion of prominence or locality that most convincingly specific (a module computes a constrained class of specific argues that grammars form, not loose associations of inputs bottom-up, focusing on entities relevant only to its “biases” or co-occurrence probabilities, but specialized sys- particular processing capacities), fast, autonomous, manda- tems or modules. 
tory (a module’s processing is set in motion whenever rele- See also DOMAIN SPECIFICITY; INNATENESS OF LAN- vant data present themselves), automatic, stimulus-driven, GUAGE; MODULARITY OF MIND; SPOKEN WORD RECOGNI- and insensitive to central cognitive goals. A further charac- TION; VISUAL WORD RECOGNITION teristic of modules is that they are informationally encapsu- lated. In other words, other parts of the mind can neither —Lyn Frazier influence nor have access to the internal workings of a mod- ule, only to its outputs. Modules only have access to infor- References mation from stages of processing at lower levels, not from Bellugi, U., S. Marks, A. Bihrle, and H. Sabo. (1993). Dissociation top-down processes. Take, for example, the Muller-Lyer between language and cognitive function in Williams syn- illusion, where, even if a subject explicitly knows that two drome. In D. Bishop and K. Mogford, Eds., Language Develop- lines are of equal length, the perceptual system cannot see ment in Exceptional Circumstances. Mawah, NJ: Erlbaum. them as equal. Explicit knowledge about equal line length, Bock, K. (1989). Closed class immanence in sentence production. available in what Fodor calls the “central system,” cannot Cognition 31: 163–186. infiltrate the perceptual system’s automatic, mandatory Clifton, C. (1993). Thematic roles in sentence parsing. Canadian computation of relative lengths. Journal of Psychology 47: 222–246. For Fodor, it is the co-occurrence of all the properties Duffy, S. A., R. K. Morris, and K. Rayner. (1988). Lexical ambigu- discussed above that defines a module. Alone, particular ity and fixation times in reading. Journal of Memory and Lan- guage 27: 429–446. properties do not necessarily entail modularity. For instance, Fodor, J. A. (1983). Modularity of Mind. Cambridge, MA: MIT automatic, rapid processing can also take place outside Press. input systems such as in skill learning (Anderson 1980). Karmiloff-Smith, A., J. Grant, I. Berthoud, M. 
Davies, P. Howlin, Task-specific EXPERTISE should not be confounded with the and O. Udwin. (1997). Language and Williams syndrome: How Fodorian concept of a module. Rather, each module is like a intact is “intact”? Child Development 68(2): 246–262. special-purpose computer with a proprietary database. A McGurk, H., and J. MacDonald. (1976). Hearing lips and seeing Fodorian module can only process certain types of data; it voices. Nature 264: 746–748. automatically ignores other, potentially competing input. Rayner, K., and L. Frazier. (1989). Selection mechanisms in read- This enhances automaticity and speed of computation by ing lexically ambiguous words. Journal of Experimental Psy- ensuring that the organism is insensitive to many potential chology: Learning, Memory and Cognition 15: 779–790. Modularity of Mind 559 classes of information from other input systems and to top- more general impairments have frequently been brought to down expectations from central processing. In other words, light (e.g., Bishop 1997; Frith 1989; Pennington and Welsh Fodor divides the mind/brain into two very different parts: 1995). In other words, abnormal development does not innately specified modules and the nonmodular central pro- point to isolated, prespecified modules divorced from the cesses responsible for deductive reasoning and the like. rest of the cognitive, motor, and emotional systems. Genetic Fodor’s modularity theory had a strong impact on impairments affect various aspects of the developmental researchers in cognitive development. Until the 1980s process, in some domains very subtly and in others more BEHAVIORISM and Piaget’s constructivism had been domi- seriously. nant forces in development. Both these theories maintain In normal development, too, new research is also point- that the infant and child learn about all domains—SYNTAX, ing to gradual specialization rather than prespecification. 
SEMANTICS, number, space, THEORY OF MIND, physics, and Take the case of syntax, a particularly popular domain for so forth—via a single set of domain-general mechanisms claimants of modularity. Brain imaging studies of infants (the actual types of mechanism invoked are very different in and toddlers have shown a changing pattern of HEMISPHERIC the two theories). By contrast, with Chomskyan linguistics SPECIALIZATION (Mills, Coffey-Corina, and Neville 1993, and Fodorian modularity, a sizable number of developmen- 1994). Initially, the infant processes syntax in various parts talists opted for an innately specified, modular view of the of the brain across both hemispheres. It is only with time infant mind. Not only did Chomskyan psycholinguists argue that parts of the left hemisphere become increasingly spe- for the innately specified modularity of syntax (e.g., Smith cialized. This also obtains for other aspects of language and and Tsimpli 1995; see Garfield 1987; but see also Marslen- for face processing in which infant imaging studies using Wilson and Tyler 1987 for a different view), but develop- high-density ERPs show progressive localization and spe- mentalists also supported a modular view of semantics cialization (Johnson 1997). The human cortex takes time to (Pinker 1994), of theory of mind (Anderson 1992; Baron- structure itself as a function of complex interactions at mul- Cohen 1995; Leslie 1988), of certain aspects of the infant’s tiple levels: differential timing of the development of parts knowledge of physics (Spelke et al. 1992; but see Bail- of cortex, the predispositions each part has for different largéon 1994 for a different view), and of number in the types of computation, and the structure of the inputs it form of a set of special-purpose, number-relevant principles receives (for detailed discussion, see for example Elman et (Gelman and Gallistel 1978). al. 
1996; Johnson 1997; Quartz and Sejnowsky forthcom- Data from normal adults whose brains become damaged ing). While there may be prespecification at the cellular from stroke or accident seem to support the modular view level, this does not seem to hold for synaptogenesis at the (Butterworth, Cipolotti, and Warrington 1996; Caramazza, cortical level. Specialized circuitry and the rich network of Berndt, and Basili 1983). Indeed, brain-damaged adults connections between cells appear to develop as a function of often display dissociations where, say, face processing is experience, which challenges the notion of prespecified impaired, while other aspects of visual-spatial processing modules. are spared, or where semantics is spared in the face of Although the fully developed adult brain may include a impaired syntax, and so forth. On the other hand, several number of module-like structures, it does not follow that authors have now challenged these seemingly clear-cut dis- these must be innately specified. Given the lengthy period sociations, demonstrating, for instance, that supposedly of human postnatal brain development and what we know damaged syntax can turn out to be intact if one uses on-line about the necessary and complex interaction of the genome tasks tapping automatic processes rather than off-line, meta- with environmental influences (e.g., Elman et al. 1996; linguistic tasks (e.g., Tyler 1992), and that a single underly- Johnson, 1997; Quartz and Sejnowsky 1997; Rose 1997), ing deficit can give rise to behavioral dissociations (Farah modules could be the product in adulthood of a gradual and McClelland 1991; Plaut 1995). developmental process (Karmiloff-Smith 1992), rather than Evidence from idiots savants (Smith and Tsimpli 1995) being fully prespecified, as Fodorians maintain. 
This is not and from persons having certain developmental disorders a return to a general-purpose, equipotential view of the (e.g., Baron-Cohen 1995; Leslie 1988; Pinker 1994) has infant brain. On the contrary, an alternative to representa- also been used to lend support to the modularity view. There tional nativism (the innate knowledge position on which are, for instance, developmental disorders where theory of modularity theory is based) has been proposed by several mind is impaired in otherwise high functioning people with theorists who have formulated hypotheses about what AUTISM (Frith 1989), or where face processing scores are in might be innately specified in terms of computational and the normal range but visuo-spatial cognition is seriously timing constraints, while leaving ample room for epige- impaired, as in the case of people with Williams syndrome netic processes (Elman et al. 1996; Quartz and Sejnowsky (Bellugi, Wang and Jernigan 1994). These data have led 1997). some theorists to claim that such modules must be innately While the concept of prespecified modules has been chal- specified because they are left intact or impaired in genetic lenged on a number of fronts, it has also become increas- disorders of development. Yet this claim has also been ingly clear that the general-purpose view of the brain is recently challenged. In almost every case of islets of so- inadequate. The human mind/brain is not a single, domain- called intact modular functioning, serious impairments general processing system, either in infancy or in adulthood. within the “intact” domain have subsequently been identi- Nor is the alternative a return to simple behaviorism. The fied (e.g., Karmiloff-Smith 1998; Karmiloff-Smith et al. genome and sociophysical environment both place con- 1997), and in cases of purported singular modular deficits, straints on development. 
A different way to conceive of 560 Modularity of Mind modularity might therefore be to adopt a truly developmen- Fodor, J. A. (1983). The Modularity of Mind. Cambridge, MA: MIT Press. tal perspective and acknowledge that the structure of minds Frith, U. (1989). Autism: Explaining the Enigma. Oxford: Black- could emerge from dynamically developing brains, whether well. normal or abnormal, in interaction with the environment. Gardner, H. (1985). Frames of Mind: The Theory of Multiple Intel- The long period of human postnatal cortical development ligences. London: Heinemann. and the considerable plasticity it displays suggest that pro- Garfield, J. L., Ed. (1987). Modularity in Knowledge Representa- gressive modularization may arise simply as a consequence tion and Natural-Language Understanding. Cambridge, MA: of the developmental process. Variations in developmental MIT Press. timing and the brain’s capacity to carry out subtly different Gelman, R., and C. R. Gallistel. (1978). The Child’s Understand- kinds of computation, together with differential structures in ing of Number. Cambridge, MA: Harvard University Press. the environmental input, could suffice to structure the brain Hollander, B. (1920). In Search of the Soul. New York: Dutton. Johnson, M. H. (1997). Developmental Cognitive Neuroscience. (Elman et al. 1996; Karmiloff-Smith 1992, 1995; Quartz and Oxford: Blackwell. Sejnowsky 1997; Rose 1997). Nativists of course recognize Kant, E. (1781/1953). Critique of Pure Reason. Translated N. K. that environmental input is essential to trigger developmen- Smith. New York: Macmillan. tal processes, but the environment only plays a very second- Karmiloff-Smith, A. (1992). Beyond Modularity: A Developmental ary role to the genome in such theories. In the alternative Perspective on Cognitive Science. Cambridge, MA: MIT Press. framework suggested above, there is no need to invoke Karmiloff-Smith, A. (1995). 
Annotation: The extraordinary cognitive journey from foetus through infancy. Journal of Child Psychology and Child Psychiatry 36(8): 1293–1313.

innate knowledge or representations to account for resulting specialization, because of variations in developmental timing, different learning algorithms together with information inherent in different environmental inputs would together play a central role in the dynamics of development and in the gradual formation of module-like structures.

See also LANGUAGE, NEURAL BASIS OF; MODULARITY AND LANGUAGE; NAIVE PHYSICS; NEURAL PLASTICITY

– Annette Karmiloff-Smith

References

Anderson, J. R. (1980). Cognitive Psychology and its Implications. San Francisco: Freeman.
Anderson, M. (1992). Intelligence and Development: A Cognitive Theory. Oxford: Blackwell.
Baillargeon, R. (1994). How do infants reason about the physical world? Current Directions in Psychological Science 3: 133–140.
Baron-Cohen, S. (1995). Mindblindness: An Essay on Autism and Theory of Mind. Cambridge, MA: MIT Press.
Bellugi, U., P. P. Wang, and T. L. Jernigan. (1994). Williams syndrome: An unusual neuropsychological profile. In S. H. Broman and J. Grafman, Eds., Atypical Cognitive Deficits in Developmental Disorders: Implications for Brain Function. Hillsdale, NJ: Erlbaum, pp. 23–56.
Bishop, D. V. M. (1997). Uncommon Understanding: Development and Disorders of Language Comprehension in Children. Hove, England: Psychology Press.
Butterworth, B., L. Cipolotti, and E. K. Warrington. (1996). Short-term memory impairment and arithmetical ability. Quarterly Journal of Experimental Psychology: Human Experimental Psychology 49A(1): 251–262.
Caramazza, A., R. S. Berndt, and A. G. Basili. (1983). The selective impairment of phonological processing: A case study. Brain and Language 18: 128–174.
Chomsky, N. (1980). Rules and Representations. New York: Columbia University Press.
Elman, J. L., E. Bates, M. H. Johnson, A. Karmiloff-Smith, D. Parisi, and K. Plunkett. (1996). Rethinking Innateness: A Connectionist Perspective on Development. Cambridge, MA: MIT Press.
Farah, M. J., and J. L. McClelland. (1991). A computational model of semantic memory impairment: Modality-specificity and emergent category-specificity. Journal of Experimental Psychology: General 120: 339–357.
Karmiloff-Smith, A. (1998). Development itself is the key to understanding developmental disorders. Trends in Cognitive Sciences 2(10): 389–398.
Karmiloff-Smith, A., J. Grant, I. Berthoud, M. Davies, P. Howlin, and O. Udwin. (1997). Language and Williams syndrome: How intact is “intact”? Child Development 68: 246–262.
Leslie, A. M. (1988). The necessity of illusion: perception and thought in infancy. In L. Weiskrantz, Ed., Thought without language. Oxford: Oxford University Press.
Marslen-Wilson, W. D., and L. K. Tyler. (1987). Against modularity. In J. L. Garfield, Ed., Modularity in Knowledge Representations and Natural-Language Understanding. Cambridge, MA: MIT Press.
Mills, D. L., S. A. Coffey-Corina, and H. J. Neville. (1993). Language acquisition and cerebral specialization in 20-month-old infants. Journal of Cognitive Neuroscience 5(3): 317–334.
Mills, D. L., S. A. Coffey-Corina, and H. J. Neville. (1994). Variability in cerebral organization during primary language acquisition. In G. Dawson and K. Fischer, Eds., Human Behavior and the Developing Brain. New York: Guilford Press.
Newell, A., and H. Simon. (1972). Human Problem Solving. Englewood Cliffs, NJ: Prentice Hall.
Pennington, B. F., and M. C. Welsh. (1995). Neuropsychology and developmental psychopathology. In D. Cicchetti and D. J. Cohen, Eds., Manual of Developmental Psychopathology, vol. 1. New York: Wiley, pp. 254–290.
Piaget, J. (1971). Biology and Knowledge. Chicago: University of Chicago Press. Originally published 1967.
Pinker, S. (1994). The Language Instinct. London: Lane.
Plaut, D. (1995). Double dissociation without modularity: Evidence from connectionist neuropsychology. Journal of Clinical and Experimental Neuropsychology 17: 291–321.
Quartz, S. R., and T. J. Sejnowski. (1997). A neural basis of cognitive development: A constructivist manifesto. Behavioral and Brain Sciences 20: 537–556.
Rose, S. (1997). Lifelines: Biology, Freedom, Determinism. London: Lane.
Smith, N. V., and I. M. Tsimpli. (1995). The Mind of a Savant: Language Learning and Modularity. Oxford: Blackwell.
Spelke, E. S., K. Breinlinger, J. Macomber, and K. Jacobson. (1992). Origins of knowledge. Psychological Review 99: 605–632.
Tyler, L. K. (1992). Spoken Language Comprehension: An Experimental Approach to Disordered and Normal Processing. Cambridge, MA: MIT Press.

Modulation of Memory

See MEMORY STORAGE, MODULATION OF

Monism

See ANOMALOUS MONISM; MIND-BODY PROBLEM

Monte Carlo Simulation

See GREEDY LOCAL SEARCH; RECURRENT NETWORKS

Morality

See CULTURAL RELATIVISM; ETHICS AND EVOLUTION; MORAL PSYCHOLOGY

Moral Psychology

Moral psychology is a branch of ethics. It concerns the features of human psychology whose study is necessary to the examination of the main questions of ethics, questions about what is inherently valuable, what constitutes human well-being, and what justice and decency toward others demand. Adequate examination of these questions requires an understanding of the primary motives of human behavior, the sources of pleasure and pain in human life, the capacity humans have for voluntary action, and the nature of such psychological states and processes as desire, emotion, conscience, deliberation, choice, character or personality, and volition. The study of these phenomena in relation to the main questions of ethics defines the field of moral psychology.

At the heart of this study are questions about the intellectual and emotional capacities in virtue of which human beings qualify as moral agents. Humans, in being capable of moral agency, differ from all other animals. This difference explains why human action, unlike the actions of other animals, is subject to moral assessment and why humans, unlike other animals, are morally responsible for their actions. At the same time, not every human being is morally responsible for his or her actions. Some, like the very young and the utterly demented, are not. They lack the capacities that a person must have to be morally responsible, capacities that equip people for understanding the moral quality of their actions and for being motivated to act accordingly. Full possession of these capacities is what qualifies a person as a moral agent, and it is the business of moral psychology to specify what they are and to determine what full possession of them consists in.

In modern ethics the study of these questions has largely concentrated on the role and importance of reason in moral thought and moral motivation. The overarching issue is whether reason alone, if fully developed and unimpaired, is sufficient for moral agency, and the field divides into affirmative and negative positions on this issue. Rationalist philosophers, among whom KANT is foremost in the modern period, defend the former. On their view, reason works not only to instruct one about the moral quality of one’s actions but also to produce motivation to act morally. Human beings, on this view, are moved by two fundamental kinds of desire, rational and nonrational. Rational desires have their source in the operations of reason, nonrational in animal appetite and passion. Accordingly, moral motivation, on this position, is a species of rational desire, and reason not only produces such desire but is also capable of investing it with enough strength to suppress the conflicting impulses of appetite and passion. Moral agency in human beings thus consists in the governance of appetite and passion by reason, and the possession of reason is therefore alone ordinarily sufficient to make one responsible for one’s actions.

The chief opposition to this view comes from philosophers such as HUME and Mill. They deny that reason is ever the source of moral motivation and restrict its role in moral agency to instructing one about the moral quality of one’s actions. On this view, all desires originate in animal appetite and passion, and reason works in the service of these desires to produce intelligent action, action that is well aimed for attaining the objects of the desires it serves. Consequently, the primary forms of moral motivation, on this position, the desire to act rightly, the aversion to acting wrongly, are not products of reason but are instead acquired through some mechanical process of socialization by which their objects become associated with the objects of natural desires and aversions. Moral agency in human beings thus consists in cooperation among several forces, including reason, but also including a desire to act rightly and an aversion to acting wrongly that originate in natural desires and aversions. Hence, because the acquisition of these desires and aversions is not guaranteed by the maturation of reason, the possession of reason is never alone sufficient to make one responsible for one’s actions.

This anti-rationalist view is typically inspired by, when not grounded in, the methods and theories of natural science as applied to human psychology. In this regard, the most influential elaboration of the view in twentieth-century thought is Freud’s. Applying the general principles of personality development central to his mature theory, FREUD gave an account of the child’s development of a conscience and a sense of guilt that explained the independence and seeming authority of these phenomena consistently with their originating in emotions and drives that humans, like other animals, possess innately. His account in this way speaks directly to the challenge that the rationalist view represents, for rationalists, such as Kant, make the independence and seeming authority of conscience the basis for attributing the phenomena of conscience, including their motivational force, to the operations of reason.

A second dispute between rationalists and their opponents concerns the nature of moral thought. Rationalists hold that moral thought at its foundations is intelligible independently of all sensory and affective experiences. It is, in this respect, like arithmetic thought at its foundations. Kant’s view again sets the standard. In brief, it is that the concepts and principles constitutive of moral thought are formal and universal, that their application defines an attitude of impartiality toward oneself and others, and that through their realization in action, that is, by following the judgments one makes in applying them, one achieves a certain kind of freedom, which Kant called autonomy. This view, unlike Kant’s view about moral motivation, which has little currency outside of philosophy, deeply informs various programs in contemporary developmental psychology, notably those of PIAGET and his followers, whose work on moral judgment and its development draws heavily on the formalist and universalist elements in Kant’s ethics.

Opponents of this view maintain that some moral thought is embodied by or founded on certain affective experiences. In this respect they follow common opinion. Sympathy, compassion, love, humanity, and attitudes of caring and friendship are commonly regarded as moral responses, and in the views of leading anti-rationalist thinkers one or another of these responses is treated as fundamental to ethics. Accordingly, the cognitions that each embodies or the beliefs about human needs and well-being (or the needs and well-being of other animals) that each presupposes and to which each gives force count on these views as forms of foundational moral thought. Such thought, in contrast to the rationalist conception, is not resolvable into formal concepts and principles, does not necessarily reflect an attitude of impartiality toward oneself and others, and brings through its realization, not autonomy, but connection with others. In contemporary developmental psychology, this view finds support in work on gender differences in moral thinking and on the origins of such thinking in the child’s capacity for empathy.

– John Deigh

References

Eisenberg, N., and J. Strayer, Eds. (1987). Empathy and Its Development. Cambridge: Cambridge University Press.
Freud, S. (1923). The Ego and the Id. New York: Norton.
Freud, S. (1931). Civilization and Its Discontents. New York: Norton.
Gilligan, C. (1982). In a Different Voice: Psychological Theory and Women’s Development. Cambridge, MA: Harvard University Press.
Hume, D. (1751). Enquiry Concerning the Principles of Morals. Indianapolis: Hackett.
Kant, I. (1788). Critique of Practical Reason. Indianapolis: Bobbs-Merrill.
Kohlberg, L. (1981). Essays on Moral Development, vol. 1. San Francisco: Harper and Row.
Mill, J. S. (1861). Utilitarianism. Indianapolis: Hackett.
Nagel, T. (1970). The Possibility of Altruism. Oxford: Oxford University Press.
Piaget, J. (1932). The Moral Judgment of the Child. New York: Free Press.

Further Readings

Blum, L. A. (1995). Moral Perception and Particularity. Cambridge: Cambridge University Press.
Deigh, J. (1992). Ethics and Personality: Essays in Moral Psychology. Chicago: University of Chicago Press.
Deigh, J. (1996). The Sources of Moral Agency: Essays in Moral Psychology and Freudian Theory. Cambridge: Cambridge University Press.
Dillon, R. S., Ed. (1995). Dignity, Character, and Self-Respect. New York: Routledge.
Flanagan, O. (1991). Varieties of Moral Personality: Ethics and Psychological Realism. Cambridge, MA: Harvard University Press.
Flanagan, O., and A. Rorty, Eds. (1990). Identity, Character, and Morality: Essays in Moral Psychology. Cambridge, MA: MIT Press.
Hoffman, M. L. (1982). Development of prosocial motivation: Empathy and guilt. In N. Eisenberg, Ed., The Development of Prosocial Behavior. New York: Academic Press, pp. 281–313.
Johnson, M. (1993). Moral Imagination: Implications of Cognitive Science for Ethics. Chicago: University of Chicago Press.
May, L., M. Friedman, and A. Clark, Eds. (1996). Mind and Morals: Essays on Ethics and Cognitive Science. Cambridge, MA: MIT Press.
Morris, H. (1976). On Guilt and Innocence: Essays in Legal Theory and Moral Psychology. Berkeley and Los Angeles: University of California Press.
Rawls, J. (1971). A Theory of Justice. Cambridge, MA: Harvard University Press.
Rorty, A. (1988). Mind in Action: Essays in the Philosophy of Mind. Boston: Beacon Press.
Rousseau, J.-J. (1763). Émile. New York: Basic Books.
Schoeman, F., Ed. (1988). Responsibility, Character, and the Emotions: New Essays in Moral Psychology. Cambridge: Cambridge University Press.
Stocker, M. (1990). Plural and Conflicting Values. Oxford: Oxford University Press.
Stocker, M. (1996). Valuing Emotions. Cambridge: Cambridge University Press.
Strawson, P. F. (1962). Freedom and resentment. Proceedings of the British Academy 48: 1–25.
Taylor, G. (1985). Pride, Shame and Guilt: Emotions of Self-Assessment. Oxford: Oxford University Press.
Thomas, L. (1989). Living Morally: A Psychology of Moral Character. Philadelphia: Temple University Press.
Williams, B. (1981). Moral Luck. Cambridge: Cambridge University Press.
Williams, B. (1993). Shame and Necessity. Berkeley and Los Angeles: University of California Press.
Wollheim, R. (1984). The Thread of Life. Cambridge, MA: Harvard University Press.
Wollheim, R. (1993). The Mind and its Depths. Cambridge, MA: Harvard University Press.

Morphology

Morphology is the branch of linguistics that deals with the internal structure of those words that can be broken down further into meaningful parts. Morphology is concerned centrally with how speakers of language understand complex words and how they create new ones.

Compare the two English words marry and remarry. There is no way to break the word marry down further into parts whose meanings contribute to the meaning of the whole word, but remarry consists of two meaningful parts and therefore lies within the domain of morphology. It is important to stress that we are dealing with meaningful parts. If we look only at sound, then marry consists of two syllables and four or five phonemes, but this analysis is purely a matter of PHONOLOGY and has nothing to do with meaningful structure and hence is outside morphology.

The first part of remarry is a prefix (re-), which means approximately ‘again’; it is joined together with the second component, the verb marry, to form another verb with the predictable meaning ‘marry again’. The same prefix re- occurs in many other words (e.g., reacquaint, redesign, refasten, and recalibrate) and can also be used to form novel words like redamage or retarget whose meaning is understood automatically by speakers of English. The additional fact that verbs like *reweep or *relike are impossible tells us that there are restrictions on this prefix. It is the morphologist’s job to discover the general principles that underlie our ability to form and understand certain complex words but not others.

Languages differ quite greatly in both the complexity and the type of their morphology. Some languages (e.g., Mandarin Chinese and Vietnamese) have very little in the way of morphology. Others (e.g., Turkish, Sanskrit, Swahili, and Navajo) are famous for their complex morphology. English falls somewhere in the middle. It is quite normal for an English sentence to contain no complex words. Even Shakespeare, who is commonly thought of as using complex language, tended to use morphologically simple words. That most famous of Shakespeare’s sentences, “To be or not to be, that is the question,” is morphologically very simple.

The atomic meaningful units of language are traditionally called morphemes. Morphemes are classified into two very basic types: free morphemes and bound morphemes, the difference between them being that a bound morpheme is defined in terms of how it attaches to (or is bound to) another form (called its stem). The most common types of bound morphemes are prefixes and suffixes. The other major device for forming complex words in English is compounding, whereby free morphemes are put together to form a word like doghouse or ready-made. Although these devices are quite simple, repeated application allows for the formation of fairly complex words by piling one prefix or suffix on another or by successive compounding. The word unmanageableness contains three bound morphemes and has been built up in stages from manage by first adding the suffix -able to produce manageable, then the prefix un- to form unmanageable, and finally the suffix -ness, resulting in [[un[[manage]Vable]A]Aness]N, where the labels V, A, and N mark each bracketed constituent as a verb, an adjective, or a noun.

Among the world’s languages, suffixation is the most common morphological device and there are quite a few languages (including Japanese, Turkish, and the very large Dravidian family of South India) that have no prefixes but permit long sequences of suffixes. Languages with many prefixes and few or no suffixes are quite rare (Navajo is one example). Some languages use infixes, which are placed at a specific place inside their stems. In Tagalog, the national language of the Philippines, for example, the infix -um- may be added in front of the first vowel of a stem, after any consonants that may precede it, to mark a completed event. Alongside takbuh ‘run’ and lakad ‘walk,’ we find tumakbuh ‘ran’ and lumakad ‘walked’. Another internal morphological device is to change a sound in the stem. A fairly small number of English verbs form their past tenses by changing the vowel: write/wrote, sing/sang, hold/held. These are irregular in English, because the vast majority of English verbs form their past tense by suffixation, but in some languages, most prominently the Semitic languages, this type of vowel substitution is normal. Consonants as well as vowels may be altered in a meaningful way. One example of this in English is the relation between nouns like cloth that end in voiceless fricatives (in which there is no vibration of the vocal folds) and corresponding verbs like clothe, which end in the corresponding voiced fricative sound. Bound morphemes thus modify the sound shape of the words to which they attach, either by adding something to the stem or by changing it. In the limiting case, known as conversion, there is no change at all in the shape of the word. This device is very common in English, where basic nouns like ship and sand are routinely turned into verbs, and verbs like run can similarly be turned into nouns.

Linguists distinguish derivational morphology from inflectional morphology. Derivational morphology, as just discussed, deals with how distinct words (lexemes) are related to one another; inflectional morphology is concerned with the different forms that a word may take, depending on its role in a sentence. English is quite poor inflectionally. Nouns have at most a singular and a plural form, and only pronouns have special case forms that depend on the role of the word in a sentence; regular verbs have only four distinctive forms, and the different tenses, aspects, and moods are formed by means of auxiliary verbs. But in many other languages, all nouns have distinct case forms, adjectives must often agree with the nouns that they modify, and verbs may have not only distinct forms for each TENSE AND ASPECT, voice and mood, but may also agree with their subject or object. In Classical Greek, each noun will have 11 different forms, each adjective 30, and every regular verb over 300. Other languages are even more complex in their inflection.

An important difference between morphology and SYNTAX is that morphological patterns vary greatly in their productivity, the ease with which new words can be created and understood. If we compare the three English noun-forming suffixes -ness, -ity, and -th, we find that there are many existing nouns ending in -ness, a somewhat smaller number ending in -ity, and only a dozen or so nouns ending in -th. The numbers correlate roughly with the productivity of new words: experiments show that a new noun ending in -ness will be more readily accepted as English than a new noun ending in -ity, and no new noun in -th has been added to the language since about 1600.

The study of productivity shows that the distinct lexemes and their forms comprise a complex network. A major research focus for the experimental study of morphology has been the nature of this network. The prevailing model is that speakers each have a mental LEXICON in which is stored every word of their language, inflected or derived, that has any unpredictable feature. Completely regular words are produced on the fly by productive patterns as they are needed and then discarded. Less productive patterns may also be used, but the words formed in these patterns are less predictable in their meaning and are more likely to be stored once they have been used.

Morphology lies at the heart of language. It interacts with syntax, phonology, SEMANTICS, PRAGMATICS, and the lexicon.
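The word-formation devices described in this entry (staged prefixation and suffixation, and Tagalog infixation) can be illustrated with a small Python sketch. It is only a toy under stated assumptions: the names Word, root, prefix, suffix, and infix_um are hypothetical, affixes are joined to spellings with no phonological adjustment, and restrictions of the kind that rule out *reweep are not modeled.

```python
from dataclasses import dataclass

VOWELS = set("aeiou")  # assumption: orthographic vowels stand in for vowel sounds


@dataclass(frozen=True)
class Word:
    """A word form with its labeled bracketing and category (V, A, or N)."""
    form: str
    bracket: str
    category: str


def root(form: str, category: str) -> Word:
    """Wrap a free morpheme, e.g. manage, as [manage]V."""
    return Word(form, f"[{form}]{category}", category)


def suffix(stem: Word, affix: str, category: str) -> Word:
    """Attach a bound morpheme after the stem and label the result."""
    return Word(stem.form + affix, f"[{stem.bracket}{affix}]{category}", category)


def prefix(affix: str, stem: Word, category: str) -> Word:
    """Attach a bound morpheme before the stem and label the result."""
    return Word(affix + stem.form, f"[{affix}{stem.bracket}]{category}", category)


def derive_unmanageableness() -> Word:
    """Build the word in the three stages given in the text."""
    manageable = suffix(root("manage", "V"), "able", "A")
    unmanageable = prefix("un", manageable, "A")
    return suffix(unmanageable, "ness", "N")


def infix_um(stem: str) -> str:
    """Place -um- in front of the first vowel, after any initial consonants."""
    for i, ch in enumerate(stem):
        if ch in VOWELS:
            return stem[:i] + "um" + stem[i:]
    raise ValueError(f"no vowel found in stem {stem!r}")


if __name__ == "__main__":
    word = derive_unmanageableness()
    print(word.form, word.bracket)
    print(infix_um("takbuh"), infix_um("lakad"))
    print(prefix("re", root("marry", "V"), "V").form)
```

Each derivational step returns a new labeled constituent, so the nesting of the brackets records the order in which the affixes were added. A fuller model would need the restrictions on affixes and the stored irregular forms that the entry discusses.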
Through this interaction, it also relates to numerous aspects of NATURAL LANGUAGE PROCESSING and cognition. Morphology can thus provide a window onto detailed aspects of language and cognition. The more we discover about the rest of language and cognition, the more important the study of morphology will become.

See also BINDING THEORY; LINGUISTIC UNIVERSALS AND UNIVERSAL GRAMMAR; POLYSYNTHETIC LANGUAGES; STRESS, LINGUISTIC; WORD MEANING, ACQUISITION OF

– Mark Aronoff

Further Readings

Aronoff, M. (1976). Word Formation in Generative Grammar. Cambridge, MA: MIT Press.
Bauer, L. (1983). English Word-Formation. Cambridge: Cambridge University Press.
Carstairs-McCarthy, A. (1992). Current Morphology. London: Routledge.
Feldman, L. B., Ed. (1995). Morphological Aspects of Language Processing. Hillsdale, NJ: Erlbaum.
Matthews, P. H. (1991). Morphology. 2nd ed. Cambridge: Cambridge University Press.
Mel’čuk, I. A. (1992). Cours de Morphologie Générale, vol. 1. Montreal: Presses de l’Université de Montréal.
Nida, E. A. (1949). Morphology: The Descriptive Analysis of Words. 2nd ed. Ann Arbor: University of Michigan Press.
Scalise, S. (1986). Generative Morphology. Dordrecht: Foris.
Spencer, A. (1991). Morphological Theory. Oxford: Blackwell.
Zwicky, A., and A. Spencer, Eds. (1998). Handbook of Morphology. Oxford: Blackwell.

Motion, Perception of

The visual environment of most animals consists of objects that move with respect to one another and to the observer. Detection and interpretation of these motions are not only crucial for predicting the future state of one’s dynamic world (as would be necessary to escape an approaching predator, for example) but also provide a wealth of information about the 3-D structure of the environment. Not surprisingly, motion perception is one of the most phylogenetically well conserved of visual functions. In primates, who rely heavily on vision, motion processing has reached a peak of computational sophistication and neuronal complexity.

The neuronal processes underlying perceived motion first gained widespread attention in the nineteenth century. Our present understanding of this topic is a triumph of cognitive science, fueled by coordinated application of a variety of techniques drawn from the fields of COMPUTATIONAL NEUROSCIENCE, ELECTROPHYSIOLOGY, ELECTRIC AND MAGNETIC EVOKED FIELDS, PSYCHOPHYSICS, and neuroanatomy. Most commonly the visual stimulus selectivities of individual neurons are assessed via the technique of SINGLE-NEURON RECORDING, and attempts are made to link selectivities to well-defined computational steps, to behavioral measures of perceptual state, or to specific patterns of neuronal circuitry. The product of this integrative approach has been a broad perspective on the neural structures and events responsible for visual motion perception.

Motion processing serves a number of behavioral goals, from which it is possible to infer a hierarchy of computational steps. An initial step common to all aspects of motion processing is detection of the displacement of retinal image features, a process termed “motion detection.” In the primate visual system, neurons involved in motion detection are first seen at the level of primary VISUAL CORTEX (area V1; Hubel and Wiesel 1968). Many V1 neurons exhibit selectivity for the direction in which an image feature moves across the retina and hence are termed “directionally selective.” These V1 neurons give rise to a larger subsystem for motion processing that involves several interconnected regions of the dorsal (or “parietal”) VISUAL PROCESSING STREAMS (Felleman and Van Essen 1991). Most notable among these cortical regions is the middle temporal visual area, commonly known as area MT (or V5), a small visuotopically organized area with a striking abundance of directionally selective neurons (Albright 1993).

Several detailed models have been proposed to account for neuronal motion detection (Borst and Egelhaaf 1993). The earliest was developed over forty years ago to explain motion sensitivity in flying insects. According to this model and its many derivatives, motion is computed through spatiotemporal correlation. This COMPUTATION is thought to be achieved neuronally via convergence of temporally staggered outputs from receptors with luminance sensitivity profiles that are spatially displaced. The results of electrophysiological experiments indicate that a mechanism of this type can account for directional selectivity seen in area V1 (Ganz and Felder 1984).

While motion detection is thus implemented at the earliest stage of cortical visual processing in primates, a number of studies (Shadlen and Newsome 1996) have demonstrated a close link between the discriminative capacity of motion-sensitive neurons at subsequent stages, particularly area MT, and perceptual sensitivity to direction of motion. Using a stimulus in which the “strength” of a motion signal can be varied continuously, Newsome and colleagues have shown that the ability of individual MT neurons to discriminate different directions of motion is, on average, comparable to that of the nonhuman primate observer in whose CEREBRAL CORTEX the neurons reside. In a related experiment, these investigators found that they could predictably bias the observer’s perceptual report of motion direction by electrically stimulating a cortical column of MT neurons that represent a known direction. Finally, direction discrimination performance was severely impaired by ablation of area MT. In concert, the results of these experiments indicate that MT neurons provide representations of image motion upon which perceptual decisions can be made.

Once retinal image motion is detected and discriminated, the resultant signals are used for a variety of purposes. These include (1) establishing the 3-D structure of a visual scene, (2) guiding balance and postural control, (3) estimating the observer’s own path of locomotion and time to collision with environmental objects, (4) parsing retinal image features into objects, and, perhaps most obviously, (5) identifying the trajectories of moving objects and predicting their future positions in order to elicit an appropriate behavioral response (e.g., ducking). Computational steps and corresponding neural substrates have been identified for many of these perceptual and motor functions.

Establishing 3-D scene structure from motion and estimating the path of locomotion, for example, both involve detection of complex velocity gradients in the image (e.g., rotation, expansion, tilt; see STRUCTURE FROM VISUAL INFORMATION SOURCES). Psychophysical studies demonstrate that primates possess fine sensitivity to such gradients (Van Doorn and Koenderink 1983) and electrophysiological evidence indicates that neurons selective for specific velocity gradients exist in the medial superior temporal (MST) area, and other higher areas of the parietal stream (Duffy and Wurtz 1991).

Establishing the trajectory of a moving object, another essential motion-processing function, is also an area of considerable interest. This task is fundamentally one of transforming signals representing retinal image motions, such as those carried by V1 neurons, into signals representing visual scene motions. Computationally, this transformation is complex (indeed, the solution is formally underconstrained), owing, in part, to spurious retinal image motions that are generated by the incidental overlap of moving objects. Contextual cues for visual scene segmentation play an essential role in achieving this transformation. This process has been explored extensively using visual stimuli that simulate retinal images rendered by one object moving past another (Stoner and Albright 1993). A variety of real-world contextual cues, including brightness differences (indicative of shading, transparency, or differential surface reflectance) and binocular positional disparity (“stereoscopic” cues), have been used in psychophysical studies to manipulate perceptual interpretation of the spatial relationships between the objects in the scene. This interpretation has, in turn, a profound influence upon the motion that is perceived. Electrophysiological experiments have been conducted using stimuli containing similar contextual cues for scene segmentation. Neuronal activity in area MT is altered by context, such that the direction of motion represented neuronally matches the direction of object motion perceived (Stoner and Albright 1992). These results suggest that the transformation from a representation of retinal image motion to one of scene motion occurs in, or prior to, area MT and is modulated by signals encoding the spatial relationships between moving objects.

The final utility of visual motion processing is, of course, MOTOR CONTROL: for example, reaching a hand to catch a ball, adjusting posture to maintain balance during figure skating, or using smooth eye movements to follow a moving target. The OCULOMOTOR CONTROL system is particularly well understood and has served as a model for investigation of the link between vision and action. The motion-processing areas of the parietal cortical stream (e.g., areas MT and MST) have anatomical projections to brain regions known to be involved in control of smooth pursuit EYE MOVEMENTS (e.g., dorsolateral pons). Electrophysiological data linking the activity of MT and MST neurons to smooth pursuit are plentiful. For one, the temporal characteristics of neuronal responses in area MT are correlated with the dynamics of pursuit initiation, suggesting a causal role. MST neurons respond well during visual pursuit; many even do so through the momentary absence of a pursuit target. The latter finding suggests that such neurons receive a “copy” of the efferent motor command, which may be used to interpret retinal motion signals during eye movements, as well as to perpetuate pursuit when the target briefly passes behind an occluding surface. Finally, neuropsychological studies have shown that smooth pursuit is severely impaired following damage to areas MT and MST. In concert, these studies demonstrate that cortical motion-processing areas, particularly MT and MST, forward precise measurements of object direction and speed to the oculomotor system to be used for pursuit generation. Similar visual-motor links are likely to be responsible for head, limb, and body movements.

As evident from the foregoing discussion, basic knowledge of the neural substrates of motion perception has come largely from investigation of nonhuman primates and other mammalian species. The general mechanistic and organizational principles gleaned from this work are believed to hold for the human visual system as well. Neuropsychological studies, in conjunction with recent advances in functional brain imaging tools such as MAGNETIC RESONANCE IMAGING (MRI) and POSITRON EMISSION TOMOGRAPHY (PET), have yielded initial support to this hypothesis. In particular, clinical cases of selective impairment of visual motion perception following discrete cortical lesions have been hailed as evidence for a human homologue of areas MT and MST (Zihl, von Cramon, and Mai 1983). Neuronal activity-related signals (PET and functional MRI) recorded from human subjects viewing moving stimuli have identified a motion-sensitive cortical zone in approximately the same location as that implicated from the effects of lesions (Tootell et al. 1995).

These observations from the human visual system, in combination with fine-scale electrophysiological, anatomical, and behavioral studies in nonhuman species, paint an increasingly rich portrait of cortical motion-processing substrates. Indeed, motion processing is now arguably the most well-understood sensory subsystem in the primate brain. As briefly revealed herein, one can readily identify the computational goals of the system, link them to specific loci in a distributed and hierarchically organized neural system, and document their functional significance in a real-world sensory-behavioral context. The technical and conceptual roots of this success provide a valuable model for the investigation of other sensory, perceptual, and cognitive systems.

See also ATTENTION IN THE HUMAN BRAIN; MACHINE VISION; MID-LEVEL VISION; OBJECT RECOGNITION, HUMAN NEUROPSYCHOLOGY; SPATIAL PERCEPTION; VISUAL ANATOMY AND PHYSIOLOGY

– Thomas D. Albright

References

Albright, T. D. (1993). Cortical processing of visual motion. In J. Wallman and F. A. Miles, Eds., Visual Motion and Its Use in the Stabilization of Gaze. Amsterdam: Elsevier, pp. 177–201.
Borst, A., and M. Egelhaaf. (1993). Detecting visual motion: Theory and models. In J. Wallman and F. A. Miles, Eds., Visual Motion and Its Use in the Stabilization of Gaze. Amsterdam: Elsevier, pp. 3–26.
Duffy, C. J., and R. H. Wurtz. (1991). Sensitivity of MST neurons to optic flow stimuli, I. A continuum of response selectivity to large-field stimuli. J. Neurophysiol. 65(6): 1329–1345.
Felleman, D. J., and D. C. Van Essen. (1991). Distributed hierarchical processing in the primate cerebral cortex. Cerebral Cortex 1: 1–47.
Ganz, L., and R. Felder. (1984). Mechanism of directional selectivity in simple neurons of the cat’s visual cortex analyzed with stationary flash sequences. J. Neurophysiol. 51(2): 294–324.
Hubel, D. H., and T. N. Wiesel. (1968). Receptive fields and functional architecture of monkey striate cortex. J. Physiol. 195: 215–243.
Shadlen, M. N., and W. T. Newsome. (1996). Motion perception: Seeing and deciding. Proc. Natl. Acad. Sci. U.S.A. 93(2): 628–633.
Stoner, G. R., and T. D. Albright. (1992). Neural correlates of perceptual motion coherence. Nature 358: 412–414.
Stoner, G. R., and T. D. Albright. (1993). Image segmentation cues in motion processing: Implications for modularity in vision. J. Cogn. Neurosci. 5(2): 129–149.
Tootell, R. B., J. B. Reppas, K. K. Kwong, R. Malach, R. T. Born, T. J. Brady, B. R. Rosen, and J. W. Belliveau. (1995). Functional analysis of human MT and related visual cortical areas using magnetic resonance imaging. J. Neurosci. 15(4): 3215–3230.
Van Doorn, A. J., and J. J. Koenderink. (1983). Detectability of velocity gradients in moving random-dot patterns. Vision Res. 23: 799–804.
Zihl, J., D. von Cramon, and N. Mai. (1983). Selective disturbance of movement vision after bilateral brain damage. Brain 106(2): 313–340.

Motivation

Motivation is a modulating and coordinating influence on the direction, vigor, and composition of behavior.

as a valuable goal object. Had he grasped her flanks with his forepaws during the first encounter, the female would have kicked him away, but when he delivers similar sensory stimulation during the second encounter, she responds by adopting a posture that allows him to mount her and mate. During the first encounter, the female’s gait is similar to that of the male, whereas during the second encounter, her gait includes a distinctive combination of darts and hops followed by pauses that may be accompanied by vigorous ear wiggling. Finally, her prodding of the male, distinctive motor patterns, and postural response vary in intensity as a function of her hormonal status and recent experience. Such coordinated changes in the evaluation of goal objects (Shizgal forthcoming), the impact of external stimuli, the prepotency of sensorimotor units of action (Gallistel 1980), and the vigor of performance can be said to reflect a common motivational influence; the set of internal conditions responsible for this common influence can be said to constitute a motivational state.

The response of the male to the solicitation of the female illustrates the contribution of external as well as internal inputs to the genesis and maintenance of a motivational state (Bindra 1969). Such external inputs, called “incentive stimuli,” are important both to behavioral continuity and change. Positive feedback between incentive stimuli and motivational states tends to lock in commitment to a particular course of action. The more the male interacts with the female, the more he exposes himself to olfactory, tactile, and visual stimuli that increase the likelihood of further interaction. Thus his initial hesitancy gives way to vigorous pursuit. Moreover, sufficiently powerful incentive stimuli incompatible with a current objective can trigger an abrupt, self-reinforcing switch in the direction of the solicited behavior, as when the male rat is sidetracked from slaking his thirst by the intervention of the female.
This Motivational states not only modulate the stimulus con- influence arises from a wide variety of internal, environ- trol of behavior, they also act as intermediaries in its tempo- mental, and social sources and is manifested at many levels ral control by transducing internal and external signals of behavioral and neural organization. indicating the season and time of day into changes in the As an illustration of motivational influence, consider the likelihood of initiating different goal-directed activities. cyclical changes in sexual interest and receptivity shown by a Timing signals make it possible for behavior to anticipate female rat (McClintock 1984). Two days following the onset physiological need states, thus lessening the risk that sup- of her last period of sexual receptivity, she encounters one of plies will be depleted and that the physiological imbalance the male rats with which she shares a communal burrow. The will compromise the capacity of the animal to procure addi- rats sniff each other indifferently and continue on their sepa- tional resources. For example, migrating birds eat vora- rate ways. Two days later, their paths cross again. Levels of ciously and greatly increase their body weight prior to their gonadal hormones in the female’s blood (see NEUROENDO- departure. Nonetheless, anticipatory intake may prove CRINOLOGY) have increased markedly since her last encoun- insufficient to meet later needs or may exceed subsequent ter with the male, and she now responds in a strikingly expenditures. Thus the contribution of motivation to the reg- different fashion, approaching, nuzzling, and crawling over ulation of the internal environment depends both on signals him. Turning away, she runs a short distance and stops. The that predict future physiological states and on signals that male follows hesitantly, but then spies a puddle and stops to reflect current ones (Fitzsimons 1979). drink, appearing to lose interest in the female. 
Undeterred, To illustrate the regulatory challenges addressed by the the female returns and repeats the pattern of approach and motivational modulation of behavior, let us revisit the contact followed by turning, running away, and stopping. female rat as she begins to wean her litter, about six weeks She soon succeeds in attracting and holding the attention of after she has mated successfully. The total weight of her the male, and he follows her on a tortuous high-speed chase. pups now exceeds her own. First via her bloodstream and The female then halts abruptly, and coitus ensues. then via her milk, she has succeeded in providing the neces- During the first encounter, the female treats the male as a sary resources without seriously compromising her own via- neutral stimulus, whereas during the second, she treats him bility. Accomplishing this feat has required dramatic Motivation 567 Figure 1. The reproductive behavior of rats illustrates multiple facets of motivational influence (from McClintock 1987). Changes in hormonal status and experience alter the evaluation and selection of goal objects, the impact of external stimuli, the vigor of goal- directed behavior, and the prepotency of sensorimotor units of action. alteration in her intake patterns. For example, her caloric sexual partner, and each of the constituent acts may be var- intake and calcium consumption during lactation will have ied in intensity and duration or omitted entirely, depending reached 2–3 times postweaning levels, reflecting both hor- on the response of the male. Following removal of the CERE- monally driven anticipatory changes and feedback from the BRAL CORTEX, components of the behavior survive, but their physiological consequences of increased expenditures (Mil- patterning is disrupted and is no longer well coordinated lelire and Woodside 1989). with the behavior of the male. 
The motivational modulation of preferences can extend At higher levels of behavioral and neural organization, beyond the point of neutrality, rendering previously repul- motivational states interact with cognitive processes in sive stimuli attractive and vice versa (Cabanac 1971). Prior influencing behavior. For example, the information direct- to her first pregnancy, a female rat will treat a rat pup as an ing ongoing behavior may be drawn from COGNITIVE MAPS aversive stimulus, positioning herself as far away from it as of the environment rather than from current sensory input possible when placed together with the pup in an enclosure. (Gallistel 1990; Marlow and Tollestrup 1982; see ANIMAL In contrast, when she is in the maternal state, the female will NAVIGATION). Another point of contact between motivation actively retrieve pups, even if they are not her own (Fleming and cognition is the control of ATTENTION (Simon 1993). 1986). Changes in motivational state alter the likelihood that a Changes in motivational state are expressed at many lev- stimulus will attract attentional resources, and directing els of behavioral and neural organization. For example, the these resources at a stimulus can boost its incentive effects posture adopted by a receptive female rat during copulation (Shizgal 1996). By gating input to WORKING MEMORY, reflects the highly stereotyped operation of a spinal reflex attention can restrict the set from which goals are selected (Pfaff 1982). Provided the female is sexually receptive, the and control the access of goal-related information to the reflex can be triggered in response to pressure on the flanks processes involved in PLANNING. regardless of whether the stimulus is applied by the fore- In the examples provided above, the objects of evalua- paws of the male rat or the hand of a human. Although facil- tion and goal selection are physical resources and activi- itation from brain stem neurons is necessary for execution ties. 
In humans, and perhaps in other animals as well, of the reflex (Pfaff 1982), the integrity of the cerebral cor- abstractions can serve as goals and as the objects of evalu- tex is not (Beach 1944). In contrast, the organization of the ation (see MOTIVATION AND CULTURE). For example, our individual components of solicitation into the pattern of ap- evaluations of ourselves have profound motivational conse- proach, contact, withdrawal, and pausing is context- quences (Higgins, Strauman, and Klein 1986), and our sensitive, flexible (McClintock 1984), and dependent on objectives may be defined with respect to current and pro- cortical integrity (Beach 1944). The solicitation behavior of jected self-concepts (Cantor and Fleeson 1993). Nonethe- the intact female is directed preferentially at an appropriate less, the psychological and neural foundations for such 568 Motivation and Culture abstract expressions of motivational influence may have ogy of Reproductive Behavior: An Evolutionary Perspective. Englewood Cliffs, NJ: Prentice Hall. much in common with mechanisms, perhaps highly con- Millelire, L., and B. Woodside. (1989). Factors influencing the served across animal species, that modulate pursuit of con- self-selection of calcium in lactating rats. Physiology and crete biological goals. Behavior 46: 429–439. Motivational influences are incorporated in some artifi- Pfaff, D. W. (1982). Neurobiological mechanisms of sexual moti- cial intelligence models. For example, such signals provide vation. In D. W. Pfaff, Ed., The Physiological Mechanisms of contextual information in an important model of REINFORCE- Motivation. New York: Springer, pp. 287–317. MENT LEARNING (Barto 1995), although the manner in which Shizgal, P. (1996). The Janus faces of addiction. Behavioral and motivational signals are processed to modulate the impact of Brain Sciences 19(4): 595–596. rewards and to guide action tends to be left unspecified in Shizgal, P. (1997). 
Neural basis of utility estimation. Current such models. A hierarchical account of motor control (Gal- Opinion in Neurobiology 7(2): 198–208. Shizgal, P. (Forthcoming). On the neural computation of util- listel 1980) and recent modeling (Shizgal 1997, forthcom- ity: Implications from studies of brain stimulation reward. ing) of the neural and computational processes underlying In D. Kahneman, E. Diener, and N. Schwarz, Eds., Founda- goal evaluation and selection (see DECISION MAKING and tions of Hedonic Psychology: Scientific Perspectives on UTILITY THEORY) represent early steps toward formal Enjoyment and Suffering. New York: Russell Sage Founda- description of the motivational influence on behavior. tion. See also COMPARATIVE PSYCHOLOGY; RATIONAL AGENCY Simon, H. A. (1993). The bottleneck of attention: Connecting thought with motivation. In W. D. Spaulding, Ed., Integrative —Peter Shizgal Views of Motivation, Cognition and Emotion. Lincoln: Univer- sity of Nebraska Press, pp. 1–21. References Further Readings Barto, A. G. (1995). Adaptive critics and the basal ganglia. In J. C. Bindra, D. (1976). A Theory of Intelligent Behaviour. New York: Houk, J. L. Davis, and D. G. Beiser, Eds., Models of Informa- Wiley. tion Processing in the Basal Ganglia. Cambridge, MA: MIT Bolles, R. C. (1975). Theory of Motivation. New York: Harper and Press, pp. 215–232. Row. Beach, F. A. (1944). Effects of injury to the cerebral cortex upon Dienstbier, R., and W. D. Spaulding, Eds. (1993). Nebraska Sym- sexually receptive behavior in the female rat. Psychosomatic posium on Motivation. Vol. 41, Integrative Views of Motivation, Medicine 6: 40–55. Cognition and Emotion. Lincoln: University of Nebraska Press. Bindra, D. (1969). A unified interpretation of emotion and motiva- Mook, D. G. (1996). Motivation. 2nd ed. New York: Norton. tion. Annals of the New York Academy of Sciences 159: 1071– Pfaff, D. W., Ed. (1982). The Physiological Mechanisms of Moti- 1083. vation. New York: Springer. 
Cabanac, M. (1971). Physiological role of pleasure. Science 173: Satinoff, E., and P. Teitelbaum, Eds. (1983). Handbook of Behav- 1103–1107. ioral Neurobiology, vol. 6, Motivation. New York: Plenum Cantor, N., and W. Fleeson. (1993). Social intelligence and intel- Press. ligent goal pursuit: A cognitive slice of motivation. In W. D. Sorrentino, R. M., and E. T. Higgins, Eds. (1986). Handbook of Spaulding, Ed., Integrative Views of Motivation, Cognition Motivation and Cognition. 3 vols. New York: Guilford Press. and Emotion. Lincoln: University of Nebraska Press, pp. 125– Toates, F. (1986). Motivational Systems. Cambridge: Cambridge 179. University Press. Fitzsimons, J. T. (1979). The Physiology of Thirst and Sodium Appetite, vol. 35. Cambridge: Cambridge University Press. Fleming, A. (1986). Psychobiology of rat maternal behavior: How Motivation and Culture and where hormones act to promote maternal behavior at partu- rition. Annals of the New York Academy of Sciences 474: 234– 251. Studies of motivation try to explain the initiation, persis- Gallistel, C. R. (1980). The Organization of Action: A New Synthe- tence, and intensity of behavior (Geen 1995; see also MOTI- sis. Hillsdale, NJ: Erlbaum. VATION). Culture, learned schemas shared by some people Gallistel, C. R. (1990). The Organization of Learning. Cambridge, due to common, humanly mediated experiences, as well as MA: MIT Press. the practices and objects creating and created by these sche- Higgins, E. T., T. Strauman, and R. Klein. (1986). Standards and mas, plays a large role in nearly all human behavior. Even the process of self-evaluation: Multiple affects from multiple such biologically adaptive motivations as hunger and sex stages. In R. M. Sorrentino and E. T. Higgins, Eds., Handbook instigate somewhat different behaviors in different societies, of Motivation and Cognition. New York: Guilford Press, pp. depending on learned schemas for desirable objects, appro- 23–63. Marlow, R. W., and K. Tollestrup. 
(1982). Mining and exploitation priate and effective ways to obtain these, and skills for of natural deposits by the desert tortoise, Gopherus agassizii. doing so (Mook 1987). Animal Behaviour 30: 475–478. The motivational effects of culturally variable beliefs can McClintock, M. K. (1984). Group mating in the domestic rat as a be illustrated by considering causal attribution processes. context for sexual selection: Consequences for the analysis of Weiner (1991) argues that we are unlikely to persist at a vol- sexual behavior and neuroendocrine responses. In J. Rosenb- untary behavior if we have failed in the past and we attribute latt, C. Beer, and R. Hinde, Eds., Advances in the Study of that failure to an unchanging and uncontrollable aspect of Behavior. New York: Academic Press, pp. 1–50. ourselves or the situation. Some studies show that people in McClintock, M. K. (1987). A functional approach to the behav- Japan tend to attribute poor academic performance to insuf- ioral endocrinology of rodents. In D. Crews, Ed., Psychobiol- Motivation and Culture 569 typical in one’s social group, making other behaviors less ficient effort, while people in the United States give greater available for consideration, likely to provoke disapproval, or weight than do their Japanese counterparts to lack of ability inconvenient. Examples are body language, table manners (Markus and Kitayama 1991; Weiner 1991). Given these and food choices, house design, mode of dress, occupations, assumptions, U.S. schoolchildren who receive poor grades and forms of worship. When action follows the patterns should thereafter put less effort into their schoolwork, while learned from repeated observation of the typical behavior of Japanese schoolchildren who receive poor grades should other people like oneself, as well as social facilitation of increase their effort. 
certain ways of acting over others, it could be said to draw Is there a fixed, limited number of universal basic on routine motivation (Strauss and Quinn 1997). In many motives, which vary cross-culturally only in their strength? cases, routine motivation is acquired nonverbally, is inter- Or is cross-cultural variation qualitative as well as quantita- nalized as implicit schemas, and is not strongly affectively tive, making it impossible to delimit universally applicable charged or linked to self-conceptions. Particularly important basic motives? McClelland (1985; Weinberger and McClel- routine motivations (e.g., schemas for being a good parent land 1990) has argued for the first position. He has found or reliable breadwinner), however, may be internalized with cross-societal as well as intrasocietal differences in the aver- an explicit verbal component and linked to emotions (e.g., age levels of such basic motives as achievement and affilia- fear or pride) and self-conceptions, depending on how they tion, and he posits that human as well as other animal were learned. behavior is motivated by a limited set of stable, “implicit The different forms of motivation, and various ways in motives” such as these, which draw on the “natural incen- which these are learned, highlight the fact that a culture is tive” of neurohormone release. Cantor and her colleagues not a single thing. In particular, cultures cannot be thought (1986), by contrast, focus on idiosyncratically variable self- of as master programmers loading up instructions that deter- concepts. These are conscious, change over time, and mine people’s behavior. In every society various, not always include a variety of understandings and images, positive and consistent, values are proclaimed explicitly. 
Some of these negative roles and behaviors in the past and present as well values are the basis for motivations that socializers try to as various future “possible selves,” namely, “those selves teach children, others are ignored, remaining “cultural cli- that individuals could become, would like to become, or are chés” (Spiro 1987; see also Strauss 1992; and Strauss and afraid of becoming” (p. 99). They illustrate the possible Quinn 1997). Finally, in addition to motivations that are variability among such self-conceptions with the example of deliberately instilled, there are needs and expectations students preparing for a final examination. One, fearing derived from preverbal parent-child interactions (McClel- exposure as a fraud, parties the night before the exam so that land 1985, Weinberger and McClelland 1990, see also Paul no one will attribute her failure to lack of ability. Another, 1990), as well as ongoing observations of the normal way of having a feared “careless failure” possible self, studies very acting in one’s social group. hard. The enormous variability among such self- conceptions, even within a single society, suggests the See also DECISION MAKING; RATIONAL CHOICE THEORY; potential for limitless cross-cultural variation. Markus and SELF Kitayama (1991), on the other hand, while continuing to —Claudia Strauss link motivation to self-conceptions, posit a general distinc- tion between societies with conceptions of self as indepen- References dent of others and societies with conceptions of self as interdependent with others (see also Miller 1997). Cantor, N., H. Markus, P. Niedenthal, and P. Nurius. (1986). On D’Andrade (1992) likewise advocates the infinite variability motivation and the self-concept. In R. M. Sorrentino and E. T. Higgins, Eds., Handbook of Motivation and Cognition: Foun- position. Discussing the potential for a wide variety of sche- dations of Social Behavior. New York: Guilford Press, pp. 
96– mas (not just self or conscious schemas) to function as 121. goals, he offers as a classic example from the ethnographic D’Andrade, R. G. (1992). Schemas and motivation. In R. G. literature the intensity of most Nuers’s interest in cattle D’Andrade and C. Strauss, Eds., Human Motives and Cultural (Evans-Pritchard 1947). Models. Cambridge: Cambridge University Press, pp. 23–44. McClelland proposes (Weinberger and McClelland 1990) Evans-Pritchard, E. E. (1947). The Nuer: A Description of the that the differences between his approach and that of Cantor Modes of Livelihood and Political Institutions of a Nilotic Peo- et al. (1986) can be resolved by treating them as describing ple. Oxford: Clarendon Press. two sorts of motivation. He provides evidence that the Geen, R. G. (1995). Human Motivation: A Social Psychological implicit motives he discusses are derived largely from pre- Approach. Pacific Grove, CA: Brooks/Cole. Markus, H. R., and S. Kitayama. (1991). Culture and the self: verbal “affective experiences” (such as early parent-child Implications for cognition, emotion, and motivation. Psycho- interaction in feeding, elimination control, and so on) and logical Review 98: 224–253. explain behavior over the long term and in less structured sit- McClelland, D. C. (1985). Human Motivation. Glenview, IL: Scott, uations. In contrast, the explicit self-conceptions discussed Foresman. by Cantor et al. are acquired with the mediation of language Miller, J. (1997). Cultural conceptions of duty: Implications for and explain choices in structured tasks, especially ones that motivation and morality. In D. Munro, J. E. Schumaker, and S. make self-conceptions salient. C. Carr, Eds., Motivation and Culture. New York: Routledge, This categorization of kinds of motivation could be pp. 178–192. expanded. Neither McClelland’s nor Cantor et al.’s model Mook, D. G. (1987). Motivation: The Organization of Action. 
New accounts for the sort of behavior that is enacted because it is York: Norton. 570 Motor Control tional wisdom is that proprioception provides information Paul, R. (1990). What does anybody want? Desire, purpose, and the acting subject in the study of culture. Cultural Anthropol- about arm configuration to be used in the programming of ogy 5: 431–451. the arm’s trajectory. However there is experimental evi- Spiro, M. E. (1987). Collective representations and mental repre- dence indicating that information about the initial position sentations in religious symbol systems. In B. Kilborne and L. L. of the limb derives from a number of sources, including the Langness, Eds., Culture and Human Nature: Theoretical visual afferences (Ghez, Gordon, and Ghilardi 1993). Papers of Melford E. Spiro. Chicago: University of Chicago The current view on the formation of arm trajectories is Press, pp. 161–184. that the CNS formulates the appropriate command for the Strauss, C. (1992). What makes Tony run? Schemas as motives desired trajectory on the basis of knowledge about the initial reconsidered. In R. G. D’Andrade and C. Strauss, Eds., Human arm position and the target’s location. Recent psychophysi- Motives and Cultural Models. Cambridge: Cambridge Univer- cal evidence supports the hypothesis that the planning of sity Press, pp. 197–224. Strauss, C., and N. Quinn. (1997). A Cognitive Theory of Cultural limbs’ movements constitutes an early and separate stage of Meaning. Cambridge: Cambridge University Press. information processing. According to this view, during Weinberger, J., and D. C. McClelland. (1990). Cognitive versus planning the brain is mainly concerned with establishing traditional motivational models: Irreconcilable or complemen- movement kinematics, a sequence of positions that the hand tary? In E. T. Higgins and R. M. Sorrentino, Eds., Handbook of is expected to occupy at different times within the extraper- Motivation and Cognition. Vol. 
2, Foundations of Social Behav- sonal space. Later, during execution, the dynamics of the ior. New York: Guilford Press, pp. 562–597. musculoskeletal system are controlled in such a way as to Weiner, B. (1991). On perceiving the other as responsible. In R. A. enforce the plan of movement within different environmen- Dienstbier, Ed., Nebraska Symposium on Motivation, 1990. tal conditions. Lincoln: University of Nebraska Press, pp. 165–198. There is evidence indicating that the planning of arm tra- Further Readings jectories is specified by the CNS in extrinsic coordinates. The analysis of arm movements has revealed kinematic Holland, D., and N. Quinn. (1987). Cultural Models in Language invariances (Abend, Bizzi, and Morasso, 1982; Morasso and Thought. Cambridge: Cambridge University Press. 1981). Remarkably, these simple and invariant features were detected only when the hand motion was described with Motor Control respect to a fixed Cartesian reference frame, a fact suggest- ing that CNS planning takes place in terms of the hand’s To specify a plan of action, the central nervous system motion in space (Flash and Hogan 1985). Even complex (CNS) must first transfer sensory inputs into motor goals curved movements performed by human subjects in an such as the direction, amplitude, and velocity of the obstacle-avoidance task displayed invariances in the hand’s intended movement. Then, to execute movements, the CNS motion and not in joint motion (Abend et al. 1982). The data must convert these desired goals into signals controlling the derived from straight and curved movements indicate that muscles that are active during the execution of even the sim- the kinematic invariances could be derived from a single plest kind of limb trajectory. Thus, the CNS must transform organizing principle based on optimizing endpoint smooth- information about a small number of variables (direction, ness (Flash and Hogan 1985). 
It follows that if actions are amplitude, and velocity) into a large number of signals to planned in spatial or extrinsic coordinates, then for the exe- many muscles. Any transformation of this type is “ill- cution of movement, the CNS must convert the desired posed” in the sense that an exact solution may be either not direction and velocity of the limb into signals that control available or not unique. How the nervous system computes muscles. these transformations has been the focus of recent studies. Investigators of motor control have been well aware of Specifically, to plan an arm trajectory toward an object, the computational complexities involved in the production the CNS first must locate the position of the object with of muscle forces. A variety of proposals have been made to respect to the body and represent the initial position of the explain these complexities. In theory, in a multijoint limb, arm. Recordings from single neurons in the parietal cortex the problem of generation forces may be addressed only and superior colliculus in awake monkeys have signifi- after the trajectory of the joint angles has been derived from cantly contributed to our understanding of how space is the trajectory of the endpoint—that is, after an inverse kine- represented. There is some evidence that in the parietal cor- matics problem has been solved. Investigations in robot tical areas there are retinotopic neurons whose activity is control in the late 1970s and early 1980s have shown that tuned by signals derived from somatosensory sources. both the inverse kinematic and inverse dynamic problems Their visual receptive field is modified by signals repre- may be efficiently implemented in a digital computer for senting both eye and head position. This result suggests many robot geometries. On the basis of these studies, inves- that parietal area 7a contains a representation of space in tigators have argued that the brain may be carrying out body-centered space. 
Neurons representing object location in body-independent (allocentric) coordinates have also been found in the parietal cortex and in the HIPPOCAMPUS (Andersen et al. 1993).

To specify the limb’s trajectory toward a target, the CNS must locate not only the position of an object with respect to the body but also the initial position of the arm. The conven-

inverse kinematic and dynamic computations when moving the arm in a purposeful way.

One way to compute inverse dynamics is based on carrying out explicitly the algebraic operations after representing variables such as positions, velocity, acceleration, torque, and inertia. This hypothesis, however, is unsatisfactory because there is no allowance for the inevitable mechanical vagaries associated with any interaction with the environment.

Alternative proposals have been made that do not depend on the solution of the complicated inverse-dynamic problem. Specifically, it has been proposed that the CNS may transform the desired hand motion into a series of equilibrium positions (Bizzi et al. 1984). The forces needed to track the equilibrium trajectory result from the intrinsic elastic properties of the muscles (Feldman 1974).

According to the equilibrium-point hypothesis, as first proposed by Feldman, limb movements result from a shift in the neurally specified equilibrium point. Studies of single and multijoint movements have provided experimental evidence that supports the equilibrium-point hypothesis (Bizzi et al. 1984). The equilibrium-point hypothesis has implications both for the control and for the computation of movements. With respect to control, the elastic properties of the muscles provide instantaneous correcting forces when a limb is moved away from the intended trajectory by some external perturbation. With respect to computation, the same elastic properties offer the brain an opportunity to deal with the inverse-dynamics problem. Once the brain has achieved the ability to represent and control equilibrium postures, it can master movements as temporal sequences of such postures. In this context, a representation in the CNS of the inertial, viscous, and gravitational parameters contained in the equations of motion is no longer necessary.

Recently, a set of experiments performed in frogs with spinal cords that were surgically disconnected from the brain stem has provided neurophysiological support for the equilibrium-point hypothesis. Microstimulation of the spinal cord demonstrated that this region is organized to produce the neural synergies necessary for the expression of equilibrium points. These experiments have indicated that the spinal cord contains circuitry that, when activated, produces precisely balanced contractions in groups of muscles. These synergistic contractions generate forces that direct the limb toward an equilibrium point in space (Bizzi, Mussa-Ivaldi, and Giszter 1991).

Experimental evidence also indicates that microstimulation of the lumbar gray results in a limited number of force patterns. More importantly, the simultaneous stimulation of two sites, each generating a force field, results in a force field proportional to the vector sum of the two fields (Mussa-Ivaldi, Giszter, and Bizzi 1994). Vector summation of force fields implies that the complex nonlinearities that characterize the interactions both among neurons and between neurons and muscles are in some way eliminated. This result has led to a novel hypothesis for explaining movement and posture based on combinations of a few basic elements. The limited-force pattern may be viewed as representing an elementary alphabet from which, through superimposition, a vast number of movements could be fashioned by impulses conveyed by supraspinal pathways. With mathematical modeling, experimenters have verified that this novel view of the generation of movement and posture has the competence required for controlling a wide repertoire of motor behaviors.

The hypothesis that the premotor zones in the spinal gray may be the structures underlying the transformation from extrinsic to intrinsic coordinates is consistent with the results obtained by other groups of investigators, who have demonstrated the existence of a few separate circuits for controlling horizontal and vertical head movements in the owl. These structures, which are located in the brain stem, receive inputs from the tectum and transform the tectal movement vectors into the neck motor-neural activation.

See also MANIPULATION AND GRASPING; MOTION, PERCEPTION OF; MOTOR LEARNING; ROBOTICS AND LEARNING; SINGLE-NEURON RECORDING; WALKING AND RUNNING MACHINES

—Emilio Bizzi

References

Abend, W., E. Bizzi, and P. Morasso. (1982). Human arm trajectory formation. Brain 105: 331–348.
Andersen, R. A., L. H. Snyder, C.-S. Li, and B. Stricanne. (1993). Coordinate transformations in the representation of spatial information. Current Opinion in Neurobiology 3: 171–176.
Bizzi, E., N. Accornero, W. Chapple, and N. Hogan. (1984). Posture control and trajectory formation during arm movement. Journal of Neuroscience 4: 2738–2744.
Bizzi, E., F. A. Mussa-Ivaldi, and S. Giszter. (1991). Computations underlying the execution of movement: A biological perspective. Science 253: 287–291.
Feldman, A. G. (1974). Change of muscle length due to shift of the equilibrium point of the muscle-load system. Biofizika 19: 534–538.
Flash, T., and N. Hogan. (1985). The coordination of arm movements: An experimentally confirmed mathematical model. Journal of Neuroscience 5: 1688–1703.
Ghez, C., J. Gordon, and M. F. Ghilardi. (1993). Programming of extent and direction in human reaching movements. Biomedical Research 14 (Suppl 1): 1–5.
Masino, T., and E. I. Knudsen. (1990). Horizontal and vertical components of head movement are controlled by distinct neural circuits in the barn owl. Nature 345: 434–437.
Morasso, P. (1981). Spatial control of arm movements. Experimental Brain Research 42: 223–227.
Mussa-Ivaldi, F. A., S. F. Giszter, and E. Bizzi. (1994). Linear combinations of primitives in vertebrate motor control. Proceedings of the National Academy of Sciences 91: 7534–7538.

Motor Learning

Humans are capable of an impressive repertoire of motor skills that range from simple movements, such as looking at an object of interest by turning the head and eyes, to complex and intricate series of movements, such as playing a violin or executing a triple somersault from a balance beam. Most movements are not performed perfectly the first time around, but instead require extensive periods of practice. During practice, we detect errors in motor performance and then modify subsequent movements to reduce or eliminate those errors. The iterative process of improving motor performance by executing movements, identifying errors, and correcting those errors in subsequent movements is called motor learning.

Motor learning occurs in behaviors that range in complexity from simple reflexive movements to highly developed skills. The simplest form of motor learning is adaptation, in which muscular force generation changes to compensate for altered mechanical loads or sensory inputs. Adaptation can involve movements across either a single joint or multiple joints, and can occur in both reflexive and voluntary movements. The best understood example of this type of motor learning is in the vestibulo-ocular reflex, in which EYE MOVEMENTS normally compensate for motion of the head such that images remain stable on the RETINA during head movements. If subjects experience persistent image motion during head movements (e.g., after vestibular trauma or when wearing new corrective lenses), motor learning produces adaptive increases or decreases in compensatory eye movements that restore image stability during head movements.

Motor learning is not a unitary phenomenon, but can affect many different components of sensory and motor processing. MOTOR CONTROL involves both simple movement trajectories and complex series of movements in which multiple muscles and joints must be controlled in precise temporal sequence. Motor learning refines simple movements by altering the magnitude and timing of muscular force generation. For complex movements, motor learning is required to select and coordinate the appropriate muscular contractions, to link together motor subroutines, and to create new motor synergies by combining forces generated across multiple joints in novel spatial and temporal patterns.

Sensory processing is intimately linked with motor learning. Sensory information about the outcome of a movement is used to detect and evaluate errors in motor performance. The nature of the sensory information used can vary and depends on the movement being learned. For example, when one is learning to hit a tennis ball, vision provides the most salient information about the accuracy of the shot, but somatosensory information about the angles of the elbow and wrist and the feel of the ball against the racket also provide important cues. A violin player evaluates his or her performance with the auditory system, by listening for mistakes, and also by monitoring the pressure of strings against fingers. As subjects attempt new movements and refine existing skills, they develop expectations of the sensory consequences of their movements. During motor learning, the expected sensory outcomes of a movement are compared with the actual outcomes, and the difference between what was expected and what actually occurred is used to drive changes in subsequent movements.

The sensory inputs used to detect and correct errors in motor performance can change with practice. At the initial stages of motor learning, a subject may attend to a variety of sensory stimuli, but as learning proceeds, attention becomes restricted to salient sensory stimuli until eventually, as the movement becomes perfected, the reliance on sensory cues can disappear altogether.

The memories formed during motor learning are not accessible to conscious recall, but instead are expressed in the context of motor performance. This type of subconscious recollection of gradually learned skills is called “procedural” (or “implicit”) memory and is also a feature of the expression and formation of mental habits. In contrast, the memory of facts and events, which can be learned in a single trial and are subject to conscious recall, is termed “declarative” (or “explicit”) memory (see IMPLICIT VS. EXPLICIT MEMORY and MEMORY). The distinction between procedural and declarative memory was prompted by studies of patients with amnesia caused by dysfunction of the part of the cerebral cortex called the medial temporal lobe. Despite a profound inability to remember the training sessions and other events and facts, amnesiac patients could be trained to learn new motor skills or to improve existing skills with practice. This finding indicates that procedural and declarative memory involve distinct brain areas and mechanisms.

Our knowledge of the brain regions involved in motor learning and memory derives from clinical studies of patients who have neurological diseases, stroke, or other localized brain dysfunction, from brain imaging studies in humans, and from neurophysiological recordings in animal models. A number of distinct brain regions are involved in motor learning. The CEREBELLUM is required for adaptation, for conditioning, and for the learning and coordination of movements that involve multiple joints and muscles. The BASAL GANGLIA are involved in learning sequences of movements, and are also critical for habit formation. Although the studies of amnesiac patients indicate that the medial temporal lobes are not required for motor learning, other cortical regions are clearly involved in the learning of motor skills and the associations of sensory cues with appropriate motor programs. These include primary motor cortex, somatosensory cortex, prefrontal cortex, and the supplementary motor areas. As learning proceeds and motor memories become consolidated, the relative contributions of neuronal activity in the various brain regions involved in motor learning can vary. The precise roles of distinct brain areas in motor learning and the neural mechanisms that underlie the acquisition and retention of motor skills are areas of active investigation in neuroscience.

See also AGING, MEMORY, AND THE BRAIN; LEARNING

—Sascha du Lac

References and Further Readings

du Lac, S., J. L. Raymond, T. J. Sejnowski, and S. G. Lisberger. (1995). Learning and memory in the vestibulo-ocular reflex. Annual Review of Neuroscience 18: 409–441.
Halsband, U., and H. J. Freund. (1993). Motor learning. Current Opinion in Neurobiology 3(6): 940–949.
Hikosaka, O. (1996). MRI in sequence learning. J. Neurophysiol. 76(1): 617–621.
Knowlton, B. J., J. Mangels, and L. R. Squire. (1996). A neostriatal habit learning system in humans. Science 273(5280): 1399–1402.
Pascual-Leone, A., J. Grafman, and M. Hallett. (1995). Procedural learning and prefrontal cortex. Annals of the New York Academy of Sciences 769: 61–70.
Salmon, D. P., and N. Butters. (1995). Neurobiology of skill and habit learning. Current Opinion in Neurobiology 5(2): 184–190.
Shadmehr, R., and H. H. Holcomb. (1997). Neural correlates of motor memory consolidation. Science 277(5327): 821–825.
Thach, W. T., H. P. Goodkin, and J. G. Keating. (1992). The cerebellum and the adaptive coordination of movement. Annual Review of Neuroscience 15: 403–442.
Ungerleider, L. G. (1995). Functional brain imaging studies of cortical mechanisms for memory. Science 270(5237): 769–775.

MRI

See MAGNETIC RESONANCE IMAGING

Multiagent Systems

Multiagent systems are distributed computer systems in which the designers ascribe to component modules autonomy, mental state, and other characteristics of agency. Software developers have applied multiagent systems to solve problems in power management, transportation scheduling, and a variety of other tasks. With the growth of the Internet and networked information systems generally, separately designed and constructed programs increasingly need to interact substantively; such complexes also constitute multiagent systems.

In the study of multiagent systems, including the field of “distributed AI” (Bond and Gasser 1988) and much of the current activity in “software agents” (Huhns and Singh 1997), researchers aim to relate aggregate behavior of the composite system with individual behaviors of the component agents and properties of the interaction protocol and environment. Frameworks for constructing and analyzing multiagent systems often draw on metaphors—as well as models and theories—from the social and ecological sciences (Huberman 1988). Such social conceptions are sometimes applied within an agent to describe its behaviors in terms of interacting subagents, as in Minsky’s society of mind theory (Minsky 1986).

Design of a distributed system typically focuses on the interaction mechanism—specification of agent communication languages and interaction protocols. The interaction mechanism generally includes means to implement decisions or agreements reached as a function of the agents’ interactions. Depending on the context, developers of a distributed system may also control the configuration of participating agents, the INTELLIGENT AGENT ARCHITECTURE, or even the implementation of agents themselves. In any case, principled design of the interaction mechanism requires some model of how agents behave within the mechanism, and design of agents requires a model of the mechanism rules, and (sometimes) models of the other agents.

One fundamental characteristic that bears on design of interaction mechanisms is whether the agents are presumed to be cooperative, which in the technical sense used here means that they have the same objectives (they may have heterogeneous capabilities, and may also differ on beliefs and other agent attitudes). In a cooperative setting, the role of the mechanism is to coordinate local decisions and disseminate local information in order to promote these global objectives. At one extreme, the mechanism could attempt to centralize the system by directing each agent to transmit its local state to a central source, which then treats its problem as a single-agent decision. This approach may be infeasible or expensive, due to the difficulty of aggregating belief states, increased complexity of scale, and the costs and delays of communication. Solving the problem in a decentralized manner, in contrast, forces the designer to deal directly with issues of reconciling inconsistent beliefs and accommodating local decisions made on the basis of partial, conflicting information (Durfee, Lesser, and Corkill 1992).

Even among cooperative agents, negotiation is often necessary to reach joint decisions. Through a negotiation process, for example, agents can convey the relevant information about their local knowledge and capabilities necessary to determine a principled allocation of resources or tasks among them. In the contract net protocol and its variants, agents submit “bids” describing their abilities to perform particular tasks, and a designated contract manager assigns tasks to agents based on these bids. When tasks are not easily decomposable, protocols for managing shared information in global memory are required. Systems based on a blackboard architecture use this global memory both to direct coordinated actions of the agents and to share intermediate results relevant to multiple tasks.

In a noncooperative setting, objectives as well as beliefs and capabilities vary across agents. Noncooperative systems are the norm when agents represent the interests of disparate humans or human organizations. Note that having distinct objectives does not necessarily mean that the agents are adversarial or even averse to cooperation. It merely means that agents cooperate exactly when they determine that it is in their individual interests to do so.

The standard assumption for noncooperative multiagent systems is that agents behave according to principles of RATIONAL DECISION MAKING. That is, each agent acts to further its individual objectives (typically characterized in terms of UTILITY THEORY), subject to its beliefs and capabilities. In this case, the problem of designing an interaction mechanism corresponds to the standard economic concept of mechanism design, and the mathematical tools of GAME THEORY apply. Much current work in multiagent systems is devoted to game-theoretic analyses of interaction mechanisms, and especially negotiation protocols applied within such mechanisms (Rosenschein and Zlotkin 1994). Economic concepts expressly drive the design of multiagent interaction mechanisms based on market price systems (Clearwater 1996).

Both cooperative and noncooperative agents may derive some benefit by reasoning expressly about the other agents. Cooperative agents may be able to propose more effective joint plans if they know the capabilities and intentions of the other agents. Noncooperative agents can improve their bargaining positions through awareness of the options and preferences of others (agents that exploit such bargaining power are called “strategic”; those that neglect to do so are “competitive”). Because direct knowledge of other agents may be difficult to come by, agents typically induce their models of others from observations (e.g., “plan recognition”), within an interaction or across repeated interactions.

See also AI AND EDUCATION; COGNITIVE ARTIFACTS; HUMAN-COMPUTER INTERACTION; RATIONAL AGENCY

—Michael P. Wellman

References

Bond, A. H., and L. Gasser, Eds. (1988). Readings in Distributed Artificial Intelligence. San Francisco: Kaufmann.
Clearwater, S. H., Ed. (1996). Market-Based Control: A Paradigm for Distributed Resource Allocation. Singapore: World Scientific.
Durfee, E. H., V. R. Lesser, and D. D. Corkill. (1992). Distributed problem solving. In Encyclopedia of Artificial Intelligence. 2nd ed. New York: Wiley.
Huberman, B. A., Ed. (1988). The Ecology of Computation. Amsterdam: Elsevier.
Huhns, M., and M. Singh, Eds. (1997). Readings in Agents. San Francisco: Kaufmann.
Minsky, M. (1986). The Society of Mind. New York: Simon and Schuster.
Rosenschein, J. S., and G. Zlotkin. (1994). Rules of Encounter: Designing Conventions for Automated Negotiation among Computers. Cambridge, MA: MIT Press.

Multisensory Integration

Because of its importance in forming an appropriate picture of the external world, the representation of sensory information has been a powerful driving force in EVOLUTION. Extant organisms possess an impressive array of specialized sensory systems that allow them to monitor simultaneously a host of environmental cues. This “parallel” processing of multiple cues not only increases the probability of detecting a given stimulus but, because the information carried along each sensory channel reflects a different feature of that stimulus, it also increases the likelihood of its accurate identification. For example, stimuli that are similar along one physical dimension (how they sound) might be identified on the basis of a second dimension (how they look). But if a coherent representation of the external world is to be constructed, and if the appropriate responses are to be generated, the brain must synthesize the information originating from these different sensory channels. One way in which such a multimodal representation is generated is by having information from different sensory systems converge on a common group of neurons.

During the evolution of sensory systems, mechanisms were preserved or elaborated so that the combined action of sensory systems would provide information not available within any single sensory channel. Indeed, in many circumstances, events are more readily perceived, have less ambiguity, and elicit a response far more rapidly when signaled by the coordinated action of multiple sensory modalities. Sensory systems have evolved to work in concert, and normally, different sensory cues that originate from the same event are concordant in both space and time. The products of this spatial and temporal coherence are synergistic intersensory interactions within the central nervous system (CNS), interactions that are presumed to enhance the salience of the initiating event. For example, seeing a speaker’s face makes the spoken message far easier to understand, especially in a noisy room (Sumby and Pollack 1954).

Similarly, discordant cues from different modalities can have powerful effects on perception, as illustrated by a host of interesting cross-modal illusions. One of the most compelling of these is the so-called McGurk effect, wherein a speaker lip-synchs the syllable “ga” in time with the sound “ba” (McGurk and MacDonald 1976). The perception is of neither “ga” nor “ba,” but a synthesis of the two, “da.” Similarly, in the “ventriloquism effect,” the sight of movement (i.e., the dummy’s head and lips) compels one to believe it is also the source of the sound.

Multisensory neurons, which receive input from more than a single sensory modality, are found in many areas of the CNS (see Stein and Meredith 1993 for a review). These neurons are involved in a number of circuits, and presumably in a variety of cognitive and behavioral functions. Thus, for example, multisensory neurons in neocortex are likely participants in the perceptual, mnemonic, and associative processes that serve to bind together the modality-specific components of a multisensory experience. Still other multisensory neurons, positioned at the sensorimotor interface, are known to mediate goal-directed orientation behavior. Such neurons, a high incidence of which are found in the superior colliculus (SC), have been the most extensively studied, and serve as the model for deciphering how multiple sensory cues are integrated at the level of the single neuron (see Stein and Meredith 1993 for review). Visual, auditory, and somatosensory inputs converge on individual neurons in the SC, where each of these modalities is represented in a common coordinate frame. As a result, the modality-specific receptive fields of an individual multisensory neuron represent similar regions of space.

An example of an SC neuron’s ability to integrate two different sensory inputs is illustrated in figure 1. When presented simultaneously and paired within their receptive fields, a visual and auditory stimulus result in a substantial response enhancement, well above the sum of the two individual responses (see A1V). Conversely, when the auditory stimulus is presented outside its receptive field, the neuron’s ability to generate a vigorous response to the visual stimulus is suppressed (see A2V). The timing of these stimuli is critical, and the magnitude of their interaction changes when the interval between the two stimuli is manipulated (Meredith and Stein 1986). However, this interval or “temporal window” generally is quite broad (e.g., several hundred milliseconds).

The multisensory interactions that are observable at the level of the single neuron are reflected in the animal’s behavior (Stein et al. 1989). Thus, its ability to detect and orient toward a visual stimulus is markedly enhanced when it is paired with a neutral auditory cue at the same position in space. However, if the auditory cue is spatially disparate from the visual, the response is strongly degraded.

Although SC neurons can respond to different sensory stimuli via inputs from a variety of structures, their ability to integrate multisensory information depends on projections from a specific region of neocortex (Wallace and Stein 1994). If these inputs from cortex are removed, SC neurons continue to respond to stimuli from different sensory modalities but fail to exhibit the synergistic interactions that characterize multisensory integration. At the behavioral level, animals can still orient normally to unimodal cues, but the benefit derived from combined cues is markedly diminished (Wilkinson, Meredith, and Stein 1996). This intimate relationship between cortex and SC suggests that the higher-level cognitive functions of the neocortex play a substantial role in controlling the information-processing capability of multisensory neurons in the SC, as well as the overt behaviors they mediate.
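
The contrast between response enhancement (the A1V condition) and response suppression (the A2V condition) can be made concrete with a small numerical sketch. The index used here, the percentage change of the combined response relative to the best unimodal response, is a convention commonly used for single-neuron data and is an assumption in this sketch, since the entry itself gives no formula. The function names and spike counts are hypothetical and are not data from figure 1.

```python
def interaction_index(combined, visual, auditory):
    """Percent change of the combined-stimulus response relative to the
    larger of the two unimodal responses (assumed convention, not from
    the entry). Positive values mean enhancement, negative values mean
    suppression."""
    best_unimodal = max(visual, auditory)
    if best_unimodal <= 0:
        raise ValueError("need a nonzero unimodal response to normalize by")
    return 100.0 * (combined - best_unimodal) / best_unimodal


def is_superadditive(combined, visual, auditory):
    """True when the combined response exceeds the sum of the two
    unimodal responses, as the entry describes for the A1V condition."""
    return combined > visual + auditory


# Hypothetical mean spike counts per trial (illustrative values only).
visual_alone = 4.0
auditory_in_field = 2.0       # A1: auditory stimulus inside its receptive field
auditory_out_of_field = 0.0   # A2: auditory stimulus outside its receptive field
combined_in_field = 12.0      # A1V
combined_out_of_field = 1.0   # A2V

enhancement = interaction_index(combined_in_field, visual_alone, auditory_in_field)
suppression = interaction_index(combined_out_of_field, visual_alone, auditory_out_of_field)

print(enhancement)   # 200.0: combined response is three times the best unimodal one
print(suppression)   # -75.0: visual response is reduced by the out-of-field sound
print(is_superadditive(combined_in_field, visual_alone, auditory_in_field))  # True
```

With these invented numbers the in-field pairing exceeds the sum of the unimodal responses (12 against 4 + 2), while the out-of-field pairing drives the response below the visual-alone level, which is the qualitative pattern the text attributes to the SC neuron.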
Figure 1. Multisensory integration in a visual-auditory SC neuron. The two receptive fields (RFs) of this neuron (dark gray shading shows the region of their overlap) are shown at the top. Icons depict stimuli: visual (V) is a moving bar of light, auditory is a broad-band noise burst from a speaker either within (A1), or outside (A2) the RF. Below, peristimulus time histograms and bar graphs (means) show responses to the visual stimulus alone (movement is represented by a ramp), the within-field auditory stimulus alone (square wave), and the stimulus combination. The summary bar graph shows that the large response enhancement is greater than the sum of A+V. The bottom panel illustrates the inhibition of the visual response when the auditory stimulus is outside its RF.

At present, comparatively little is known about the multisensory integrative properties of the cortical multisensory neurons presumed to be involved in various aspects of perception. However, they have been shown to share some of the features of SC neurons (Wallace, Meredith, and Stein 1992). Future studies detailing their response properties and associated circuitry should greatly aid in our understanding of how multisensory information is used in higher cognitive functions and, in doing so, reveal the neural basis of a fully integrated multisensory experience.

See also BINDING PROBLEM; CONSCIOUSNESS, NEUROBIOLOGY OF; MODULARITY OF MIND

—Barry E. Stein, Terrence R. Stanford, J. William Vaughan, and Mark T. Wallace

References

McGurk, H., and J. MacDonald. (1976). Hearing lips and seeing voices. Nature 264: 746–748.
Meredith, M. A., and B. E. Stein. (1986). Visual, auditory and somatosensory convergence on cells in superior colliculus results in multisensory integration. J. Neurophysiol. 56: 640–662.
Stein, B. E., and M. A. Meredith. (1993). The Merging of the Senses. Cambridge, MA: MIT Press.
Stein, B. E., M. A. Meredith, W. S. Huneycutt, and L. McDade. (1989). Behavioral indices of multisensory integration: orientation to visual cues is affected by auditory stimuli. J. Cogn. Neurosci. 1: 12–24.
Sumby, W. H., and I. Pollack. (1954). Visual contribution to speech intelligibility in noise. J. Acoust. Soc. Am. 26: 212–215.
Wallace, M. T., M. A. Meredith, and B. E. Stein. (1992). Integration of multiple sensory modalities in cat cortex. Exp. Brain Res. 91: 484–488.
Wallace, M. T., and B. E. Stein. (1994). Cross-modal synthesis in the midbrain depends on input from cortex. J. Neurophysiol. 71: 429–432.
Wilkinson, L. K., M. A. Meredith, and B. E. Stein. (1996). The role of anterior ectosylvian cortex in cross-modality orientation and approach behavior. Exp. Brain Res. 112: 1–10.

Further Readings

Cytowic, R. E. (1989). Synesthesia: A Union of the Senses. New York: Springer-Verlag.
Lewkowicz, D. J., and R. Lickliter. (1994). The Development of Intersensory Perception: Comparative Perspectives. Hillsdale, NJ: Erlbaum.
Stein, B. E., M. A. Meredith, and M. T. Wallace. (1994). Neural mechanisms mediating attention and orientation to multisensory cues. In M. Gazzaniga, Ed., The Cognitive Neurosciences. Cambridge, MA: MIT Press, pp. 683–702.
Walk, R. D., and L. H. Pick. (1981). Intersensory Perception and Sensory Integration. New York: Plenum Press.
Welch, R. B., and D. H. Warren. (1986). Intersensory interactions. In K. R. Boff, L. Kaufman, and J. P. Thomas, Eds., Handbook of Perception and Human Performance, vol. 1: Sensory Processes and Perception. New York: Wiley, pp. 25-1–25-36.

Naive Biology

See FOLK BIOLOGY

Naive Mathematics

Whether or not schooling is offered, children and adults all over the world develop an intuitive, naive mathematics. As long as number-relevant examples are part of their culture, people will learn to reason about and solve addition and subtraction problems with positive natural numbers. They also will rank order and compare continuous amounts, if they do not have to measure with equal units. The notion of equal units is hard, save for the cases of money and time. Universally, and without formal instruction, everyone can use money. Examples abound of child candy sellers, taxicab drivers, fishermen, carpenters, and so on developing fluent quantitative scripts, including one for proportional reasoning. Of note is that almost always these strategies use the natural numbers and nonformal notions of mathematical operations. For example, the favored proportions strategy for Brazilian fishermen can be dubbed the “integer proportional reasoning”: the rule for reasoning is that one whole number goes into another X number of times and there is no remainder.

Intuitive mathematics serves a wide range of everyday math tasks. For example, Liberian tailors who have no schooling can solve arithmetic problems by laying out and counting familiar objects, such as buttons. Taxicab drivers and child fruit vendors in Brazil invent solutions that serve them well (Nunes, Schliemann, and Carraher 1993).

Two kinds of theories vie for an account of the origins and acquisition of intuitive arithmetic. One idea is that knowledge of the counting numbers and their use in arithmetic tasks builds from a set of reinforced bits of learning about situated counting number routines. Given enough learning opportunities, principles of counting and arithmetic are induced (Fuson 1988). Despite the clear evidence that there are pockets of early mathematical competence, young children are far from perfect on tasks they can negotiate. Additionally, the range of set sizes and tasks they can deal with is limited. These facts constitute the empirical foundation for the “bit-bit” theory and would seem to constitute a problem for the “principle-first” account of intuitive mathematics, which proposes an innate, domain-specific, learning-enabling structure. Although skeleton-like to start, such a structure serves to draw the beginning learner’s attention to seek out, attend to, and assimilate number-relevant data—be these in the physical, social, cultural and mental environments—that are available for the epigenesis of number-specific knowledge.

True, there are many arithmetic reasoning tasks that young children cannot do, and early performances are shaky. But this would be expected for any learning account. Those who favor the principle-first account (Geary 1996; Gelman and Williams 1997) point to an ever-increasing number of converging lines of evidence: animals and infants respond to the numerical value of displays (Gallistel and Gelman 1992; Wynn 1995); retarded children have considerable difficulty with simple arithmetic facts, money, time, and novel counting or arithmetic tasks—despite extensive in-school practice (e.g., Gelman and Cohen 1988); preschool children distinguish between novel count sequences that are wrong and those which are unusual but correct; they also invent counting solutions to solve arithmetic problems (Siegler and Shrager 1984; Starkey and Gelman 1982); and elementary school children invent counting solutions to solve school arithmetic tasks in ways that differ from those they are taught in school (Resnick 1989). Moreover, there is cross-language variability in the transparency of the base rules for number word generation. For example, in Chinese, the words for 10, 11, 12, 13 . . . 20, 21 . . . 30, 31 . . . and so forth, translate as 10, 10–1, 10–2, 10–3 . . . 2–10s, 2–10s-1 . . . 3–10s, 3–10s-1, and so forth. English has no comparable pattern for the teens. This difference influences the rate at which children in different countries master the code for generating large numbers although it does not affect rate of learning of the count words for 1–9. American and Chinese children learn these at comparable rates and use them equally well to solve simple arithmetic problems (Miller et al. 1995).

Almost all of the mathematics or arithmetic revealed in the above examples from divergent settings, ages, and cultural conditions maps onto a common structure. Different count lists all honor the same counting principles, and different numbers are made by adding, subtracting, composing, and decomposing natural numbers that are thought of in terms of counted sets. The favored mathematical entities are the natural numbers; the favored operations addition and subtraction, even if the task is stated as multiplication or division. The general rule seems to be, find a way to use whole numbers, either by counting, decomposing N, subtracting, or doing repeated counting and subtraction with whole numbers. Notions about continuous quantity usually are not integrated with those about discrete quantities, where people prefer to use repeated addition or subtraction if they can. This commonality of the underlying arithmetic structure and reliance on natural numbers is an important line of evidence for the idea that counting principles and simple arithmetic are universal. The reliance on whole number strategies, even when proportional reasoning is used, is consistent with this conclusion.

Understanding the mathematician’s zero, negative numbers, rational and irrational numbers, and all other higher mathematics does not contribute to the knowledge base of intuitive mathematics. The formal side of mathematical understanding is outside the realm of intuitive mathematics (Hartnett and Gelman 1998). Even the mathematical concept of a fraction develops with considerable difficulty, a fact that is surely related to the problems people have learning to measure and understand the concept of equal units. Reliance on intuitive mathematics is ubiquitous, sometimes even to the point where it becomes a barrier to learning new mathematical concepts that are related to different structures (Gelman and Williams 1997). A salient case in point is the concept of rational numbers and the related symbol systems for representing them. Rational numbers are not generated by the counting principles. They are the result of dividing one cardinal number by another. Nevertheless, there is a potent tendency for elementary school children to interpret lessons about rational numbers as if these were opportunities to generalize their knowledge of natural numbers. For example, they rank order fractions on the basis of the denominator and therefore say 1/75 is larger than 1/56, and so on. There is a growing body of evidence that the mastery of mathematical concepts outside the range of those encompassed by intuitive mathematics constitutes a difficult conceptual challenge.

See also DOMAIN SPECIFICITY; HUMAN UNIVERSALS; INFANT COGNITION; NATIVISM; NUMERACY AND CULTURE; SCIENTIFIC THINKING AND ITS DEVELOPMENT

—Rochel Gelman

References

Fuson, K. C. (1988). Children’s Counting and Concepts of Number. New York: Springer.
Gallistel, C. R., and R. Gelman. (1992). Preverbal and verbal counting and computation. Cognition 44(1–2): 43–74. Special issue on numerical cognition.
Geary, D. C. (1996). Biology, culture, and cross-national differences in mathematical ability. In R. J. Sternberg and T. Ben-Zeev, Eds., The Nature of Mathematical Thinking. The Studies in Mathematical Thinking and Learning Series. Mahwah, NJ: Erlbaum, pp. 145–171.
Gelman, R. (1993). A rational-constructivist account of early learning about numbers and objects. In D. Medin, Ed., Learning and Motivation, vol. 30. New York: Academic Press.
Gelman, R., and M. Cohen. (1988). Qualitative differences in the way Down’s syndrome and normal children solve a novel counting problem. In L. Nadel, Ed., The Psychobiology of Down’s Syndrome. Cambridge, MA: MIT Press, pp. 51–99.
Gelman, R., and B. Meck. (1992). Early principles aid initial but not later conceptions of number. In J. Bideaud, C. Meljac, and J. Fischer, Eds., Pathways to Number. Hillsdale, NJ: Erlbaum, pp. 171–189.
Gelman, R., and E. Williams. (1997). Enabling constraints for cog-

Gelman, R., and C. R. Gallistel. (1978). The Child’s Understanding of Number. Cambridge, MA: Harvard University Press.
Greer, B. (1992). Multiplication and division as models of situations. In D. A. Grouws, Ed., Handbook of Research on Mathematics Teaching and Learning: A Project of The National Council of Teachers of Mathematics. New York: Macmillan.
Groen, G., and L. B. Resnick. (1977). Can preschool children invent addition algorithms? Journal of Educational Psychology 69: 645–652.
Lave, J., and E. Wenger. (1991). Situated Learning: Legitimate Peripheral Participation. Cambridge: Cambridge University Press.
Saxe, G. B., S. R. Guberman, and M. Gearhart. (1987). Social processes in early development. Monographs of the Society for the Research in Child Development 52.
Sophian, C. (1994). Children’s Numbers. Madison, WI: W. C. B. Brown and Benchmark.
Starkey, P., E. S. Spelke, and R. Gelman. (1990). Numerical abstraction by human infants. Cognition 36(2): 97–127.
Stevenson, H., and J. Stigler. (1992). The Learning Gap. New York: Summit Books.
Wynn, K. (1995). Infants possess a system of numerical knowledge. Current Directions in Psychological Science 4: 172–177.

Naive Physics

Naive physics refers to the commonsense beliefs that people hold about the way the world works, particularly with respect to classical mechanics.
Being the oldest branch of nitive development and learning: Domain-specificity and epi- physics, classical mechanics has priority because mechanical genesis. In D. Kuhn and R. Siegler, Eds., Cognition, Perception, systems can be seen, whereas the motions relevant to other and Language, 5th ed., vol. 2 of W. Damon, Ed., Handbook of branches of physics are invisible. Because the motions of Child Psychology. New York: Wiley. mechanical systems are both lawful and obvious, it is always Hartnett, P. M., and R. Gelman. (1998). Early understandings of number: Paths or barriers to the construction of new under- intriguing to find instances in which people hold beliefs standings? Learning and Instruction 8, 341–374. about mechanics that are not just underdeveloped but sys- Miller, K. F., C. M. Smith, J. Zhu, and H. Zhang. (1995). Preschool tematically wrong. origins of cross-national differences in mathematical compe- Jean PIAGET (1952, 1954) studied how young children tence: The role of number-naming systems. Psychological Sci- acquire an understanding of the basic physical dimensions ence 6: 56–60. of the world and demonstrated that at young ages, children Nunes, T., A. D. Schliemann, and D. W. Carraher. (1993). Street are systematically disposed to construe the world in biased Mathematics and School Mathematics. Cambridge: Cambridge ways. Interestingly, it was also found that adults often do University Press. not exhibit simple physical concepts that Piaget assumed Resnick, L. (1989). Developing mathematical knowledge. Ameri- they must have. A notable example of this is the water level can Psychologist 44: 162–169. Siegler, R. S., and J. Shrager. (1984). Strategy choices in addition: problem introduced by Piaget and Inhelder (1956). When How do children know what to do? In C. Sophian, Ed., Origins asked to indicate the surface orientation of water in a tilted of Cognitive Skills. Hillsdale, NJ: Erlbaum. 
container, about 40 percent of the adult population produce Starkey, P., and R. Gelman. (1982). The development of addition estimates that systematically deviate from the horizontal by and subtraction abilities prior to formal schooling. In T. P. Car- more than 5 degrees (cf. McAfee and Proffitt 1991). penter, J. M. Moser, and T. A. Romberg, Eds., Addition and A large number of studies have found that adults express Subtraction: A Developmental Perspective. Hillsdale, NJ: systematic errors in their reasoning about how objects natu- Erlbaum. rally move in the world (Champagne, Klopher, and Ander- son 1980; Clement 1982; Kaiser, Jonides, and Alexander Further Readings 1986; McCloskey 1983; McCloskey, Caramazza, and Behr, M. J., G. Harrell, T. Post, and R. Lesh. (1992). Rational num- Green 1980; McCloskey and Kohl 1983; Shanon 1976). An ber, ratio, and proportion. In D. A. Grouws, Ed., Handbook of excellent introduction to this research can be found in Research on Mathematics Teaching and Learning: A Project of McCloskey (1983), who dubbed this field of study “Intui- The National Council of Teachers of Mathematics. New York: tive Physics.” The best-known example of these problems Macmillan. is the C-shaped tube problem, in which a participant is Fischbein, E., M. Deri, M. Nello, and M. Marino. (1985). The role asked to predict the trajectory taken by a ball after it exits a of implicit models in solving problems in multiplication and C-shaped tube lying flat on a table (McCloskey, Cara- division. Journal of Research in Mathematics Education 16: 3– mazza, and Green 1980). The correct answer is that the ball 17. 578 Naive Physics will follow a straight trajectory tangent to the tube’s curva- 1992; Proffitt, Kaiser, and Whelan 1990) found that when ture at the point of exit. 
About 40 percent of college stu- presented with animations of particle motion problems, peo- dents get this problem wrong and predict instead that the ple judged their own predictions to be unnatural and ball will continue to curve after exiting the tube. selected natural motions as appearing correct. For example, There are two classes of explanations for these findings. when contrived animations were presented to people who The first supposes that people possess a general mental drew curved paths on the paper-and-pencil version of the model that dictates the form of their errors (see MENTAL problem, these people reported that balls rolling through a C-shaped tube and continuing to curve upon exit looked MODELS). One such proposal is that naive physics reflects an Aristotelian model of mechanics. Shanon (1976) found that very odd, whereas straight paths appeared natural. Anima- many people reasoned, like Aristotle, that objects will fall at tions, however, did not evoke more accurate judgments for a constant velocity proportional to their mass. In his early extended-body motions (Proffitt, Kaiser, and Whelan 1990). writings, diSessa (1982) also argued that people display For example, Howard (1978) and McAfee and Proffitt Aristotelian tendencies. McCloskey (1983) suggested that (1991) found that viewing animations of liquids moving to people’s intuitive model resembled medieval impetus the- nonhorizontal orientations in tilting containers did not evoke ory. By this account, an object is made to move by an impe- more accurate judgments from people prone to err on this tus that dissipates over time. However, there are at least two problem. problems with these mental model approaches to naive Adults’ naive conceptions about how the world works physics. The first is that people are not internally consistent appear to be simplistic, inconsistent, and situation-specific. (Cooke and Breedin 1994; diSessa 1983; Kaiser et al. 
1992; However, recent research with infants suggests that a few Ranney and Thagard 1988; Shanon 1976). The same person core beliefs may underlie all dynamical reasoning (see will respond to different problems in a manner that suggests INFANT COGNITION). Baillargeon (1993) and Spelke et al. the application of different models. The second problem is (1992) have shown that, by around 2 1/2 months of age, that people are strongly influenced by the surface structure infants can reason about the continuity and solidity of of the problem. Kaiser, Jonides, and Alexander (1986) objects involved in simple events. Other physical concepts, found that people do not err on the C-shaped tube problem such as gravity and inertia, do not seem to enter infants’ rea- when the situation is put in a more familiar context. For soning until much later, around 6 months of age (Spelke et example, no one predicts that water exiting a curved hose al. 1992). Spelke et al. proposed the intriguing notion that will continue to curve upon exit. Although people’s dynam- continuity and solidity are core principles that persist ical judgments seem not to adhere to either implicit Aristo- throughout the development of people’s naive physics. telian or impetus theories, this does not imply that they have See also COGNITIVE DEVELOPMENT; FOLK BIOLOGY; no mental models applicable to natural dynamics. People NAIVE MATHEMATICS; SCIENTIFIC THINKING AND ITS DEVEL- may possess general models having some as yet undeter- OPMENT; THEORY OF MIND mined structure, or their models may be domain specific. —Dennis Proffitt The second type of EXPLANATION for people’s systematic errors appeals to issues of problem complexity. Proffitt and References Gilden (1989) proposed an account of dynamical event com- plexity that parsed mechanical systems into two classes. Par- Baillargeon, R. (1993). 
The object concept revisited: New direc- ticle motions are those that can be described mathematically tions in the investigation of infants’ physical knowledge. In C. by treating the moving object as if it were a point located at E. Granrud, Ed., Carnegie Symposium on Cognition: Visual its center of mass. Extended-body motions are those contexts Perception and Cognition in Infancy. Hillsdale, NJ: Erlbaum, pp. 265–315. in which the object’s mass distribution, size, and orientation Champagne, A. B., L. E. Klopher, and J. H. Anderson. (1980). influence its motion. As an example, consider a wheel. If the Factors influencing the learning of classical mechanics. Ameri- wheel is dropped in a vacuum, then its velocity is simply a can Journal of Physics 48: 1074–1079. function of the distance that its center of mass has fallen. Clement, J. (1982). Students’ preconceptions in introductory Ignoring air resistance, freefall is a particle motion. On the mechanics. American Journal of Physics 50: 66–71. other hand, if the wheel is placed on an inclined plane and Cooke, N. J., and S. D. Breedin. (1994). Constructing naive theo- released, then the wheel’s shape—its moment of inertia—is ries of motion on the fly. Memory and Cognition 22: 474–493. dynamically relevant. This is an extended-body motion con- diSessa, A. (1982). Unlearning Aristotelian physics: A study of text. People reason fairly well about particle motion prob- knowledge-based learning. Cognitive Science 6: 37–75. lems but not extended-body motion ones. In addition, people diSessa. A. (1983). Phenomenology and the evolution of intuition. In D. Gentner and A. L. Stevens, Eds., Mental Models. Hills- also err when they misrepresent a particle motion as being an dale, NJ: Erlbaum, pp. 15–33. extended-body motion as, for example, in the C-shaped tube Howard, I. (1978). Recognition and knowledge of the water-level problem. In doing so, they attribute more dimensionality to problem. Perception 7: 151–160. 
the problem than is actually there. Kaiser, M. K., J. Jonides, and J. Alexander. (1986). Intuitive rea- Given that people often predict that events will follow soning about abstract and familiar physics problems. Memory unnatural courses—for example, that a ball exiting a C- and Cognition 14: 308–312. shaped tube will persist to follow a curved path—it is inter- Kaiser, M. K., D. E. Proffitt, and K. A. Anderson. (1985). Judg- esting to ask what would happen if they actually saw such ments of natural and anomalous trajectories in the presence and an event occur. Would it look odd or natural? Kaiser and absence of motion. Journal of Experimental Psychology: Proffitt (Kaiser, Proffitt, and Anderson 1985; Kaiser et al. Human Perception and Performance 11: 795–803. Naive Sociology 579 Kaiser, M. K., D. R. Proffitt, S. M. Whelan, and H. Hecht. (1992). Experimental Psychology: Learning, Memory, and Cognition Influence of animation on dynamical judgments. Journal of 9: 636–649. Experimental Psychology: Human Perception and Performance Smith, B., and R. Casati. (1994). Naive physics. Philosophical 18: 384–393. Psychology 7: 227–247. McAfee, E. A., and D. R. Proffitt. (1991). Understanding the sur- Spelke, L. S. (1991). Physical knowledge in infancy: Reflections face orientation of liquids. Cognitive Psychology 23: 669– on Piaget’s theory. In S. Carey and R. Gelman, Eds., The Epi- 690. genesis of Mind: Essays on Biology and Cognition. Hillsdale, McCloskey, M. (1983). Intuitive physics. Scientific American 248: NJ: Erlbaum. 122–130. McCloskey, M., A. Caramazza, and B. Green. (1980). Curvilinear Naive Psychology motion in the absence of external forces: Naive beliefs about the motion of objects. Science 210: 1139–1141. McCloskey, M., and D. Kohl. (1983). Naive physics: The curvilin- See FOLK PSYCHOLOGY ear impetus principle and its role in interactions with moving objects. Journal of Experimental Psychology: Learning, Mem- Naive Sociology ory, and Cognition 9: 146–156. 
Piaget, J. (1952). The Origins of Intelligence in Childhood. New York: International Universities Press. Humans everywhere possess elaborate and often articulate Piaget, J. (1954). The Construction of Reality in the Child. New knowledge of the social world. Central to this knowledge is York: Basic Books. the recognition of and reasoning about those groupings of Piaget, J., and B. Inhelder. (1956). The Child’s Conception of individuals that constitute the social world. Naive sociology Space. London: Routledge and Kegan Paul. is the study of the cognitive processes underlying these Proffitt, D. R., and D. L. Gilden. (1989). Understanding natural dynamics. Journal of Experimental Psychology: Human Per- everyday beliefs about human groups and human group ception and Performance 15: 384–393. affiliation. Proffitt, D. R., M. K. Kaiser, and S. M. Whelan. (1990). Under- That humans develop complex representations of soci- standing wheel dynamics. Cognitive Psychology 22: 342–373. ety is not surprising. Humans almost certainly know more Ranney, M., and P. Thagard. (1988). Explanatory coherence and about other humans than they do about any other aspect of belief revision in naive physics. In Proceedings of the Tenth the world, and group living is a hallmark of human exist- Annual Conference of the Cognitive Science Society. Hillsdale, ence. Group living likely includes adaptation to the fact NJ: Erlbaum, pp. 426–432. that humans may be the only species in which conspecifics Shanon, B. (1976). Aristotelianism, Newtonianism, and the phys- are the principal predator (Alexander 1989). Since much ics of the layman. Perception 5: 241–243. of this predation is regulated by and implemented through Spelke, E. S., K. Breinlinger, J. Macomber, and K. Jacobson. (1992). Origins of knowledge. Psychological Review 99: 605– social groups, cognitive skills, like the capacity to rapidly 632. and accurately interpret the behavior and motivations of others, are critical for survival. 
Further Readings Human social groupings are more complex and more fluid than those of other social species. Consequently, the Caramazza, A., M. McCloskey, and B. Green. (1981). Naive rapid and accurate appraisal of the social environment is beliefs in “sophisticated” subjects: Misconceptions about tra- both difficult to achieve and demanding of cognitive jectories of objects. Cognition 9: 117–123. resources. Major tasks include the capacity to represent and Chi., M. T. H., and J. D. Slotta. (1993). The ontological coherence to compute information about (1) large numbers of groups, of intuitive physics. Cognition and Instruction 10: 249–260. Clement, J. (1983). A conceptual model discussed by Galileo and (2) varied group affiliations, and (3) shifting coalitions used intuitively by physics students. In D. Gentner and A. L. between groups. A number of mechanisms underlie these Stevens, Eds., Mental Models. Hillsdale, NJ: Erlbaum, pp. 325– capacities, and their precise nature remains a matter of some 339. controversy. diSessa, A. (1993). Toward an epistemology of physics. Cognition Considerable research in social psychology, particularly and Instruction 10: 105–225. group dynamics, has revealed and interpreted many pro- Gilden, D. L. (1991). On the origins of dynamical awareness. Psy- cesses pertinent to these capacities. Like the bulk of psy- chological Review 98: 554–568. chology, work in SOCIAL COGNITION tends to approach Hubbard, T. L. (1996). Representational momentum: Centripetal sociality from a domain-general perspective. Thus, repre- force, and curvilinear impetus. Journal of Experimental Psy- sentations of group-level phenomena, like social identity, chology: Learning, Memory, and Cognition 22: 1049–1060. Kaiser, M. K., D. R. Proffitt, and M. McCloskey. (1985). The are typically interpreted as instances of general cognitive development of beliefs about falling objects. Perception and strategies for processing categories. 
Patterns of inferencing Psychophysics 38: 533–539. associated with social categories (e.g., STEREOTYPING and Larkin, J. H. (1983). The role of problem representation in physics. prejudice), on this view, involve general category effects In D. Gentner and A. L. Stevens, Eds., Mental Models. Hills- that simply happen to target person categories (Fiske and dale, NJ: Erlbaum, pp. 75–98. Taylor 1991; Hamilton 1981). McCloskey, M. (1983). Naive theories of motion. In D. Gentner Other research in social psychology has identified mecha- and A. L. Stevens, Eds., Mental Models. Hillsdale, NJ: nisms that specifically act on mental representations of Erlbaum, pp. 299–324. human groupings. Research on stereotyping has contributed McCloskey, M., A. Washburn, and L. Felch. (1983). Intuitive important insights into cognitions of group-level phenomena, physics: The straight-down belief and its origin. Journal of 580 Naive Sociology particularly insights into the relationship between ascribed human diversity (e.g., race, ethnicity, nationality, and gen- group affiliation and explanations for the beliefs and behav- der) may derive via analogy from the notion of species in iors of members of other groups (Hogg and Abrams 1988; FOLK BIOLOGY (Atran 1990; Boyer 1990; Rothbart and Tay- Pettigrew 1979; Taylor and Fiske 1991; Miller and Prentice lor 1990). In much the same vein, other aspects of social forthcoming). reasoning (e.g., the willingness to interpret behavior in Influential studies by Tajfel (1981) demonstrate that terms of traits and dispositions) have been attributed to the- biases of this sort may be extremely general in the sense that ory of mind (Wellman 1990). they are not tethered to any actual group affiliation. 
Tajfel Hirschfeld (1995) and Jackendoff (1992) argue that men- and his colleagues have shown that individuals, in virtually tal representations of human groups are also governed by a any situation, privilege members of their own group distinct cognitive faculty of social cognition or naive sociol- (ingroup) vis-à-vis members of other groups (outgroups). ogy. Noam Chomsky (1988), in a discussion of bilingual- Thus, even when subjects know that the ingroup has no real- ism, implies something of the same when he observes that world group status (e.g., when the ingroup is composed of young children have theories of both language and society all persons whose social security numbers end in the same that they must coordinate in determining, among other digit), they distribute pretend money more readily to mem- things, the particular language to speak in a given context. bers of their own group than to members of an outgroup. The basic task of a faculty of social cognition is to develop Biases of this sort are extremely resistant to change and an integrated picture of the self in society. Whereas the fun- attempts to inhibit spontaneous group-related favoritism damental units of spatial cognition are physical objects in have been largely ineffective (Miller and Brewer 1984; space, those of social cognition are persons in social interac- Gaertner et al. 1993). tion (Jackendoff 1992: 72). On this view, the notion of per- These studies typically approach group-relevant cogni- sons in social interaction involves at least two elements that tions from the perspective of the individual, both with set the domain of social cognition apart from other domains. respect to the individual who perceives group affiliation First, the causal principles of social relations (e.g., consan- from the vantage point of him or herself and with respect to guinity, group membership, and dominance) appear to be the individual as target of bias. 
unrelated to those underlying other domains of knowledge. Evolutionary and comparative studies have been espe- Second, the fundamental unit of social cognition, the per- cially important in making clear that mental representations son, is a singular conceptual entity. As already noted, of group-level phenomena also include beliefs about groups humans have a number of highly specialized input devices themselves. EVOLUTIONARY PSYCHOLOGY, COGNITIVE AN- that allow the identification of specific persons and the THROPOLOGY, AND ETHNOPSYCHOLOGY all speak directly or interpretation of their actions. indirectly to the role representations of groups play in soci- The concept of the person itself may be contingent on ality (Alexander 1989; Dunbar 1988; Brereton 1996; War- group-relevant cognitions. The image of a social person, for necke, Masters, and Kempter 1992; Fishbein 1996; Shaw and instance, may be a conceptual prerequisite for other individ- Wong 1989; Reynolds, Falger, and Vine 1987; Cosmides ually oriented domain-specific competencies. Recent work 1989; LeVine and Campbell 1972), as does comparative re- with young children, for example, suggests that the notion search on DOMINANCE IN ANIMAL SOCIAL GROUPS and group may developmentally preceed the notion of self (Hir- schfeld 1996). Similarly, in theory of mind the person is the SOCIAL COGNITION IN ANIMALS. Much of this work reveals the importance of domain-spe- entity to which beliefs and desires are attributable (except in cific and modular mechanisms to naive sociology. Evolution rare and pathological circumstances, like multiple personal- prepares all living things to resolve (or attempt to resolve) ity disorder; see Hacking 1995). Yet belief/desire psychol- recurrent problems facing the organism. 
It is extremely ogy, taken by some to be the backbone of social reasoning likely that evolved adaptations emerged in response to (e.g., Baron-Cohen 1995), may well be insufficient to recurring social problems that our ancestral populations account for social reasoning in that it is insufficient to faced (Baron-Cohen 1995). Relevant evolved adaptations account for representations of groups. For instance, it is a include specialized mechanisms in both humans and nonhu- commonplace in anthropological analysis to proceed with- man animals (particularly primates) such as a THEORY OF out reference to individuals at all on the belief that social groups and social affiliation are distinct from (and perhaps MIND; domain-specific devices for the recognition of faces, antecedent to) knowledge of individuals (Mauss 1985). voices, and affective states; cheater detectors; and capacities Indeed, social analysis would be impoverished without for representing social dominance. invoking the notion of corporate groups (groups that are Other capacities that evolved to coordinate information conceptualized as corporate individuals rather than collec- relevant to nonsocial phenomena may have also been tions of individuals; Brown 1976). recruited to treat social group-level phenomena. Scholars in A major cognitive issue in this regard is the nature and the domain-specific tradition, using beliefs about NATURAL scope of cognitive resources that human sociality demands. KINDS as a point of departure, have proposed that concepts The social units with which any individual can affiliate are of human groupings are organized around principles that many and varied. A critical task for both children and adults initially emerge in naive understanding of nonhuman group- is to develop skills at high-speed scanning of social contexts ings (particularly the folk notion of species). 
Strategies for and high-speed identification of the appropriate (or strate- classifying and reasoning about human groups are strikingly gic) affiliations and allegiances invoked in a given context. similar to strategies for classifying and reasoning about non- For example, choosing something as “simple” as the correct human species. It has been argued that notions that capture Narrow Content 581 register of speech for a particular situation depends on ade- Fiske, S., and S. Taylor. (1991). Social Cognition. New York: McGraw Hill. quately parsing the social affiliations of the individuals in Gaertner, S., J. Dovidio, A. Anastasio, B. Bachman, and M. Rust. that context (Hirschfeld and Gelman 1997). (1993). The common in-group identity: Recategorization and The complexity of the social environment led Hirschfeld the reduction of intergroup bias. European Review of Social (1996) to propose the existence of specialized knowledge Psychology 4: 1–26. structures dedicated to social group understanding. He Hacking, I. (1995). Rewriting the Soul: Multiple Personality and argues that identifying and reasoning about “natural” the Sciences of Memory. Princeton: Princeton University Press. groupings (i.e., groups such as race and gender that are Hamilton, D. (1981). Illusory correlation as a basis for stereotyp- considered immutable and derived from a unique group ing. In D. Hamilton, Ed., Cognitive Processes in Stereotyping essence) rest on mechanisms unique to social reasoning. and Intergroup Behavior. Hillsdale, NJ: Erlbaum. Thus, despite the predominant view that preschoolers are Hirschfeld, L. (1995). Do children have a theory of race? Cogni- tion 54: 209–252. conceptually unable to reason beyond external properties Hirschfeld, L. (1996). Race in the Making: Cognition, Culture, and (Aboud 1988), Hirschfeld found that even quite young chil- the Child’s Construction of Human Kinds. Cambridge: MIT dren represent the social environment in terms of abstract Press. 
principles and nonvisible qualities. For instance, even 3-year-olds distinguish “natural” human kinds from other ways of sorting people and attribute group membership to underlying and unique essences that are transmitted from parent to child.

In sum, cognitive science has provided important insights into the nature and scope of group living. Many questions remain open. What is the relationship between knowledge of group-level and individual-level phenomena? Given the marked variation in sociality, what role does the cultural environment play in shaping social understanding? To what extent does this marked variation preclude evolutionary accounts? If it does not, what kinds of adaptations evolved to treat social phenomena? What was the evolutionary environment like in which these adaptations emerged?

See also DOMAIN SPECIFICITY; ESSENTIALISM; NAIVE PHYSICS

—Lawrence A. Hirschfeld

References and Further Readings

Aboud, F. E. (1988). Children and Prejudice. New York: Blackwell.
Alexander, R. (1989). Evolution of the human psyche. In P. Mellars and C. Stringer, Eds., The Human Revolution: Behavioural and Biological Perspectives on the Origins of Modern Humans. Princeton: Princeton University Press.
Atran, S. (1990). Cognitive Foundations of Natural History. New York: Cambridge University Press.
Baron-Cohen, S. (1995). Mindblindness: An Essay on Autism and Theory of Mind. Cambridge, MA: MIT Press.
Boyer, P. (1990). Tradition as Truth and Communication. New York: Cambridge University Press.
Brereton, A. (1996). Coercion-defense hypothesis: The evolution of primate sociality. Folia Primatol. 64: 207–214.
Brown, D. (1976). Principles of Social Structure: Southeast Asia. London: Duckworth.
Chomsky, N. (1988). Language and Problems of Knowledge: The Managua Lectures. Cambridge, MA: MIT Press.
Cosmides, L. (1989). The logic of social exchange: Has natural selection shaped how humans reason? Studies with the Wason selection task. Cognition 31: 187–276.
Dunbar, R. (1988). Primate Social Systems. Ithaca: Cornell University Press.
Fishbein, H. (1996). Peer Prejudice and Discrimination: Evolutionary, Cultural, and Developmental Dynamics. Boulder, CO: Westview Press.
Hirschfeld, L., and S. Gelman. (1997). Discovering social difference: the role of appearance in the development of racial awareness. Cognitive Development 25: 317–350.
Hogg, M., and D. Abrams. (1988). Social Identifications: A Social Psychology of Intergroup Relations and Group Processes. London: Routledge.
Jackendoff, R. (1992). Language of the Mind: Essays on Mental Representation. Cambridge, MA: MIT Press.
LeVine, R., and D. Campbell. (1972). Ethnocentrism: Theories of Conflict, Ethnic Attitudes, and Group Behavior. New York: Wiley.
Mauss, M. (1985). A category of the human mind: The notion of person. In M. Carrithers, S. Collins, and S. Lukes, Eds., The Category of Person. New York: Cambridge University Press.
Miller, D., and D. Prentice. (Forthcoming). Social consequences of a belief in group essence: the category divide hypothesis. In D. Prentice and D. Miller, Eds., Cultural Divides: Understanding and Resolving Group Conflict. New York: Russell Sage Foundation.
Miller, N., and M. Brewer. (1984). Groups in Contact: The Psychology of Desegregation. New York: Academic Press.
Pettigrew, T. (1979). The ultimate attribution error: Extending Allport’s cognitive analysis. Personality and Social Psychology Bulletin 5: 461–476.
Reynolds, V., V. Falger, and I. Vine. (1987). The Sociobiology of Ethnocentrism: Evolutionary Dimensions of Xenophobia, Discrimination, Racism, and Nationalism. London: Croom Helm.
Rothbart, M., and M. Taylor. (1990). Category labels and social reality: Do we view social categories as natural kinds? In G. Semin and K. Fiedler, Eds., Language and Social Cognition. London: Sage.
Shaw, R., and Y. Wong. (1989). Genetic Seeds of Warfare: Evolution, Nationalism, and Patriotism. Boston: Unwin Hyman.
Tajfel, H. (1981). Human Groups and Social Categories. Cambridge: Cambridge University Press.
Taylor, S., S. Fiske, N. Etcoff, and A. Ruderman. (1978). The categorical and contextual bases of person memory and stereotyping. Journal of Personality and Social Psychology 36: 778–793.
Warnecke, A., R. Masters, and G. Kempter. (1992). The roots of nationalism: Nonverbal behavior and xenophobia. Ethology and Sociobiology 13: 267–282.
Wellman, H. (1990). The Child’s Theory of Mind. Cambridge: MIT Press.

Narrow Content

According to some causal theories, the referent of a term like “water” is whatever substance bears the appropriate causal relation to the use of that term (Putnam 1975; Kripke 1980; see also Fodor 1987; Dretske 1981; Stampe 1977). This view is supported by Putnam’s TWIN EARTH example, according to which the referents of our terms, and hence the truth conditions and meanings of utterances and the contents of our thoughts, depend on conditions in our environment and so are not determined by (do not supervene on) our individual, internal psychology alone. Content that does not supervene on an individual subject’s internal psychology is called broad content.

Several considerations, however, suggest the need for a concept of content that would supervene on internal psychology, that is, a concept of narrow content. We normally assume that our behavior is causally explained by our intentional states such as beliefs and desires (see INTENTIONALITY). We also assume that our behavior has its causal explanation in our individual, internal psychological makeup (see INDIVIDUALISM). But if the explanation of behavior supervenes on individual, internal psychology, and the broad contents of our intentional states do not, then it seems that either those states will not figure in the causal explanations of a genuine psychological science or that a notion of narrow content is required. Some theorists have challenged this argument, however, by denying the first assumption (Stich 1978; 1983), some have denied the second (Wilson 1995), and some have denied that the conclusion follows (Fodor 1994). Even the legitimacy of a distinction between broad and narrow content along these lines has been challenged (Bilgrami 1992; Chomsky 1995). Thus the implication that the scientific explanation of behavior requires narrow content remains controversial.

For this reason, other arguments for narrow content have been advanced that involve not just the causal explanation of behavior but explanations that capture the subject’s own perspective on the world and thus rationalize and justify that behavior. For example, if all content is the broad content postulated by causal theories, a brain in a vat being fed artificial sensory inputs by a computer will either have no beliefs or its beliefs will be about the computer’s internal states, regardless of the nature of the stimulus inputs. Thus although the question how the world presents itself seems just as legitimate for the brain in the vat as for a normal subject, a theory of belief that restricts itself to broad content apparently cannot provide an adequate account.

A third consideration favoring narrow content derives from a problem raised by Gottlob FREGE (1952). Though the expressions “Hesperus” and “Phosphorus” both refer to the same object—Venus—a person who was sufficiently uninformed could be perfectly rational in believing and assenting to what he would express by saying both “Hesperus is inhabited” and “Phosphorus is not inhabited.” The problem is that according to the causal theory these two beliefs are about the same object and say contradictory things about it. And we cannot counter the implication of the causal theory that the subject is irrational by appeal to the different descriptions that the subject associates with the two terms; to do so would undermine the claim of the causal theory that reference is independent of the descriptions available to the subject. Nor can we appeal to the differences in the causal chains connecting Venus with “Hesperus” and “Phosphorus,” because these are unavailable to the subject and so cannot explain how it could be rational to hold these beliefs simultaneously. This last point reveals a problem for conceptual or FUNCTIONAL ROLE SEMANTICS and for procedural semantics as accounts of narrow content (Block 1986; Loar 1982; Field 1977; Schiffer 1981; Miller and Johnson-Laird 1976; Johnson-Laird 1977; see also Harman 1982). Functional role semantics answers the question what the subjects and their Twin Earth doppelgängers have in common by reference to the equivalence of the functional states underlying their beliefs (see FUNCTIONALISM). But, because these functional properties, like the causal chains, are not available to the subject in question, this approach cannot characterize the world as it presents itself to the brain in the vat or rationalize and justify the behavior of the uninformed subject.

Another approach to narrow content exploits an analogy between broad contents and the contents of token-reflexive utterances involving INDEXICALS AND DEMONSTRATIVES (White 1982; Fodor 1987). Suppose Jones and Jones’ doppelgänger on Twin Earth both say “It’s warm here.” Because they have different locations, they say different things and express different belief contents. What is common to the two expressions, however, is a function from their contexts of utterance to the contents expressed. If Jones had uttered what he did at his doppelgänger’s location, he would have expressed what his doppelgänger did and vice versa. Suppose now that Jones and his duplicate both say “Water is wet.” What Jones says is true just in case H2O is wet, and the same goes for his duplicate and the Twin Earth analogue of water, XYZ. Again they express different propositions, and their utterances have different broad contents. But suppose Jones had acquired his word “water” not on Earth but on Twin Earth. Then the broad content of his utterance would have been the same as that of his duplicate. In this case too, what the broad contents of their utterances have in common can be expressed as a function—this time from contexts of acquisition to broad contents.

We can also appeal to such functions to show what the brain in the vat has in common with normal subjects. Had the brain acquired its beliefs in the same context as normal subjects, it would have had the same broad-content beliefs that they have, and vice versa. Thus in this example as well, the narrow contents that the beliefs have in common can be expressed as functions from contexts of acquisition to broad contents. Furthermore, we can appeal to the same functions to distinguish the content the uninformed subject would express in saying “Hesperus is inhabited” from what that subject would express in saying “Phosphorus is inhabited.” Though there is no possible world at which Hesperus is not identical with Phosphorus, there are worlds epistemically identical with the actual one such that had the subject acquired the terms at those worlds they would have referred to different planets. Thus the terms “Hesperus” and “Phosphorus,” though they have the same referent, are associated with different functions from contexts of acquisition to referents. Hence the two beliefs whose broad contents are contradictory have narrow contents that do not support the charge of irrationality.

The appeal to narrow content in this sense answers the three objections to broad content that we have been considering. This approach, however, is not fully satisfactory. First, it underestimates the theoretical significance of narrow content by making its ascription parasitic on the ascription of broad content. Second, it provides only an indirect answer to the question what the subject believes, where the question concerns narrow belief. Third, narrow contents, so defined, do not lend themselves easily to a characterization of the logical or epistemic relations among a subject’s beliefs, nor to an analysis of the relations in virtue of which they figure in practical reasoning or decision-making.

An alternative approach to narrow content takes its cue from the fact that the truth conditions of a belief or utterance are often represented as the set of possible worlds at which that belief or utterance is true. As we have seen, the causal theorist’s notion of truth and truth conditions leads directly to broad content. However, we can represent the narrow contents of the subject’s beliefs as the set of worlds where those beliefs are accurate or veridical and then define these in a way that is independent of truth. One suggestion is that there is a conceptual connection between possible worlds at which one’s beliefs are accurate and worlds at which one’s actions are optimal (and nonaccidentally so), given one’s desires and one’s available alternatives. The intuition is that if one performs an action that from one’s own point of view is the best action under the circumstances (it is not weak willed, etc.), then it could only fail to be optimal if some of one’s beliefs were inaccurate (White 1991).

See also POSSIBLE WORLDS SEMANTICS; PROPOSITIONAL ATTITUDES; SENSE AND REFERENCE

—Stephen L. White

References

Bilgrami, A. (1992). Belief and Meaning. Oxford: Blackwell.
Block, N. (1986). Advertisement for a semantics for psychology. In P. French, T. Uehling, and H. Wettstein, Eds., Midwest Studies in Philosophy, vol. 10. Minneapolis: University of Minnesota Press.
Chomsky, N. (1995). Language and nature. Mind 104: 1–61.
Dretske, F. (1981). Knowledge and the Flow of Information. Cambridge, MA: Bradford/MIT Press.
Field, H. (1977). Logic, meaning, and conceptual role. Journal of Philosophy 74: 379–409.
Fodor, J. A. (1987). Psychosemantics: The Problem of Meaning in the Philosophy of Mind. Cambridge, MA: Bradford/MIT Press.
Fodor, J. A. (1994). The Elm and the Expert: Mentalese and Its Semantics. Cambridge, MA: Bradford/MIT Press.
Frege, G. (1952). On sense and reference. In P. Geach and M. Black, Eds., Translations from the Philosophical Writings of Gottlob Frege. Oxford: Blackwell.
Harman, G. (1982). Conceptual role semantics. Notre Dame Journal of Formal Logic 23: 242–256.
Johnson-Laird, P. N. (1977). Procedural semantics. Cognition 5: 189–214.
Kripke, S. A. (1980). Naming and Necessity. Cambridge, MA: Harvard University Press.
Loar, B. (1982). Conceptual role and truth conditions. Notre Dame Journal of Formal Logic 23: 272–283.
Miller, G. A., and P. N. Johnson-Laird. (1976). Language and Perception. Cambridge, MA: Harvard University Press.
Putnam, H. (1975). The meaning of “meaning.” In H. Putnam, Ed., Mind, Language and Reality. New York: Cambridge University Press.
Schiffer, S. (1981). Truth and the theory of content. In H. Parret and J. Bouveresse, Eds., Meaning and Understanding. New York: Walter de Gruyter.
Stampe, D. W. (1977). Toward a causal theory of linguistic representation. In P. French, T. Uehling, and H. Wettstein, Eds., Midwest Studies in Philosophy, vol. 2. Minneapolis: University of Minnesota Press.
Stich, S. (1978). Autonomous psychology and the belief-desire thesis. Monist 61: 573–591.
Stich, S. (1983). From Folk Psychology to Cognitive Science. Cambridge, MA: MIT Press.
White, S. L. (1982). Partial character and the language of thought. Pacific Philosophical Quarterly 63: 347–365. Reprinted in S. L. White, The Unity of the Self. Cambridge, MA: Bradford/MIT Press.
White, S. L. (1991). Narrow content and narrow interpretation. In S. L. White, Ed., The Unity of the Self. Cambridge, MA: Bradford/MIT Press.
Wilson, R. A. (1995). Cartesian Psychology and Physical Minds: Individualism and the Sciences of the Mind. New York: Cambridge University Press.

Further Readings

Burge, T. (1979). Individualism and the mental. In P. French, T. Uehling, and H. Wettstein, Eds., Midwest Studies in Philosophy, vol. 4. Minneapolis: University of Minnesota Press.
Field, H. (1978). Mental representation. Erkenntnis 13: 9–61.
Fodor, J. A. (1978). Tom Swift and his procedural grandmother. Cognition 6: 229–247.
Fodor, J. A. (1980). Methodological solipsism considered as a research strategy in cognitive psychology. Behavioral and Brain Sciences 3: 63–73. Reprinted in J. A. Fodor, Representations. Cambridge, MA: MIT Press.
Johnson-Laird, P. (1978). What’s wrong with grandma’s guide to procedural semantics: A reply to Jerry Fodor. Cognition 6: 241–261.
Kaplan, D. (1989). Demonstratives. In J. Almog, J. Perry, and H. Wettstein, Eds., Themes from Kaplan. New York: Oxford University Press.
Stalnaker, R. (1989). On what's in the head. In J. Tomberlin, Ed., Philosophical Perspectives, vol. 3, Philosophy of Mind and Action Theory. Atascadero, CA: Ridgeview.
Woodfield, A., Ed. (1982). Thought and Content. New York: Oxford University Press.

Nativism

Nativism is often understood as the view that a significant body of knowledge is “built in” to an organism, or at least innately predetermined. This characterization, however, fails to capture contemporary nativism as well as being inadequate for many older views (see NATIVISM, HISTORY OF). Few nativists argue today for the full predetermination of specific concepts, ideas, or cognitive structures such as a language’s grammar; and few empiricists fail to argue for certain kinds of information processing, such as back propagation (see SUPERVISED LEARNING), as being built in. Every party to current debates about nativism in fact shares the view that there is something special and intrinsic, that is
innate, to particular types of organisms that enables them to more easily come to engage in some behaviors as opposed to others. It is the nature of those intrinsic structures and processes that is the true focus of debates about whether some aspect of cognition or perception is compatible with a nativist perspective. In particular, nativist views endorse the presence of multiple learning systems each of which is especially effective at acquiring a particular kind of information and where that effectiveness arises from specializations for information that occurs at all levels in that learning system, not just in the initial stages of processing. The rise of connectionism (see CONNECTIONISM, PHILOSOPHICAL ISSUES and COGNITIVE MODELING, CONNECTIONIST) has been said to pose a fatal challenge to nativism; but as seen later in this article, it in itself in no way renders nativism obsolete.

Confusions about nativism often arise in cases where one organism can acquire a body of knowledge or an ability that another cannot. Thus, attempts to teach human language to primates are often seen as bearing directly on nativist views of an innate language even as the researchers on such topics are usually much more cautious (e.g., Savage-Rumbaugh et al. 1993). But differential success at learning is in itself not relevant. This irrelevance is clear when more extreme comparisons are made. When a child acquires language and a pet gerbil in the same environment does not, no one argues that language is therefore innate in humans. Failure to learn can arise for many reasons, only some of which support a nativist perspective. There may be general cognitive capacity requirements necessary for learning complex knowledge that exceed the capacities of some organisms. When a gerbil fails to learn language, we might well assume that it simply could not acquire any knowledge system with the structural complexity and memory loads imposed by language. When a primate fails to learn a language, it too may fail to pass some general capacity threshold. Alternatively, a primate that is highly adept in some sorts of complex cognitions might fail at language acquisition because it does not have capacities that are specifically tailored for the pickup and learning of linguistic structure. It may fail because it does not “know” enough in advance about some specific properties in the domain of natural language, a pattern of failure that is compatible with a nativist view of knowledge and its origins. But that prior “knowledge” does not have to be in the form of an innately represented set of grammatical rules; it can be a set of powerful biases for interpreting linguistic information in highly constrained ways. Specifying those biases and constraints is where the real distinctions between contemporary nativists and empiricists reside (Keil 1981, 1998). Indeed, it has recently been argued that, even in traditional biology, constraints thought of as “canalization” are the best way of understanding innateness (Ariew 1996).

Although it is more common in philosophy to distinguish RATIONALISM VS. EMPIRICISM, in cognitive science today it is nativism that is usually pitted against empiricism, where, for a nativist, knowledge of such things as grammar or folk psychology does not arise from simply having rational thought and its logical consequences, but rather from having information-specific learning biases that go beyond the more content-neutral mechanisms of learning favored by empiricists, such as unconstrained associationism. Nativists and empiricists disagree on whether one organism achieves greater learning success than another because it is more cognitively capable in general or because it has specialized structures tuned to learn a particular kind of knowledge. This is the question of DOMAIN SPECIFICITY that has become such a pivotal issue in cognitive science today, especially in EVOLUTIONARY PSYCHOLOGY (Fodor 1983; Hirschfeld and Gelman 1994; Keil 1981; Cosmides and Tooby 1994).

Domain specificity alone, however, is not enough to characterize nativism. The specialization for information must involve a certain kind and “level” of processing. The eye is tailored for different kinds of information than the ear (VISUAL ANATOMY AND PHYSIOLOGY, AUDITORY PHYSIOLOGY), a fact well known to both nativists and empiricists for centuries. But empiricists see those specializations as soon disappearing when that information flows beyond the sensory transducers. If all of thought and all patterns of learning can be explained by general laws, such as those of association, once one goes beyond the specializations of the sense organs, then nativism founders. If, however, there are specialized systems for building up representations and processes in specific domains, whether they be language, biology, or number (LANGUAGE ACQUISITION, FOLK BIOLOGY, NAIVE MATHEMATICS), and general learning principles seem inadequate, nativism is supported.

Consider the difference between having a system that is tuned to expect certain patterns in a specific modality, such as the eye’s “expectations” concerning reflected light patterns, and a system that has expectations that transcend modalities, such as that two physical bodies cannot interpenetrate (Spelke 1994). The second expectation can be borne out tactilely, visually, and possibly even auditorily. This expectation is still domain specific in that it applies only to bounded physical objects and not fluids, gases, or aggregates. Systems that are tuned to patterns that transcend modalities would therefore be more likely to fit with a nativist stance.

Connectionist architectures can favor either empiricist or nativist orientations depending on their implementation. A system with preset weights and compression algorithms that seem to be optimized for learning only certain kinds of information, such as that of spatial layout, might well support a nativist account. If such weights, however, only bias the learner toward low-level perceptual features, an empiricist approach is supported (Seidenberg 1992). In some models, a low-level bias, such as selective attention in human infants for moving triangular dot patterns, has been argued to result in a “face processing area of the brain,” even though there were no initial biases in that region for faces initially (Johnson and Morton 1991). Thus, an end state of domain-specific processing of a particular kind of information that is localized in a specific region of the brain is not by itself nativist. The way in which that specialized processing was set up is critical. Similarly, a recent interest in “emergentism” in connectionist systems, namely ways in which general learning systems can yield unpredictable emergent higher-order properties (MacWhinney forthcoming), does not displace nativism as much as it makes apparent the subtlety needed for arguments about the origins of various types of knowledge.

In short, nativists and empiricists primarily disagree on the extent to which pre-existing biases for specific domains of information go beyond those in effect at the levels of sensory transducers. The sense of domain also shifts, with domains at the sensory levels being patterns of information such as “light” or “sound waves” and domains at higher levels being patterns corresponding to such things as bounded physical objects, intentional agents, number, or spatial layout. All of these domains of the second sort clearly are amodal and more cognitive than perceptual.

Biases on high-level cognition that work in domain-general ways do not need to support nativism. For example, the base rate fallacy (Kahneman, Slovic, and Tversky 1982) would seem to apply to any kind of experienced information, regardless of its domain. As such, it would seem to be a further modification of general laws of learning, such as those on association, all of which fits with empiricism. If, however, this bias were to be much more prominent in cases of social attribution, and seemed to help learning about social situations, it would be considered domain specific.

Some have argued that the nativism/empiricism controversy is seriously misguided because of the intrinsically interactional nature of development (Lehrman 1953). This notion has been raised again more recently (Elman et al. 1996) in attempts to argue that it makes no sense to ask what is innate in dynamic learning systems. These objections, however, attack a caricature of “innate structures” and not the current debate between nativists and empiricists. Part of the confusion is between specifying particular behaviors or pieces of knowledge as innate, as opposed to being products of the learning function itself and the cognitive biases it engenders. When learning is considered as a function from sets of environments to sets of mental representations (Chomsky 1980), the intrinsically interactional nature of learning is part of the formulation.

See also CONNECTIONIST APPROACHES TO LANGUAGE; EVOLUTION OF LANGUAGE; INNATENESS OF LANGUAGE; LEARNING; LINGUISTIC UNIVERSALS AND UNIVERSAL GRAMMAR; MODULARITY OF MIND

—Frank Keil

References

Ariew, A. (1996). Innateness and canalization. Philosophy of Science 63: 19–27.
Chomsky, N. (1980). Rules and Representations. New York: Columbia University Press.
Cosmides, L., and J. Tooby. (1994). The evolution of domain specificity: The evolution of functional organization. In L. A. Hirschfeld and S. A. Gelman, Eds., Mapping the Mind: Domain Specificity in Cognition and Culture. Cambridge: Cambridge University Press.
Elman, J. L., E. A. Bates, M. H. Johnson, A. Karmiloff-Smith, D. Parisi, and K. Plunkett. (1996). Rethinking Innateness. Cambridge, MA: MIT Press.
Fodor, J. A. (1983). Modularity of Mind. Cambridge, MA: MIT Press.
Hirschfeld, L. A., and S. A. Gelman, Eds. (1994). Mapping the Mind: Domain Specificity in Cognition and Culture. Cambridge: Cambridge University Press.
Johnson, M. H., and J. Morton. (1991). Biology and Cognitive Development: The Case of Face Recognition. Cambridge, MA: Blackwell.
Kahneman, D., P. Slovic, and A. Tversky. (1982). Judgement under Uncertainty: Heuristics and Biases. New York: Cambridge University Press.
Keil, F. C. (1981). Constraints on knowledge and cognitive development. Psychological Review 88: 197–227.
Keil, F. C. (1998). Cognitive science and the origins of thought and knowledge. In R. M. Lerner, Ed., Theoretical Models of Human Development. Vol. 1, Handbook of Child Psychology. New York: Wiley.
Lehrman, D. (1953). A critique of Konrad Lorenz’s theory of instinctive behavior. Quarterly Review of Biology 28: 337–363.
MacWhinney, B. J., Ed. (Forthcoming). Emergentist Approaches to Language. Hillsdale, NJ: Erlbaum.
Savage-Rumbaugh, E. S., J. Murphy, R. A. Sevick, K. E. Brakke, S. L. Williams, and D. Rumbaugh. (1993). Language Comprehension in Ape and Child. Monographs of the Society for Research in Child Development, 233. Chicago: University of Chicago Press.
Seidenberg, M. S. (1992). Connectionism without tears. In S. Davis, Ed., Connectionism: Theory and Practice. Oxford: Oxford University Press.
Spelke, E. (1994). Initial knowledge: Six suggestions. Cognition 50: 431–445.

Further Readings

Chomsky, N. (1988). Language and Problems of Knowledge: The Managua Lectures. Cambridge, MA: Bradford Books/MIT Press.
Fischer, K. W., and T. Bidell. (1991). Constraining nativist inferences about cognitive capacities. In S. C. A. R. Gelman, Ed., The Epigenesis of Mind: Essays on Biology and Knowledge. Hillsdale, NJ: Erlbaum, pp. 199–235.
Fodor, J. (1981). Representations: Philosophical Essays on the Foundations of Cognitive Science. Cambridge, MA: MIT Press.
Keil, F. C. (1990). Constraints on constraints: Surveying the epigenetic landscape. Cognitive Science 14: 135–168.
Keil, F., C. Smith, D. Simons, and D. Levin. (1998). Two dogmas of conceptual empiricism. Cognition 65(2).
Lerner, R. M. (1984). On the Nature of Human Plasticity. New York: Cambridge University Press.
Leslie, A. (1995). A theory of agency. In A. L. Premack, D. Premack, and D. Sperber, Eds., Causal Cognition: A Multi-Disciplinary Debate. New York: Oxford, pp. 121–141.
Lightfoot, D. (1982). The Language Lottery: Toward a Biology of Grammars. Cambridge, MA: MIT Press.
Mandler, J. M. (1992). How to build a baby: II. Conceptual primitives. Psychological Review 99: 587–604.
McClelland, J. L., M. Bruce, L. O’Reilly, and C. Randall. (1995). Why there are complementary learning systems in the hippocampus and neocortex: Insights from the successes and failures of connectionist models of learning and memory. Psychological Review 102 (3): 419–437.
Meltzoff, A. N., and M. K. Moore. (1989). Imitation in newborn infants: Exploring the range of gestures imitated and the underlying mechanisms. Developmental Psychology 25: 954–962.
Piatelli-Palmarini, M. (1994). Ever since language and learning: Afterthoughts on the Piaget-Chomsky debate. Cognition 50: 315–346.
Pinker, S. (1994). The Language Instinct. New York: Morrow.
Pinker, S. (1997). How the Mind Works. New York: W. W. Norton.
Prince, A., and P. Smolensky. (1997). Optimality: From neural networks to universal grammar. Science 275: 1604–1610.
Spelke, E., and E. Newport. (1998). Nativism. In R. M. Lerner, Ed., Theoretical Models of Human Development. Vol. 1, Handbook of Child Psychology. New York: Wiley.
Wynn, K. (1995). Origins of numerical knowledge. Mathematical Cognition 1.

Nativism, History of

Our understanding of ourselves and of our world rests on two factors: the innate nature of our minds and the specific character of our experience. For 2,500 years there has been an on-again off-again debate over which of these factors is paramount. NATIVISM champions our innate endowment; empiricism, the role of experience (cf. RATIONALISM VS. EMPIRICISM). There have been three significant moments in the historical development of nativism: Plato’s doctrine of anamnesis, the rationalist defense of innateness in the seventeenth and eighteenth centuries, and the contemporary revival of nativism in the cognitive sciences.

Platonic Nativism

Plato presents the first explicit defense of nativism in the Meno, where Socrates draws out a geometrical theorem from an uneducated slave, and argues that this is possible only because the slave implicitly had the theorem in him all along. He had merely forgotten it, and questioning helped him to recollect. For Plato, all genuine learning is a matter of recollecting (anamnesis) what is innate but forgotten. Socrates goes on to argue that because the slave had not been taught geometry, he must have acquired the knowledge in an earlier existence. In the Phaedo, Plato connects innateness to the theory of forms and argues that our grasp of the form of equality could not come from perceived equals, and must therefore also be innate. For Plato, nativism is more than a solution to the epistemological problem of knowledge acquisition; it also provides evidence for the preexistence and immortality of the soul.

Plato’s claims have served as the touchstone for defenders of nativism (so much so that the doctrine is sometimes referred to as “Platonism”), but it is difficult to pin down a specific Platonic innateness “doctrine.” The problem is that Plato’s nativism is embedded in an epistemological framework that takes transcendent forms to be the only objects of genuine knowledge, and there are unresolved questions about the exact nature of that framework. Plato never definitively says what forms there are, or what role our grasp of the forms plays in ordinary cognition. It is therefore difficult to say confidently what Plato took to be innate, or how he conceived the influence of the innate in thinking. Apart from these uncertainties, his argument seems threatened by a potentially devastating regress: if knowledge acquisition is recollection, how is it that we acquire knowledge in an earlier existence?

Nativism and Continental Rationalism

In the Meditations, René DESCARTES argues that concepts such as God and infinity cannot be derived from experience and must therefore be innate. At some points he even suggests that no ideas can come to us via experience; all must be innate. Gottfried Wilhelm Leibniz, the main rationalist spokesman for nativism, argues that our certain knowledge of necessary truths (of mathematics, logic, metaphysics, and so on) is wholly inexplicable on the empiricist position. Our experience is always particular and contingent; how could our knowledge be universal and necessary? Such knowledge must instead rest on innate principles. Leibniz also argues that even our ordinary empirical concepts contain an innate element. Our concept of a man, for instance, draws upon our innate general concept of substance as well as on the specific features of men that we discover in experience. A priori knowledge about substance is possible because we can mine this innate source, and such knowledge is therefore immune from the contingencies of the specific substances we experience.

Leibniz’s position illustrates the fit between seventeenth-century rationalism and nativism. Rationalism holds that the mind can go beyond appearances and provide us with insight into the intelligible nature of things; this insight yields a priori knowledge. But how do we get such insight? Here nativism is invoked: our innate ideas and principles are the source of our a priori understanding. The problem with this package is that even if something is innate, that does not in itself establish its truth; it certainly cannot establish its necessity. René Descartes implicitly recognizes this when he introduces a benevolent God into his epistemology as the ultimate guarantor of our knowledge. The idea is that if something is innate, a benevolent God must have put it there for our edification, and a benevolent God would not mislead us.

The historical result was that nativism became entangled with an excess of philosophical baggage. Plato, as we saw, joined it to a transcendent world of forms and a mystical doctrine of the preexistence of the soul. From rationalism it inherited an exalted conception of the power of pure reason and an epistemology that seemed to ultimately require a theological basis. Whatever the original merits of the basic nativist claim about the initial state of the mind, the position began to seem out of step with the more naturalistic world view of the Newtonian revolution.

Empiricism

John Locke’s Essay, the first systematic defense of empiricism, is a philosophical expression of this more naturalistic perspective. Locke begins with an extended polemic against nativism, in which he charges that it is either blatantly false, because there are no principles that can claim the “universal consent” that an innate principle would produce, or that it reduces to the trivial claim that we have an inborn capacity to come to know everything we know. Leibniz responds to these preemptive strikes in his New Essays, where a number of innovative ideas are introduced—for example, the notion of unconscious knowledge, the procedural-declarative distinction, the suggestion of innate biases that may or may not be expressed. But although this part of the debate has had greater visibility, the more important empiricist attack—and this is the main point of Locke’s Essay and of subsequent empiricist theorizing—is that nativism is an unnecessary extravagance, because our knowledge can be explained with the simpler empiricist hypothesis. The empiricist project exerted a dominant influence in both philosophy and psychology well into the twentieth century. It was widely assumed that the program had to eventually succeed, because nativism was stigmatized as a backward superstition and not a serious “scientific” alternative. Empiricist-oriented psychologists carried over the early associationist thinking of David HUME, John Stuart Mill, and others into the behaviorist analyses of learning, while their counterparts in philosophy pursued technical analyses of INDUCTION and CONCEPT formation.

Chomsky and the INNATENESS OF LANGUAGE

The reign of this presumptive empiricism ended at midcentury with Noam Chomsky’s groundbreaking work in linguistics. Chomsky has revived nativism by arguing that a child’s mastery of language cannot be accounted for in terms of empiricist learning mechanisms. His case rests on the POVERTY OF THE STIMULUS ARGUMENTS. Speakers adhere to a complex system of grammatical rules that must somehow be reflected in the speaker’s psychological processors; otherwise we cannot explain the adherence. But these rules involve categories and classifications that are abstract and far removed from the linguistic evidence available to the learner, and their specific content is underdetermined by the evidence available. The empiricist’s inductive manipulation of the data available to the child cannot produce the rule-information that the child must have. But despite this shortfall, normal children acquire the right set of rules with little or no rule-instruction, and at an age at which they cannot master much else.

lished in contemporary cognitive science as a viable alternative to empiricism. The core question-schema it addresses remains cogent: are our ideas, beliefs, knowledge, and so forth in any particular domain derived solely from experience, or are they to some extent traceable to domain-specific features of the mind’s initial endowment? There is nothing obscure or unscientific about nativist answers. They are on the contrary very much in line with our understanding of the way brain adaptations equip organisms to function in their environmental niches. Cognitive ethologists (see COGNITIVE ETHOLOGY) have shown that rats are born with a grasp of their nutritional needs, and that ants do not need to be taught the system of dead reckoning they use in foraging expeditions. Nativists extend this pattern of findings to the higher cognitive functions found in humans. The new field of EVOLUTIONARY PSYCHOLOGY, which adopts a thoroughgoing nativist perspective, focuses especially on the sorts of cognitive and motivational structures that might have developed as adaptations in the original ancestral settings in which humans evolved.

This newly secured scientific respectability has come at a philosophical price. The “transcendental” nativism of Plato and Descartes had significant epistemological and metaphysical ramifications that the new nativism cannot secure with the same ease.

See also COGNITIVE ARCHITECTURE; CONNECTIONIST APPROACHES TO LANGUAGE; DOMAIN SPECIFICITY; KANT; LINGUISTIC UNIVERSALS AND UNIVERSAL GRAMMAR; PARAMETER-SETTING APPROACHES TO ACQUISITION, CREOLIZATION, AND DIACHRONY
Chomsky’s hypothe- —Jerry Samet sis is that language learners have innately specified infor- mation that is specifically about the nature of human References language (”universal grammar”). The child is not simply dropped into the wholly alien terrain of language; instead Chomsky, N. (1965). Aspects of a Theory of Syntax. Cambridge, she comes to the language-learning task with a “head MA: MIT Press. start”—a rough map giving her some idea of what to look Descartes, R. (1641). Meditations on first philosophy. In E. for. Chomsky’s claims have attracted criticism both from Haldane and G. R. T. Ross, Eds., (1967). The Philosophical within and outside linguistics, but the preponderant view is Works of Descartes. Cambridge: Cambridge University Press. that as far as language goes, empiricism is wrong and nativ- Fodor, J. (1981). Representations: Philosophical Essays on the Foundation of Cognitive Science. Cambridge, MA: MIT Press. ism is right. Leibniz, G. W. (1704). New Essays on Human Understanding. This nativist revival in linguistics led to the reassessment Translated and edited by P. Remnant and J. Bennett. Cam- of established empiricist approaches to development in bridge: Cambridge University Press, 1981. other areas like mathematics, physical causality, visual per- Locke, J. (1690). An Essay Concerning Human Understanding. P. ception, and so on. In many of these areas, nativists have H. Nidditch, Ed. Oxford: Oxford University Press, 1975. developed new evidence to support their positions, and in Plato. (c. 380 B.C.). Meno (key passages: 80a–86c) and Phaedo some cases have argued that older findings were misinter- (key passages: 73c–78b). preted. A case in point is Jerry Fodor’s contention that the whole empiricist “concept-learning” paradigm—the sort of Further Readings “learning by example” that has been championed from Chomsky, N. (1988). Language and Problems of Knowledge. Cam- Locke to the present—has at its core a surprising and bridge, MA: MIT Press. 
unavoidable nativist commitment. Empiricists have of Edgley, R. (1970). Knowledge and Necessity, Royal Institute of course not given up; new connectionist models of learning Philosophy Lectures, vol. 3. London: Macmillan. have been touted as using only empiricist-sanctioned princi- Hook, S., Ed. (1969). Language and Philosophy: A Symposium. ples, but as nevertheless being able to learn what nativists New York: NYU Press. have claimed was unlearnable without domain-specific Jolley, N. (1984). Leibniz and Locke. Oxford: Clarendon Press. innate structure. Piattelli-Palmerini, M., Ed. (1980). Language and Learning. Cam- Regardless of how the empirical issues are resolved in bridge, MA: Harvard University Press. any particular domain, nativism has been at least reestab- Samet, J. (Forthcoming). Nativism. Cambridge, MA: MIT Press. 588 Natural Kinds there are substantive theoretical questions about the bound- Scott, D. (1996). Recollection and Experience: Plato’s Theory of Learning and its Successors. Cambridge: Cambridge Univer- aries of those kinds. In this view, certain ways of drawing sity Press. the boundaries among psychopathologies are simply mis- Stich, S., Ed. (1975). Innate Ideas. Berkeley: University of Califor- taken, and not merely inconvenient or ill-suited to the pur- nia Press. poses for which we have devised our classificatory scheme. More important, some systems of classification might be extremely useful for certain purposes without giving a Natural Kinds proper account of the nature and boundaries of the items classified. Some systems of classification are merely conventional: Those who believe that all classification is merely con- they divide a population of objects into kinds, and the prin- ventional thus see taxonomic disputes as shallow and the ciples by which objects are categorized are designed to search for substantive theory to guide taxonomy as mis- answer to some specific purpose. 
There is no antecedently guided; substantive scientific questions arise only after the correct or incorrect way to categorize various objects apart choice of a taxonomic system. Realists about natural kinds, from the purposes to which the system of classification will on the other hand, see taxonomic disputes as potentially be put. Thus, for example, we divide the world into different important; a proper taxonomy must be guided by theoretical time zones, and our purpose in so doing is to allow for coor- insight into the underlying causal structure of the phenom- dination of activities in different locales, but there is no right ena under study. The dispute between conventionalists and or wrong way to draw the boundaries of time zones; there realists is thus significant not only in issues concerning psy- are merely more or less convenient ways that answer better chodiagnosis, but also in addressing taxonomic questions or worse the concerns that led us to devise these categories throughout the cognitive sciences. in the first place. The view that all systems of categorization The way in which questions about natural kinds influ- are merely conventional is called conventionalism about ence taxonomic issues demonstrates the importance of this kinds. concept for methodological concerns in the cognitive sci- Some systems of CATEGORIZATION, however, do not ences. Questions about natural kinds arise more directly, seem to be merely conventional. Rather, they attempt to however, as well. First, natural kind CONCEPTS play a cru- draw conceptual boundaries that correspond to real distinc- cial role in inductive inference. Second, and relatedly, the tions in nature, boundaries which, in Plato’s phrase, “cut acquisition of natural kind concepts plays an important role nature at its joints.” Thus, for example, the periodic table of in cognitive development. 
elements seems not merely an arbitrary or convenient sys- Natural kind concepts play an important role in success- tem of classification, a system that makes certain calcula- ful inductive inference because members of a given natural tions or predictions easier; rather, it seems to describe real kind tend to have many of their most fundamental proper- kinds in nature, kinds whose existence is not just a product ties in common. Thus, finding that one member of a kind of our classificatory activity. Kinds of this sort, which are has a certain property gives one reason for believing that not merely conventional, are called natural kinds. Those others will share that property as well. This uniformity of who believe that there are natural kinds, called realists natural kinds is part of what distinguishes them from arbi- about kinds, believe that it is part of the business of the vari- trarily specified classes of individuals, for in the case of ous sciences to discover what the natural kinds are; scien- arbitrary classes, noting that one member of the class has a tific taxonomies, in this view, attempt to provide a proper certain property (other than the ones that are used to define account of these kinds. the class) provides one with no reason at all to believe that The notion of a natural kind figures in to important ques- other members of the class will share that property. Our tions in the methodology of the cognitive sciences, as well ability to make successful inductive inferences thus as work in COGNITIVE DEVELOPMENT, the psychology of depends on our ability to recognize natural kinds. Not sur- reasoning, and COGNITIVE ANTHROPOLOGY. prisingly, many suggest an evolutionary basis for such a According to conventionalism, many disputes about preestablished harmony between folk and scientific taxono- proper taxonomy in the sciences are misguided. Taxonomic mies. 
While no one believes that our native categories sim- systems, in this view, cannot themselves “get things right” ply mirror those of the sciences, the suggestion here is that or “get things wrong,” although some will, of course, be natural selection provides us with a starting point that more convenient than others. Disputes about taxonomy, in approximately captures some of the real distinctions in this view, do not involve genuine disagreement about sub- nature, thereby allowing for the possibility of more elabo- stantive scientific questions. Realists, however, regard dis- rate and more accurate scientific taxonomies. Without putes about proper taxonomy in the various sciences as some native help in identifying the real categories in substantive. Consider, for example, the categorization of nature, some have argued, we would be unable to develop psychopathologies in the Diagnostic and Statistical Manual accurate taxonomies at all. (1994). Those who think of psychodiagnostic categories as It is for this reason that the acquisition of natural kind merely conventional will regard questions about categoriza- concepts is such an important intellectual achievement. Two tion here as ones of convenience; a proper system of catego- developmental questions need to be separated here: (1) At rization is merely one that well serves the purposes for what point are various natural kind concepts acquired? For which it was designed. If, however, the various psychodiag- example, when do children acquire the concept of a living nostic categories constitute a system of natural kinds, then thing, of an animal, of a being with mental states, and so on? Natural Language Generation 589 (2) At what point do children acquire the concept of a natural Dupre, J. (1993). The Disorder of Things. Cambridge, MA: Har- vard University Press. kind itself? A good deal of work has been done on each of Gelman, S. A., and H. M. Wellman. (1991). Insides and essences: these questions. 
Although an explicitly articulated concept of Early understanding of the non-obvious. Cognition 38: 213– a natural kind is certainly found in no children, and indeed, 244. in few adults, the ways in which children classify objects and Gopnik, A., and A. Meltzoff. (1997). Words, Thoughts, and Theo- the ways in which they respond to the information that two ries. Cambridge, MA: MIT Press. objects are members of a single taxonomic category suggest Hacking, I. (1995). Rewriting the Soul. Princeton, NJ: Princeton that there is a strong tendency to view the world as having a University Press. structure that presupposes the existence of natural kinds. In Keil, F. (1989). Concepts, Kinds, and Cognitive Development. particular, children do not tend to classify objects merely on Cambridge, MA: MIT Press. the basis of their most obvious observable features, features Kornblith, H. (1993). Inductive Inference and Its Natural Ground. Cambridge, MA: MIT Press. that may be unrevealing of natural kind membership; and Locke, J. (1690). An Essay Concerning Human Understanding. when children are told that two individuals are members of a London: Thomas Bassett. single category, they tend to assume that these individuals Markman, E. (1989). Categorization and Naming in Children. will share many fundamental properties, even when the most Cambridge, MA: MIT Press. obvious observable features of the objects differ. Some Medin, D., and A. Ortony. (1989). Psychological essentialism. In authors have suggested that these tendencies may be innate; S. Vosniadou and A. Ortony, Eds., Similarity and Analogical this would help to explain the possibility of successful induc- Reasoning. Cambridge: Cambridge University Press, pp. 179– tive inference by explaining the source of the human ability 195. to identify kinds that support inductive generalizations. A Putnam, H. (1975). Mind, Language and Reality. 
Cambridge: tendency to view the world in terms of the structure required Cambridge University Press. Quine, W. V. O. (1969). Ontological Relativity and Other Essays. by natural kinds seems, at a minimum, to be an ability New York: Columbia University Press. already in place early in cognitive development. Schwartz, S. P. (1977). Naming, Necessity and Natural Kinds. Ith- Relevant here too is work in cognitive anthropology. aca, NY: Cornell University Press. Atran’s work (1990) on folk taxonomies reveals deep simi- Wellman, H. M., and S. A. Gelman. (1988). Children’s understand- larities in the ways in which different cultures divide up the ing of the non-obvious. In R. Sternberg, Ed., Advances in the biological world. More than this, these taxonomies have Psychology of Intelligence, vol. 4. Hillsdale, NJ: Erlbaum. much more than just a passing resemblance to the more refined taxonomic categories of the biological sciences. Not everyone is entirely optimistic about the possibility of Natural Language Generation filling in the details of the picture presented here. Some deny that the taxonomies of the different sciences have enough in common with one another to speak of them all as displaying Automated natural language generation (NLG), currently a single structure, the structure of natural kinds. This skepti- about 25 years old, investigates how to build computer pro- cism has been fueled by a number of factors, including the grams to produce high-quality text from computer-internal recognition that the physical world is not deterministic, as representations of information. Generally, NLG does not well as the fact that the kinds of the biological world cross- include research on the automatic production of speech, cut one another. That scientific taxonomies are simply more whether from text or from a more abstract input (see SPEECH messy than was once assumed has thus not only complicated SYNTHESIS). 
Also, with few exceptions, research has the picture of natural kinds, but also made some doubt the steadily moved away from modeling how people produce very usefulness of the notion. Moreover, the similarity language to developing methods by which computers can be between folk taxonomies and the taxonomies of the various made to do so robustly. sciences differ substantially. The connection between our The information provided to a language generator is pro- conceptual capacities and the causal structure of the world duced by some other system (the “host” program), which thus leaves a good deal to be discovered on all accounts. may be an expert system, database access system, MACHINE TRANSLATION engine, and so on. The outputs of various See also CAUSAL REASONING; FOLK BIOLOGY; INDUC- host systems can differ quite significantly, a fact that makes TION; NATIVISM; REALISM AND ANTIREALISM; REFERENCE, creating a standardized input notation for generators a THEORIES OF perennial problem. —Hilary Kornblith Traditionally, workers on NLG have divided the problem into two major areas: content selection (“what shall I say?”) References and content expression (“how shall I say it?”). Processing in these stages is generally performed by so-called text plan- American Psychiatric Association. (1994). Diagnostic and Statisti- ners and sentence realizers, respectively. More recently, two cal Manual of Mental Disorders. 4th ed. Washington, D.C. further developments have occurred: first, as generators Atran, S. (1990). Cognitive Foundations of Natural History. Cam- became more expressive, the control of stylistic variation bridge: Cambridge University Press. (“why should I say it this way?”) has become important; Boyd, R. (1991). Realism, anti-foundationalism and the enthusi- second, an intermediate stage of sentence planning has been asm for natural kinds. Philosophical Studies 61: 127–148. introduced to fill the “generation gap” between text planners Carey, S. (1985). 
Conceptual Change in Childhood. Cambridge, and sentence realizers. The canonical generator architecture MA: MIT Press. 590 Natural Language Generation 1988) and the associated method of planning (Hovy 1993; Moore 1989) are typical. Stage 2: Sentence Planning Until the early 1990s, sentence-planning tasks were per- formed during text planning or sentence realization. Increasingly, however, sentence planning is seen as a dis- tinct stage; this clarifies the generation process and makes focused investigation of subtasks easier. Accepting from the text planner a text structure, some of the sentence planner’s tasks include: specifying sentence boundaries; organizing (ordering, relativizing, etc.) the material internal to each sentence; planning cross-sentence reference and other anaphora; selecting appropriate words and phrases to express content; and specifying tense, mode Figure 1. (active or passive), as well as other syntactic parameters. The ideal output of a sentence planner is a list of clause- appears in figure 1. No generator created to date fully sized units containing a fairly complete syntactic specifica- embodies all these modules. Pilot attempts at comprehen- tion for each clause; see Meteer (1990) for a thoughtful sive architectures are ERMA (Clippinger 1974) and study. PAULINE (Hovy 1988). Most generators contain just some In the example, the sentence planner must decide of these stages, in various arrangements; see Reiter (1994) whether to generate each floor plan as a separate sentence and De Smedt, Horacek, and Zock (1995). or to conjoin them (and if so, to choose an appropriate conjunction). It must decide whether to say, for example, Stage 1: Text Planning “the ground floor contains . . .” or “the ground floor has the following rooms . . . ,” or any of numerous other for- Accepting one or more communicative goals from the mulations. 
It must decide whether to say “living room” or host system, the text planner’s two tasks are to select the “sitting room”; “den” or “family room.” The interrelated- appropriate content material to express, and to order that ness and wide range of variation of such individual material into a coherently flowing sequence. A typical aspects makes sentence planning a difficult task, as any- input goal might be [DESCRIBE HOUSE-15] or [MOTI- one who has ever written an essay knows. A considerable VATE GOING-ON-VACATION-12], where the terms with amount of research on individual sentence-planning tasks numbers denote specific packages of information. After exists (see the readings below), but there is relatively little planning, the output is generally a tree structure or an on their integration (see Appelt 1985; Nirenburg et al. ordered list of more detailed content propositions, linked 1988). together by discourse connectives signaled by “there- fore,” “and,” “however,” and so on. Usually, each proposi- Stage 3: Sentence Realization tion represents approximately the information contained in a single-clause sentence. Thus, the initial goal Accepting from the sentence planner a list of sentence spec- [DESCRIBE HOUSE-15] may be expanded into a text ifications, the sentence realizer’s tasks are to determine the plan containing (in simplified notation) [GENERATE grammatically correct order of words; to inflect words for HOUSE-IDENTIFIER] [GENERATE ADDRESS] tense, number, and so on, as required by the language; and [INTRODUCE FLOORPLAN] [ELABORATE [GENER- to add punctuation, capitalization, and the like. These tasks ATE GROUND-FLOOR] “and” [GENERATE TOP- are language-dependent. FLOOR] “and” [GENERATE BASEMENT]] and so on. Realization is the most extensively studied stage of Generally, text planning is considered to be language- generation. The principal knowledge required is a gram- independent. mar of syntactic rules and a lexicon of words. 
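The expansion of a communicative goal such as [DESCRIBE HOUSE-15] into a text plan, as described for the text planning stage of natural language generation, can be sketched in Python as a lookup in a table of nested expansion patterns (what the entry calls schemas). This is only an illustrative sketch: the table contents, the names SCHEMAS, expand, and flatten, and the choice to key patterns on the first word of a goal are assumptions made here, not details of TEXT or any other generator named in the entry.

```python
from typing import Union

Plan = Union[str, list]

# Hypothetical schema table (an assumption for illustration): each entry
# gives the typical sequence of content units for one kind of goal, and a
# unit may itself be expanded by another schema (schemas may be nested).
SCHEMAS = {
    "DESCRIBE": [
        "GENERATE HOUSE-IDENTIFIER",
        "GENERATE ADDRESS",
        "INTRODUCE FLOORPLAN",
        "ELABORATE FLOORS",
    ],
    "ELABORATE": [
        "GENERATE GROUND-FLOOR", "and",
        "GENERATE TOP-FLOOR", "and",
        "GENERATE BASEMENT",
    ],
}


def expand(goal: str) -> Plan:
    """Expand a goal into a nested plan; goals with no schema are leaves."""
    act = goal.split()[0]  # e.g. "DESCRIBE" from "DESCRIBE HOUSE-15"
    steps = SCHEMAS.get(act)
    if steps is None:
        return goal
    return [expand(step) for step in steps]


def flatten(plan: Plan) -> list:
    """Read the leaves of the plan tree from left to right."""
    if isinstance(plan, str):
        return [plan]
    leaves = []
    for sub in plan:
        leaves.extend(flatten(sub))
    return leaves


plan = expand("DESCRIBE HOUSE-15")
for step in flatten(plan):
    print(step)
```

The nested list keeps the tree structure (the three floors stay grouped under the elaboration step), while flatten gives the ordered list of propositions that a later stage would turn into clauses.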
Different The two principal methods for performing the text plan- theories of SYNTAX have led to very different approaches ning tasks involve schemas and so-called rhetorical relations. to realization. Realization algorithms include unification Schemas (McKeown 1985; Paris 1993; see SCHEMATA) are (Elhadad 1992), Systemic network traversal (Mann and the simplest and most popular, useful when the texts follow a Matthiessen 1985), phrase expansion (Meteer et al. 1987), fairly stereotypical structure, such as short encyclopedia arti- head-driven and reversible methods (Van Noord 1990; St. cles or business reports. Each schema specifies the typical Dizier 1992), simulated annealing (De Smedt 1990), con- sequence of units of content material (or of other schemata; nectionist architectures (Ward 1990), and statistically they may be nested); for example, the order of floors. Rhe- based management of underspecification (Knight and torical relations (e.g., Elaboration, Justification, Back- Hatzivassiloglou 1995). The systems Penman (Mann and ground) organize material by specifying which units of Matthiessen 1985; later extended as KPML; Bateman material (or blocks of units) should be selected and linked in 1994), FUF/SURGE (Elhadad 1992), and MUMBLE sequence. Several collections of relations have been pro- (Meteer et al. 1987) have been distributed and used by posed; Rhetorical Structure Theory (Mann and Thompson several external users. Natural Language Generation 591 In the example, the specification (in simplified form): ple text planner: TEXT (McKeown 1985), using schemas as patterns. (1) [GENERATE (TYPE: DECLARATIVE-SENTENCE) Features: In the most sophisticated approach to realiza- (HEAD: POSSESS) tion, grammar rules, lexical items, and the input notation (SUBJECT:((HEAD: FLOOR) (LEVEL-MODI- are all encoded as collections of features, using the same FIER: GROUND) (DETERMINED: YES))) type of notation. 
A process called unification is employed (OBJECT:((HEAD: ROOM) (NUMBER: 4) to compare the input’s features against all possible gram- (DETERMINED: NO))) mar rules and lexical items to determine which combina- (TENSE: PRESENT)] tion of rules and items matches. For example, the as produced by the sentence planner will be interpreted by specification for the sentence “the ground floor has four the grammar rules to form a sentence such as “the ground rooms” given above will unify with the feature-based floor has four rooms.” grammar rule: (2) [SENTENCE (TYPE: DECLARATIVE-SENTENCE) Stylistic Control (HEAD: X0) Throughout the generation process, some agency has to (SUBJECT: X1) ensure the consistency of choices, whose net effect is the (OBJECT: X2) style of the text. Since different styles have different com- (TENSE: X3)] municative effects, the stylistic control module must use where each variable X is then associated with the appropri- high-level pragmatic parameters initially specified for the ate portion of the input and subsequently unified against system (such as degree of formality, the addressee’s lan- other rules. And the word “rooms” is obtained from the lex- guage level, amount of time available, communication icon by successfully unifying the input’s subject with the genre) to govern its overall decision policies. These policies lexical item. determine the selection of the most appropriate option from the options facing any generator module at each point dur- (3) [LEXITEM (HEAD: ROOM) (NUMBER: >1) (LEX- ing the planning and realization process. EME: “rooms”)] Few studies have been performed on this aspect of gen- Example realizer: FUF/SURGE (Elhadad 1992). eration; lexicons, grammars, and sets of planning rules are Using features a different way, the influential Penman still too small to necessitate much stylistic guidance. 
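The feature-based realization step described for natural language generation, in which specification (1) unifies with grammar rule (2) and the lexical item (3) supplies the word “rooms”, can be approximated with nested dictionaries. The sketch below is a deliberately simplified, one-directional form of unification (a pattern with variables matched against a fully specified input); it is not the graph unification implemented in FUF/SURGE, and the names Var, unify, LEXICON, and choose_word are assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    """A variable slot in a rule, such as X0 in grammar rule (2)."""
    name: str


def unify(pattern, value, bindings):
    """Match a pattern against a fully specified value.

    Returns the extended variable bindings, or None if matching fails.
    """
    if isinstance(pattern, Var):
        if pattern.name in bindings:
            return bindings if bindings[pattern.name] == value else None
        return {**bindings, pattern.name: value}
    if isinstance(pattern, dict):
        if not isinstance(value, dict):
            return None
        for feature, sub in pattern.items():
            if feature not in value:
                return None
            bindings = unify(sub, value[feature], bindings)
            if bindings is None:
                return None
        return bindings
    if callable(pattern):  # a constraint such as NUMBER: >1
        return bindings if pattern(value) else None
    return bindings if pattern == value else None


# Specification (1) from the entry, written as a nested dict.
SPEC = {
    "TYPE": "DECLARATIVE-SENTENCE",
    "HEAD": "POSSESS",
    "SUBJECT": {"HEAD": "FLOOR", "LEVEL-MODIFIER": "GROUND", "DETERMINED": "YES"},
    "OBJECT": {"HEAD": "ROOM", "NUMBER": 4, "DETERMINED": "NO"},
    "TENSE": "PRESENT",
}

# Grammar rule (2): the variables pick out portions of the input.
RULE = {
    "TYPE": "DECLARATIVE-SENTENCE",
    "HEAD": Var("X0"),
    "SUBJECT": Var("X1"),
    "OBJECT": Var("X2"),
    "TENSE": Var("X3"),
}

# Lexical item (3) plus a hypothetical singular entry for contrast.
LEXICON = [
    ({"HEAD": "ROOM", "NUMBER": lambda n: n > 1}, "rooms"),
    ({"HEAD": "ROOM", "NUMBER": lambda n: n == 1}, "room"),
]


def choose_word(constituent):
    """Return the first lexeme whose features match the constituent."""
    for features, lexeme in LEXICON:
        if unify(features, constituent, {}) is not None:
            return lexeme
    return None


bindings = unify(RULE, SPEC, {})
print(bindings["X0"], choose_word(bindings["X2"]))
```

Here the rule binds X2 to the object portion of the input, and that portion is then matched against the lexicon, mirroring the two unification steps the entry describes.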
Fur- system (Mann and Matthiessen 1985) contains a network of thermore, the complexity of interaction of choices across decision points that guide the system to identify appropriate the stages of generation requires deep attention: how does features, whose ultimate combination specifies the desired the choice of word “freedom fighter”/“terrorist”/“guer- sentence structure and lexical items. rilla” interact with the length of the sentences near it, or See also KNOWLEDGE-BASED SYSTEMS; LANGUAGE PRO- with the choice of active or passive mode? See Jameson DUCTION; NATURAL LANGUAGE PROCESSING (1987), Hovy (1988), and DiMarco and Hirst (1990) for studies. —Eduard Hovy Generation Techniques References Two core operations are performed throughout the genera- tion process: content selection and ordering. For text plan- Appelt, D. E. (1985). Planning English Sentences. Cambridge: Cambridge University Press. ning, the items are units of meaning representation; for Bateman, J. A. (1994). KPML: The KOMET-Penman (Multilin- realization, the items are grammatical constituents and/or gual) Development Environment. Darmstadt, Germany: Tech- words. To do this, almost all generators use one of the fol- nical Report, IPSI Institute. lowing four basic techniques. Clippinger, J. H. (1974). A Discourse Speaking Program as a Pre- Canned items: Predefined sentences or paragraphs are liminary Theory of Discourse Behavior and a Limited Theory selected and printed without modification. This approach is of Psychoanalytic Discourse. Ph.D. diss., University of Penn- used for simple applications. sylvania. Templates: Predefined structures that allow some varia- De Smedt, K. J. M. J. (1990). Incremental Sentence Generation. tion are selected, and their blank spaces filled with items Ph.D. diss., University of Nijmegen. specified by the content. The blanks usually have associated De Smedt, K. J. M. J., H. Horacek, and M. Zock. (1995). 
Architec- tures for natural language generation: problems and perspec- requirements that specify what kinds of information may fill tives. In G. Adorni and M. Zock, Eds., Trends in Natural them. Language Generation: An Artificial Intelligence Perspective. Cascaded patterns: An initial abstract pattern is selected, Heidelberg, Germany: Springer-Verlag Lecture Notes in AI, and each of its pieces are replaced by successively more No. 1036, pp. 17–46. detailed patterns, forming a tree structure with, at its leaves, DiMarco, C., and G. Hirst. (1990). A computational theory of the target elements. An example is traditional phrase-struc- goal-directed style in syntax. Computational Linguistics 19(3): ture grammars, with words as target elements. The selection 451–500. of suitable patterns for further expansion is guided by the Elhadad, M. (1992). Using Argumentation to Control Lexical content to be generated. Example realizer: MUMBLE Choice: A Functional Unification-Based Approach. Ph.D. diss., (Meteer et al. 1987), using grammar rules as patterns; exam- Columbia University. 592 Natural Language Processing Hovy, E. H. (1988). Generating Natural Language under Prag- Bateman, J. A., and E. H. Hovy. (1992). An overview of computa- matic Constraints. Hillsdale: Erlbaum. tional text generation. In C. Butler, Ed., Computers and Texts: Hovy, E. H. (1993). Automated discourse generation using dis- An Applied Perspective. Oxford: Blackwell, pp. 53–74. course structure relations. Artificial Intelligence 63(1–2): 341– Cole, R., J. Mariani, H. Uszkoreit, A. Zaenen, and V. Zue. 386 Special Issue on Natural Language Processing. (1996). Survey of the State of the Art of Human Language Jameson, A. (1987). How to appear to be conforming to the “max- Technology. Report commissioned by NSF and LRE; http:// ims” even if you prefer to violate them. In G. Kempen, Ed., www.cse.ogi.edu/CSLU/HLTsurvey/. Natural Language Generation: Recent Advances in Artificial Dale, R. (1990). 
Generating recipes: An overview of EPICURE. In Intelligence, Psychology, and Linguistics. Dordrecht: Kluwer, R. Dale, C. Mellish, and M. Zock, Eds., Current Research in pp. 19–42. Natural Language Generation. New York: Academic Press, pp. Knight, K., and V. Hatzivassiloglou. (1995). Two-level, many- 229–255. paths generation. In Proceedings of the 33rd Conference of Dale, R., E. H. Hovy, D. Röesner, and O. Stock, Eds. (1992). the Association for Computational Linguistics, pp. 252– Aspects of Automated Natural Language Generation. Heidel- 260. berg, Germany: Springer-Verlag Lecture Notes in AI, No. 587. Mann, W. C., and C. M. I. M. Matthiessen. (1985). Nigel: A sys- Goldman, N. M. (1974). Computer Generation of Natural Lan- temic grammar for text generation. In R. Benson and J. guage from a Deep Conceptual Base. Ph.D. diss., Stanford Uni- Greaves, Eds., Systemic Perspectives on Discourse: Selected versity. Also in R. C. Schank, Ed., (1975) Conceptual Papers from the Ninth International Systemics Workshop. Lon- Information Processing. Amsterdam: Elsevier, pp. 54–79. don: Ablex, pp. 95–135. Horacek, H. (1992). An integrated view of text planning. In R. Mann, W. C., and S. A. Thompson. (1988). Rhetorical structure Dale, E. H. Hovy, D. Röesner, and O. Stock, Eds., Aspects of theory: Toward a functional theory of text organization. Text 8: Automated Natural Language Generation. Heidelberg, Ger- 243–281. Also available as USC/Information Sciences Institute many: Springer-Verlag Lecture Notes in AI, No. 587, pp. 57– Research Report RR-87-190. 72. McKeown, K. R. (1985). Text Generation: Using Discourse Strate- Kempen, G., Ed. (1987). Natural Language Generation: Recent gies and Focus Constraints to Generate Natural Language Text. Advances in Artificial Intelligence, Psychology, and Linguis- Cambridge: Cambridge University Press. tics. Dordrecht: Kluwer. Meteer, M. W. (1990). The Generation Gap: The Problem of Lavoie, B., and O. Rambow. (1997). 
A fast and portable realizer Expressibility in Text Planning. Ph.D. diss., University of Mas- for text generation systems. In Proceedings of the 5th Confer- sachusetts. Available as BBN Technical Report 7347. ence on Applied Natural Language Processing. Washington, Meteer, M., D. D. McDonald, S. Anderson, D. Foster, L. Gay, A. pp. 73–79. Huettner, and P. Sibun. (1987). Mumble-86: Design and Imple- Paris, C. L., W. R.Swartout, and W. C. Mann, Eds. (1990). Natural mentation. Amherst: University of Massachusetts Technical Language Generation in Artificial Intelligence and Computa- Report COINS-87-87. tional Linguistics. Dordrecht: Kluwer. Moore, J. D. (1989). A Reactive Approach to Explanation in Reiter, E. B. (1990). Generating Appropriate Natural Language Expert and Advice-Giving Systems. Ph.D. diss., University of Object Descriptions. Ph.D. diss., Harvard University. California at Los Angeles. Reiter, E. B., C. Mellish, and J. Levine. (1992). Automatic genera- Nirenburg, S., R. McCardell, E. Nyberg, S. Huffman, E. Ken- tion of on-line documentation in the IDAS project. In Proceed- schaft, and I. Nirenburg. (1988). Lexical realization in natural ings of the 3rd Conference on Applied Natural Language language generation. In Proceedings of the 2nd Conference on Processing. Association for Computational Linguistics, pp. 64– Theoretical and Methodological Issues in Machine Translation. 71. Pittsburgh, pp. 18–26. Robin, J. (1990). Lexical Choice in Language Generation. Ph.D. Paris, C. L. (1993). The Use of Explicit Models in Text Generation. diss., Columbia University, Technical Report CUCS-040-90. London: Francis Pinter. Reiter, E. B. (1994). Has a consensus NL generation architecture Natural Language Processing appeared, and is it psychologically plausible? In Proceedings of the 7th International Workshop on Natural Language Genera- tion. Kennebunkport, pp. 163–170. Natural language processing is a subfield of artificial intelli- St. Dizier, P. (1992). 
A constraint logic programming treatment gence involving the development and use of computational of syntactic choice in natural language generation. In R. models to process language. Within this, there are two gen- Dale, E. H. Hovy, D. Roesner, and O. Stock, Eds., Aspects of eral areas of research: comprehension, which deals with Automated Natural Language Generation. Heidelberg, Ger- processes that extract information from language (e.g., natu- many: Springer-Verlag Lecture Notes in AI, No. 587, pp. ral language understanding, information retrieval), and gen- 119–134. eration, which deals with processes of conveying Van Noord, G. J. M. (1990). An overview of head-driven bottom- information using language. Traditionally, work dealing up generation. In R. Dale, C. S. Mellish, and M. Zock, Eds., with speech has been considered separate fields of SPEECH Current Research in Natural Language Generation. London: Academic Press, pp. 141–165. RECOGNITION and SPEECH SYNTHESIS. We will continue Ward, N. (1990). A connectionist treatment of grammar for gener- with this separation here, and the issues of mapping sound ation. In Proceedings of the 5th International Workshop on to words and words to sound will not be considered further. Language Generation. University of Pittsburgh, pp. 95–102. There are two main motivations underlying work in this area. The first is the technological goal of producing auto- Further Readings mated systems that perform various language-related tasks, such as building automated interactive systems (e.g., auto- Adorni, G., and M. Zock, Eds. (1996). Trends in Natural Language mated telephone-operator services) or systems to scan data- Generation: An Artificial Intelligence Perspective. Heidelberg, bases of documents to articles on a certain topic (e.g., Germany: Springer-Verlag Lecture Notes in AI, No. 1036. Natural Language Processing 593 finding relevant pages on the world wide web). The second, structural properties, such as sentence structure. 
In general, the one most relevant to cognitive science, seeks to better the most successful approaches to these problems involve understand how language comprehension and generation combining statistical approaches with other approaches. A occurs in humans. Rather than performing experiments on good introduction to statistical approaches is Charniak humans as done in psycholinguistics, or developing theories 1993. that account for the data with a focus on handling possible Structural and pattern-based approaches have the closest counterexamples as in linguistics and philosophy, research- connection to traditional linguistic models. These ers in natural language processing test theories by building approaches involve defining structural properties of lan- explicit computational models to see how well they behave. guage, such as defining FORMAL GRAMMARS for natural lan- Most research in the field is still more in the exploratory guages. Active research issues include the design of stage of this endeavor and trying to construct “existence grammatical formalisms to capture natural language struc- proofs” (i.e., find any mechanism that can understand lan- ture yet retain good computational properties, and the guage within limited scenarios), rather than building com- design of efficient parsing algorithms to interpret sentences putational models and comparing them to human with respect to a grammar. Structural approaches are not performance. But once such existence-proof systems are limited solely to syntax, however. Many more practical sys- completed, the stage will be set for more detailed compara- tems use semantically based grammars, where the primitive tive study between human and computational models. units in the grammar are semantic classes rather than syn- Whatever the motivation behind the work in this area, how- tactic. 
And other approaches dispense with fully analyzing ever, computational models have provided the inspiration sentence structure altogether, using simpler patterns of lexi- and starting point for much work in psycholinguistics and cal, syntactic and semantic information that match sentence linguistics in the last twenty years. fragments. Such techniques are especially useful in limited- Although there is a diverse set of methods used in natural domain speech-driven applications where errors in the input language processing, the techniques can generally be can be expected. Because the domain is limited, certain broadly classified in three general approaches: statistical phrases (e.g., a prepositional phrase) may have only one methods, structural/pattern-based methods and reasoning- interpretation possible in the application. Structural models based methods. It is important to note that these approaches also appear at the DISCOURSE level, where models are devel- are not mutually exclusive. In fact, the most comprehensive oped that capture the interrelationships between sentences models combine all three techniques. The approaches differ and build models of topic flow. Structural models provide a in the kind of processing tasks they can perform and in the capability for detailed analysis of linguistic phenomena, but degree to which systems require handcrafted rules as the more detailed the analysis, the more one must rely on opposed to automatic training/learning from language data. hand-constructed rules rather than automatic training from A good source that gives an overview of the field involving data. An excellent collection of papers on structural all three approaches is Allen 1995. approaches, though missing recent work, is Grosz, Sparck Statistical methods involve using large corpora of lan- Jones, and Webber 1986. 
guage data to compute statistical properties such as word Reasoning-based approaches involve encoding knowl- co-occurrence and sequence information (see also STATISTI- edge and reasoning processes and use these to interpret lan- CAL TECHNIQUES IN NATURAL LANGUAGE PROCESSING). For guage. This work has much in common with work in instance, a bigram statistic captures the probability of a KNOWLEDGE REPRESENTATION as well as work in the phi- word with certain properties following a word with other losophy of language. The idea here is that the interpretation properties. This information can be estimated from a corpus of language is highly dependent on the context in which the that is labeled with the properties needed, and used to pre- language appears. By trying to capture the knowledge a dict what properties a word might have based on its preced- human may have in a situation, and model common-sense ing context. Although limited, bigram models can be reasoning, problems such as word sense and sentence- surprisingly effective in many tasks. For instance, bigram structure disambiguation, analysis of referring expressions, models involving part of speech labels (e.g., noun, verb) can and the recognition of the intentions behind language can be typically accurately predict the right part of speech for over addressed. These techniques become crucial in discourse, 95 percent of words in general text. Statistical models are whether it be extended text that needs to be understood or a not restricted to part of speech tagging, however, and they dialogue that needs to be engaged in. Most dialogue-based have been used for semantic disambiguation, structural dis- systems use a speech-act–based approach to language and ambiguation (e.g., prepositional phrase attachment), and computational models of PLANNING and plan recognition to many other properties. Much of the initial work in statistical define a conversational agent. 
Specifically, such systems language modeling was performed for automatic speech- first attempt to recognize the intentions underlying the utter- recognition systems, where good word prediction can dou- ances they hear, and then plan their own utterances based on ble the word-recognition accuracy rate. The techniques have their goals and knowledge (including what was just recog- also proved effective in tasks such as information retrieval nized about the other agent). The advantage of this approach and producing rough “first-cut” drafts in machine transla- is that is provides a mechanism for contextual interpretation tion. A big advantage to statistical techniques is that they of language. The disadvantage is the complexity of the mod- can be automatically trained from language corpora. The els required to define the conversational agent. Two good challenge for statistical models concerns how to capture sources for work in this area are Cohen, Morgan, and Pol- higher level structure, such as semantic information, and lack 1990 and Carberry 1991. 594 Navigation There are many applications for natural language pro- tive science will remain the powerful metaphor that the cessing research, which can be roughly categorized into computer provides for understanding human language pro- three main areas: cessing. It allows us to specify models at a level of detail that would otherwise be unimaginable. We are now at the Information Extraction and Retrieval stage where end-to-end models of conversational agents can be constructed in simple domains. Work in this area will Given that much of human knowledge is encoded in textual continue to further our knowledge of language processing form, work in this area attempts to analyze such informa- and suggest novel ideas for experimentation. tion automatically and develop methods for retrieving See also COMPUTATIONAL LEXICONS; COMPUTATION LIN- information as needed. 
The most obvious application area GUISTICS; COMPUTATIONAL PSYCHOLINGUISTICS; CONNEC- today is in developing internet web browsers, where one TIONIST APPROACHES TO LANGUAGE; HIDDEN MARKOV wants to find web pages that contain specific information. MODELS; NATURAL LANGUAGE GENERATION While most web-based techniques today involve little more —James Allen than sophisticated keyword matching, there is considerable research in using more sophisticated techniques, such as References classifying the information in documents based on their statistical properties (e.g., how often certain word patterns Allen, J. F. (1995). Natural Language Understanding. 2nd ed. appear) as well as techniques that use robust parsing tech- Menlo Park, CA: Benjamin-Cummings. niques to extract information. A good survey of applica- Carberry, S. (1991). Plan Recognition in Natural Language Dia- tions for information retrieval can be found in Lewis and logue. Cambridge, MA: MIT Press. Sparck Jones (1996). Many of the researchers in this area Charniak, E. (1993). Statistical Language Learning. Cambridge, have participated in annual evaluations and present their MA: MIT Press. Cohen, P., J. Morgan, and M. Pollack. (1990). Communication and work at the MUC conferences (Chincor, Hirschman, and Intention. Cambridge, MA: MIT Press. Lewis 1993). Chincor, N., L. Hirschman, and D. Lewis. (1993). Evaluation of message understanding systems. Computational Linguistics Machine Translation 19(3): 409–450. Grosz, B., K. Sparck Jones, and B. Webber (1986). Readings in Given the great demand for translation services, automatic Natural Language Processing. San Francisco: Kaufmann. translation of text and speech (in simultaneous translation) Hutchins, W. J., and H. Somers. (1992). An Introduction to is a critical application area. This is one area where there Machine Translation. New York: Academic Press. is an active market for commercial products, although the Lewis, D. D., and K. Sparck Jones. (1996). 
Natural language pro- most useful products to date have aimed to enhance cessing for information retrieval. Communications of the ACM human translators rather than to replace them. They pro- 39(1): 92–101. vide automated dictionary/translation aids and provide rough initial translations that can be post-edited. In appli- cations where the content is stylized, such as technical and Navigation user manuals for products, it is becoming feasible to pro- duce reasonable-quality translations automatically. A See good reference for the machine-translation area is Hutch- ANIMAL NAVIGATION; ANIMAL NAVIGATION, NEURAL ins and Somers 1992. NETWORKS; COGNITIVE ARTIFACTS; HUMAN NAVIGATION Human-Machine Interfaces Neural Development Given the increased availability of computers in all aspects of everyday life, there are immense opportunities for defin- Neural development is the mechanistic link between the ing language-based interfaces. A prime area for commercial shaping forces of EVOLUTION and the physical and computa- application is in telephone applications for customer ser- tional architecture of the mature brain. A growing knowl- vice, replacing the touch-tone menu-driven interfaces with edge of how the genome is expressed in development in the speech-driven language-based interfaces. Even the simplest progressive specification of cell fates and neural structures applications, such as a ten-word automated operator service is being combined with a better understanding of the func- for long-distance calls, can save companies millions of dol- tional organization of the adult brain to define the questions lars a year. Another important but longer-term application of neural development in a way never before possible. 
Until concerns the computer interface itself, replacing current recently, questions in neural development were problem- interfaces with multimedia language-based interfaces that centered—for example, how do axons locate a target?, how enhance the usability and accessibility of personal comput- is a neuron’s neurotransmitter specified?, how are topo- ers for the general public. Although general systems are a graphic maps made? While most current research remains long way off, it will soon be feasible to define such inter- directed at such empirical problems, advances in our under- faces for limited applications. standing of genetics and evolution have begun to give the Although natural language processing is an area of great questions of neural development a more principled struc- practical importance and commercial application, it is ture. important to remember that its main contribution to cogni- Neural Development 595 The most surprising insight of molecular genetics and sequence; (3) neighboring elements in a map could actively evolution regarding the brain is the extreme conservation of recognize one another so that the map might travel in a fundamental genetic and physical structures across verte- coherent pattern of axons to its target; (4) different parts of brate orders, and even across phyla. Specification of gene the map might have different “road maps” to find the target; expression at the level of the individual neuron is too expen- (5) the elements in the first map might recognize locations sive of the genome; a better solution is to divide areas of the in the target map at varying degrees of specificity; (6) the developing nervous system into domains specified by over- map might develop from trial and error, based on experi- lapping patterns of gene expression that, in combination, ence; or (7) statistical regularities in the activity pattern of confer unique information to each zone. 
Such a solution was the input array could be used to confer order in the target originally found to be operating in the control of head devel- array. In the highly studied, paradigmatic case of map for- opment in the fruit fly Drosophila. Through a mosaic pat- mation in the retinotectal system begun with the work of tern of expression, the HOM-C class of homeotic genes Roger SPERRY over fifty years ago, every one of the logical specify segmentation during Drosophila development possibilities described above has been shown to contribute (Lewis 1978), regulating expression of other genes that to the formation of the adult map (Udin and Fawcett 1988), direct differentiation of structures along the anterior- and unsurprisingly (given the multiplicity of mechanisms), posterior neuraxis. Since this discovery, these genes or their multiple genes are required for its successful development homologues have been found throughout the animal world; (Karlstrom et al. 1996). The cause of such (at least concep- vertebrate homologues of the HOM-C genes, known as Hox tually) uneconomical solutions is unclear; because this is an genes, were found to delineate various aspects of segmenta- evolved system, it could be an accretion of solutions and tion of the vertebrate hindbrain and midbrain (Keynes and exaptations, spandrels on spandrels (Gould 1997). Different Krumlauf 1994). An immediate benefit from this descriptive features of the solution come in at different developmental work is a better understanding of the segmental architecture times, and as a whole this apparent redundancy of mecha- of the forebrain, which had been enigmatic and controver- nism may be responsible for the robust nature of much of sial. The overlapping pattern of Hox and other regulator neural development. 
gene expression allowed the first characterization of a con- Conversely, single mechanisms often appear in the solu- tinuous pattern of segmental architecture from spinal cord to tion to multiple developmental problems. Such a mecha- olfactory bulb in the vertebrate brain, even in the elaborate nism is the Hebbian, activity-dependent stabilization of the forebrain (Puelles and Rubenstein 1993). synapse. This single mechanism serves in diverse areas such This type of conservation of developmental patterning as stabilization of the neuromuscular junction, refinement of has been apparent not only in fundamental segmental divi- topographic maps due to the correlation of firing of neigh- sions but in many other features of development. In mor- boring units in a topographic map, sorting of unlike from phogenesis, for example, the same regulator gene (reviewed like inputs in such notable cases as the formation of ocular in Zuker 1994) is implicated in the proper development of dominance columns in the VISUAL CORTEX, and basic asso- eyes both in Drosophila and mammals, even though their ciative learning both in development and in adulthood (Katz common ancestor could not have had an image-forming eye. and Shatz 1996). A particularly interesting developmental Formation of initial axonal scaffolding and the mechanisms use of this mechanism is the “retinal waves” described in of axon extension are strikingly similar in both vertebrate the ferret where, prior to actual visual experience, the RET- and complex invertebrate brains (Easter, Ross, and Frank- INA appears to generate its own highly self-correlated waves furter 1993; Goodman 1994; Reichert and Boyan 1997). of activation that propagate through the nervous system and Within mammals, and possibly most vertebrates, the can initiate the various axonal sorting processes described sequence and relative timing of events in both early neuro- above (Wong, Meister, and Shatz 1993). 
A challenge for genesis and process extension produce extremely predict- future work is to describe how this mechanism might func- able alterations in morphology as brains enlarge in various tion in the creation of neural networks capable of detecting radiations (Finlay and Darlington 1995). That the structures more complex aspects of information structure than tempo- that underlie cognitive processes in complex animals can be ral association. found, albeit in reduced form, in animals without such The CEREBRAL CORTEX or isocortex has always com- capacities is an important consideration for future theory manded special attention as the structure of largest volume building in COMPUTATIONAL NEUROSCIENCE. in the human brain, and the one most closely associated with The actual solutions found in neural development to complex cognitive skills. In this case, the principal develop- clearly stated, logical problems seem bound to defy stan- mental questions have been motivated by the adult func- dard hypothesis testing. For example, a central and conspic- tional architecture of the isocortex, a structure with a rather uous feature of brain organization is the topographic uniform architecture which nevertheless carries out a num- representation of sensory surfaces and the preservation of ber of distinct and diverse functions. Is the isocortex a struc- topographic order as one brain structure maps to the next, ture that performs some sort of standard transformation of its though the information-bearing dimensions being mapped input, with differences in cortical areas (e.g., visual, motor, are sometimes unclear. How are such maps formed? A num- secondary sensory) arising epigenetically from interaction ber of mechanisms could independently produce an accept- with the input, or are early regions in some way optimized able solution: (1) the spatial relationship of elements in for their future roles in adult isocortex? 
connecting maps could be passively apposed; (2) temporal Both positions capture aspects of the true state of affairs, gradients could map one element to another in an organized as the following list of features of cortical development will 596 Neural Development illustrate. The neurons of the isocortex arise from a sheet of Gould, S. J. (1997). The exaptive excellence of spandrels as a term and prototype. Proc. Natl. Acad. Sci. U.S.A. 94: 10750–10755. cells in the ventricular zone that is not fully uniform in its Karlstrom, R. O., T. Trowe, S. Klostermann, H. Baier, M. Brand, neurochemical identity or rate of neurogenesis (reviewed in A. D. Crawford, B. Grunewald, P. Haffter, H. Hoffmann, S. U. Levitt, Barbe, and Eagleson 1997). Neuroblasts migrate out Meyer, B. K. Muller, S. Richter, F. J. M. van Eeden, C. Nusslei- on radial glial cells to the cortical plate in a well-described Volhard, and F. Bonhoeffer. (1996). Zebrafish mutations affect- “inside-out” settling pattern, thus potentially retaining posi- ing retinotectal axon pathfinding. Development 123: 427–438. tional information of the ventricular zone, though consider- Katz, L. C., and C. J. Shatz. (1996). Synaptic activity and the con- able dispersion also occurs (Rakic 1995). The cortical plate struction of cortical circuits. Science 274: 1133–1138. shows some neurochemical nonuniformity before it is inner- Keynes, R., and R. Krumlauf. (1994). Hox genes and the regional- vated by any outside structure (Cohen-Tannoudji, Babinet, ization of the nervous system. Annual Review of Neuroscience and Wassef 1994). The innervation of cortex by the THALA- 17: 109–132. Levitt, P., M. F. Barbe, and K. L. Eagleson. (1997). Patterning and MUS is extremely specific and mosaic, while by contrast, the specification of the cerebral cortex. Annual Review of Neuro- large majority of intracortical connectivity can be accounted science 20: 1–24. for by the generic rule, “connect to your two nearest neigh- Lewis, E. B. (1978). 
A gene complex controlling segmentation in bors” (Scannell, Blakemore, and Young 1995). The specific Drosophila. Nature 276: 565–570. outputs of discrete cortical areas emerge from a generic set O’Leary, D. D. M., B. L. Schlaggar, and B. B. Stanfield. (1992). of intracortical and subcortical connections in mid- to late The specification of sensory cortex: Lessons from cortical development, dependent on activity. There are several nota- transplantation. Experimental Neurology 115: 121–126. ble examples of plasticity: the isocortex can represent and Puelles, L., and J. L. R. Rubenstein. (1993). Expression patterns of transform visual input artificially induced to project to the homeobox and other regulatory genes in the embryonic mouse auditory thalamus (Roe et al. 1990), and also, early trans- forebrain suggest a neuromeric organization. Trends in Neuro- sciences 16: 472–479. plants of one cortical area to another can accept innervation Rakic, P. (1995). Radial versus tangential migration of neuronal and make connections characteristic of the new area clones in the developing cerebral cortex. Proc. Nat. Acad. Sci. (O’Leary, Schlaggar, and Stanfield 1992). However, neither U. S. A. 92: 11323–11327. the new innervation nor the new connectivity is identical to Reichert, H., and G. Boyan. (1997). Building a brain: Develop- the unaltered state. mental insights in insects. Trends in Neuroscience 20: 258– Thus, the isocortex does not fall clearly into either the 264. equipotential or modular description, but rather shows evi- Roe, A. W., S. L. Pallas, J.-O. Hahm, and M. Sur. (1990). A map of dence of some early channeling of the epigenetic land- visual space induced in primary auditory cortex. Science 250: scape in the context of a great deal of equipotentiality. 818–820. When the isocortex first encounters the information of the Scannell, J. W., C. Blakemore, and M. P. Young. (1995). Analysis of connectivity in the cat cerebral cortex. 
Journal of Neuro- external world, the separation of modalities and the topo- science 15: 1463–1483. graphic mapping of surfaces available already in the thala- Udin, S. B., and J. W. Fawcett. (1988). The formation of topo- mus is preserved and available. Overall, the evolutionarily graphic maps. Annual Review of Neuroscience 11: 289–328. conservative primary sensory cortices show the most evi- Wong, R. O. L., M. Meister, and C. J. Shatz. (1993). Transient dence of early morphological and connectional specializa- period of correlated bursting activity during development of the tion, while those areas of frontal and parietal cortex that mammalian retina. Neuron 11: 923–938. proliferate in the largest brains show the least. The specifi- Zuker, C. S. (1994). On the evolution of eyes: Would you like it cation of intracortical connectivity, both short- and long- simple or compound? Science 265: 742–743. range, occurs concurrent with early experience, and it seems likely that it is in this circuitry that the isocortex Further Readings will represent the predictability and variability of the out- Fraser, S. E., and D. H. Perkel. (1990). Competitive and positional side world. cues in the patterning of nerve connections. Journal of Neurobi- See also AUDITORY PLASTICITY; COGNITIVE DEVELOP- ology 21: 51–72. MENT; NATIVISM; NEURAL PLASTICITY Gould, S. J., and R. C. Lewontin. (1979). Spandrels of San Marco —Barbara Finlay and John K. Niederer and the Panglossian paradigm: A critique of the adaptationist programme. Proceedings of the Royal Society of London. Series B: Biological Sciences 205: 581–598. References Miller, K. D., and M. P. Stryker. (1990). The development of ocular Cohen-Tannoudji, M., C. Babinet, and M. Wassef. (1994). Early dominance columns: Mechanisms and models. In S. J. Hanson determination of a mouse somatosensory cortex marker. Nature and C. R. Olson, Eds., Connectionist Modeling and Brain 368: 460–463. Functions, The Developing Interface. 
Cambridge, MA: MIT Press, pp. 255–350.
Easter, S. S., L. S. Ross, and A. Frankfurter. (1993). Initial tract formation in the mouse brain. Journal of Neuroscience 13: 285–299.
Finlay, B. L., and R. B. Darlington. (1995). Linked regularities in the development and evolution of mammalian brains. Science 268: 1578–1584.
Goodman, C. (1994). The likeness of being: Phylogenetically conserved molecular mechanisms of growth cone guidance. Cell 78: 353–373.
O’Leary, D. D. M. (1989). Do cortical areas emerge from a protocortex? Trends in Neurosciences 12: 400–406.
Oppenheim, R. W. (1991). Cell death during development of the nervous system. Annual Review of Neuroscience 14: 453–502.
Rakic, P. (1990). Critical cellular events in cortical evolution: Radial unit hypothesis. In B. L. Finlay, G. Innocenti, and H. Scheich, Eds., The Neocortex: Ontogeny and Phylogeny. New York: Plenum Press, pp. 21–32.

Neural Networks

The study of neural networks is the study of information processing in networks of elementary numerical processors. In some cases these networks are endowed with a certain degree of biological realism and the goal is to build models that account for neurobiological data. In other cases abstract networks are studied and the goal is to develop a computational theory of highly parallel, distributed information-processing systems. In both cases the emphasis is on accounting for intelligence via the statistical and dynamic regularities of highly interconnected, large-scale networks.

Historically, neural networks arose as a number of loosely connected strands, many of which were subsequently absorbed into mainstream engineering disciplines. Some of the earliest research on neural networks (in the 1940s and 1950s) involved the study of interconnected systems of binary switches, or MCCULLOCH-PITTS neurons. This research also contributed to the development of AUTOMATA theory and dynamic systems theory. Links between these fields and neural networks continue to the present day.

Other early research efforts emphasized adaptive systems. In the 1950s and 1960s, Widrow and others studied adaptive linear systems and in particular the LMS algorithm (cf. SUPERVISED LEARNING IN MULTILAYER NEURAL NETWORKS). This work led to the field of adaptive signal processing and provided the basis for later extensions to nonlinear neural networks. Adaptive classifiers (systems with a discrete output variable) were also studied during the same period in the form of the “perceptron” algorithm and related schemes; these developments contributed to the development of the engineering field of PATTERN RECOGNITION, which continues to house much neural network research. Finally, efforts in the area of REINFORCEMENT LEARNING formed a strand of neural network research with strong ties to CONTROL THEORY. In the 1980s, these ties were further solidified by research establishing a link between reinforcement learning and optimal control theory, in particular the optimization technique of DYNAMIC PROGRAMMING.

Neural networks received much attention during the 1970s and 1980s, partly as a reaction against the prevailing symbolic approach to the study of intelligence in artificial intelligence (AI). Emphasizing architectures that largely dispense with centralized sequential processing and strict separation between process and data, researchers studied distributed processing in highly parallel architectures (cf. COGNITIVE MODELING, CONNECTIONIST). Intelligence was viewed in terms of mechanisms of CONSTRAINT SATISFACTION and pattern recognition rather than explicit symbol manipulation. A number of technical developments sustained research during this period, two of which stand out.

First, the dynamics of symmetrical networks (networks in which a connection from node A to node B of a given strength implies a connection from B to A of the same strength) were elucidated by the discovery of energy functions (see RECURRENT NETWORKS). This allowed network dynamics to be understood in terms of a (generally finite) set of attractors, points in the state space toward which trajectories tend as the nodes in the network are updated. This gave a satisfying formal interpretation of constraint satisfaction in neural networks (as the minimization of an energy function) and provided an interesting implementation of an associative memory: the attractors are the memories.

The second important technical development was the discovery of a class of learning algorithms for general networks. The focus on learning algorithms can either be viewed as a natural outgrowth of the earlier research on adaptive algorithms for simple one-layer networks (e.g., the LMS algorithm and the perceptron), or as a necessity born of the fact that general networks are difficult to analyze and accordingly difficult to program. In any case, the algorithms have greatly extended the range of the networks that can be utilized in models and in practical applications, so much so that in AI and engineering the topic of neural networks has become essentially synonymous with the study of numerical learning algorithms.

The earliest successes were obtained with SUPERVISED LEARNING algorithms. These algorithms require an error signal at each of the output nodes of the network. The paradigm case is that of the layered feedforward network, a network with no feedback connections between layers and no lateral connections within a layer. Input patterns are presented at the first layer, and each subsequent layer is updated in turn, resulting in an output at the final layer. This output is compared to a desired output pattern, yielding an error signal. Algorithms differ in how they utilize this error signal, but in one way or another the error signal is propagated backward into the network to compute updates to the weights and thereby decrease the error.

A wide variety of theoretical results are available concerning neural network computation. Layered neural networks have been shown to be universal, in the sense of being able to represent essentially any function. Best approximation results are available for large classes of feedforward networks. Recurrent neural networks have been shown to be TURING-equivalent and have also been shown to be able to represent a wide class of nonlinear dynamic systems. A variety of results are also available for supervised learning in neural networks. In particular, the Vapnik-Chervonenkis (VC) dimension (a measure of the sample complexity of a learning system; see COMPUTATIONAL LEARNING THEORY and STATISTICAL LEARNING THEORY) has been computed for simple networks, and bounds on the VC dimension are available for more complex networks. In classification problems, network learning algorithms have been shown to converge to the posterior probabilities of the classes. Methods from statistical physics have been utilized to characterize learning curves. Finally, Bayesian statistical methods (see BAYESIAN LEARNING) have been exploited both for the analysis of supervised learning and for the design of new algorithms.

Recent years have seen an increase in interest in UNSUPERVISED LEARNING and a concomitant growth in interest in fully probabilistic approaches to neural network design. The unsupervised learning framework is in many ways more powerful and more general than supervised learning, requiring no error signal and no explicit designation of nodes as input nodes or output nodes. One general way to approach the problem involves specifying a generative model, an explicit model of the way in which the environment is assumed to generate data. In the neural network setting, such models are generally realized in the form of a network. The learner’s uncertainty about the environment is formalized by annotating the network with probabilities. The learning problem in this setting becomes the classic statistical problem of finding the best model to fit the data. The learner may either explicitly manipulate an instantiation of the generative model, or may utilize a network that is obtained by inverting the generative model (e.g., via an application of Bayes’s rule). The latter network is often referred to as a discriminative network.

Probabilistic network models are studied in other areas of AI. In particular, BAYESIAN NETWORKS provide a general formalism for designing probabilistic networks. It is interesting to note that essentially all of the unsupervised learning architectures that have been studied in the neural network literature can be obtained by specifying a generative model in the form of a Bayesian network.

This rapprochement between neural networks and Bayesian networks has a number of important consequences that are of current research interest. First, the Bayesian network formalism makes it natural to specify and manipulate prior knowledge, an ability that eluded earlier, nonprobabilistic neural networks. By associating a generative model with a neural network, prior knowledge can be more readily incorporated and posterior knowledge more readily extracted from the network. Second, the relationship between generative models and discriminative models can be exploited, yielding architectures that utilize feedback connections and lateral connectivity. Third, the strengths of the neural network focus on LEARNING (particularly discriminative learning) and the Bayesian network focus on inference can be combined. Indeed, learning and inference can be fruitfully viewed as two sides of the same coin. Finally, the emphasis on approximation techniques and laws of large numbers that is present in the neural network literature can be transferred to the Bayesian network setting, yielding a variety of methods for approximate inference in complex Bayesian networks.

See also COGNITIVE ARCHITECTURE; COMPUTATION AND THE BRAIN; COMPUTATIONAL NEUROSCIENCE; CONNECTIONISM, PHILOSOPHICAL ISSUES; DISTRIBUTED VS. LOCAL REPRESENTATION; MODELING NEUROPSYCHOLOGICAL DEFICITS

Michael I. Jordan

Further Readings

Bishop, C. M. (1995). Neural Networks for Pattern Recognition. New York: Oxford University Press.
Duda, R. O., and P. E. Hart. (1973). Pattern Classification and Scene Analysis. New York: Wiley.
Haykin, S. (1994). Neural Networks: A Comprehensive Foundation. New York: Macmillan College Publishing.
Hertz, J., A. Krogh, and R. G. Palmer. (1991). Introduction to the Theory of Neural Computation. Redwood City, CA: Addison-Wesley.
Jensen, F. (1996). An Introduction to Bayesian Networks. London: UCL Press.
Jordan, M. I., Ed. (1998). Learning in Graphical Models. Cambridge, MA: MIT Press.
Ripley, B. D. (1996). Pattern Recognition and Neural Networks. Cambridge: Cambridge University Press.
Vapnik, V. (1995). The Nature of Statistical Learning Theory. New York: Springer.
Whittaker, J. (1990). Graphical Models in Applied Multivariate Statistics. New York: Wiley.

Neural Plasticity

The functional properties of neurons and the functional architecture of the CEREBRAL CORTEX are dynamic, constantly under modification by experience, expectation, and behavioral context. Associated with functional plasticity is a process of modification of circuits, either by altering the strength of a given synaptic input or by axonal sprouting and synaptogenesis. Plasticity has been seen under a number of conditions, including functional recovery following lesions of the sensory periphery or central structures, perceptual learning and learning of object associations, spatial learning, visual-motor adaptation, and context-dependent changes in receptive field properties. This discussion will compare plasticity observed early in development with that seen in adulthood, and then discuss the role of plasticity in recovery of function after lesions, in learning in sensory systems, and in visual-spatial integration.

Much of the original work on neural plasticity in the central nervous system was done in the context of experience-dependent plasticity in the period of postnatal development during which cortical connections, functional architecture, and receptive field properties continue to be refined. Hubel and Wiesel (1977) showed that in the visual system, the balance of input from the two eyes, known as ocular dominance, can be influenced by keeping one eye closed, which shifts the balance toward the open eye, or by induced strabismus (where the two eyes are aimed at different points in the visual field), which blocks the development of binocular cells. The substrate of these changes is an alteration in the extent of thalamocortical axonal arbors, which, immediately after birth, are undergoing a process of collateral sprouting and pruning. The plasticity of these arbors, of ocular dominance columns, and of the ocular dominance of receptive fields is under experience-dependent regulation for a limited period early in the life of animals that is known as the critical period. The length of the critical period is species-dependent, and can extend for the first few months (cats) or years (humans and nonhuman primates) of life. After the end of the critical period the properties involved become fixed for the rest of the life of the animal.

The model of ocular dominance plasticity has been a prime example of the role of activity and of experience in the formation of the functional properties of neurons and of the refinement of connectivity between neurons. Models of cortical development have shown how spontaneous activity in utero can lead to the formation of cortical functional architecture and receptive field properties in the absence of visual experience, and prenatal patterned activity in the RETINA and cortex has been discovered. It has been shown that the effects of monocular deprivation can be prevented by blockade of retinal activity. Some of the molecular intermediaries implicated in the competition between different populations of cortical afferents and of activity-dependent plasticity include neurotrophins and their receptors and glutamate and N-methyl-D-aspartate (NMDA) receptors. The fundamental rule underlying this plasticity, originating from work in the HIPPOCAMPUS and the ideas of HEBB, is that neurons that fire together wire together, and that LONG-TERM POTENTIATION is involved in the consolidation of connections.

Given this background it had been widely assumed that all cortical areas, at least those involved in early sensory processing, would have fixed properties and connections. Of course, some measure of plasticity would have to accompany the ability to acquire and store new percepts throughout life, but this had been thought to be a special property of higher-order cortical areas, particularly those in the temporal lobe, associated with object memory. A radical change has occurred in this view with the growing body of evidence that experience-dependent plasticity is a universal property of cortex, even primary sensory areas.

Each area of sensory cortex, particularly those at early stages in the sensory pathway, has a representation of the sensory surface on the cortical surface. Somatosensory cortex contains a representation of the body map (somatotopy), auditory cortex of the cochlea (tonotopy), and visual cortex of the retina (visuotopy). The integrity of these maps depends on ongoing stimulation of the periphery. Removal of input from any part of the sensory surface, such as by digit amputation or retinal lesion, leads to a reorganization of the cortical maps. Some of the initial evidence for cortical plasticity in the adult came from changes in somatotopic maps following digit amputation (see Merzenich and Sameshima 1993). Amputation of a body part or transection of a sensory nerve causes the area of cortex initially representing that part to be remapped toward a representation of the adjacent body parts. Retinal lesions lead to a shrinkage of the cortical representation of the lesioned part of retina and an expansion in the representation of the part of retina surrounding the lesion.

The mechanism of the reorganization varies with the sensory pathways involved. Generally, the site of reorganization depends on the existence of exuberant connections linking cells representing widely separated parts of the map. Thus, in the somatosensory system, a measure of lesion-induced plasticity can be observed in the spinal cord, although it is likely that a considerable degree of plasticity is based in the somatosensory cortex. In the plasticity observed in the visual system, most of the changes are intrinsic to the visual cortex, and are likely to involve the long-range horizontal connections formed by cortical pyramidal cells. The unmasking of these connections seen with long-term topographic reorganization involves a sprouting of axonal collaterals and synaptogenesis.

There is a wide range of time scales over which the plasticity of topography takes place. The changes occurring over the largest spatial scales in cortex (topographic shifts of up to a centimeter) require several months or years. Smaller but significant changes can be seen within minutes following a lesion, and this is likely to involve changes in the strength of existing connections. Exuberant connections can be unmasked by a potentiation of excitatory connections or by a suppression of inhibitory connections.

The perceptual consequences of lesion-induced plasticity can include, depending on the site of the lesion, a recovery of function or perceptual distortions. PHANTOM LIMB sensation following limb amputation has been linked by Ramachandran (1993) to experimentally induced somatosensory cortical plasticity. For arm amputations, there is often a sensation of stimulation of the absent hand when stroking the limb stump or the cheek. Human patients suffering a loss of central retinal input (by, e.g., age-related macular degeneration) often adopt a preferred retinal locus in the intact retina for targeting visually guided eye movements. Lesions in area MT, an area that plays a role in the perception of movement and the tracking of moving objects by the eyes, initially lead to a loss of smooth pursuit eye movements, but within a few days this function recovers. It is well known that following stroke there are varying degrees of functional recovery. Though this recovery had been thought to involve a return to health of metabolically compromised but not destroyed tissue, it may well be that intact areas of cortex are taking over the function of adjacent cortical regions that have been destroyed.

The mechanisms available to the cortex for functional recovery following lesions are likely to be used for normal sensory processing. It is likely that perceptual learning involves analogous changes in cortical topography. LEARNING in general has been divided into categories including declarative or explicit learning (including the learning of events, facts, and objects) and nondeclarative or implicit learning (including procedural, classical CONDITIONING, priming, and perceptual learning). These different forms of learning may be distinguished less on the basis of the underlying synaptic mechanisms than on the brain region in which the memory is stored. While one ordinarily associates sensory learning with the acquisition and storage of complex percepts and with the temporal lobe, it has been known for well over a hundred years that it is possible to improve one’s ability to discriminate simple sensory attributes. Some characteristics of perceptual learning are suggestive of the involvement of early stages in sensory processing.

Perceptual learning has been shown to apply to a wide range of sensory tasks, including visual acuity, hue discrimination, velocity estimation, acoustic pitch discrimination, and two-point somatosensory acuity. This is a form of implicit learning, generally not reaching conscious awareness or requiring error feedback, but associated with repetitively performing discrimination tasks.

The evidence supporting the idea that the neural substrate for perceptual learning is found in primary sensory cortex comes from the specificity of the learning and from physiological studies demonstrating cortical changes in animals trained on simple discrimination tasks. Improvement in a visual discrimination task at one position in the visual field, for example, does not transfer to other locations. Since the highest spatial resolution is seen in primary visual cortex, where the receptive fields are the smallest and the topography most highly ordered, one might expect to find the basis for specificity there. The learning also shows no transfer to orthogonal orientations, again indicative of early cortical stages where selectivity for stimulus orientation is sharpest. On the other hand, the learning is also specific for stimulus configuration or the context within which a feature is embedded during the training period, which may point toward a feedback influence from higher-order cortical areas providing information about more complex features.

The physiological studies show changes in cortical magnification factor, or cortical recruitment, associated with training. Training a monkey to do a texture discrimination task with a particular digit will increase the area of primary somatosensory cortex representing that digit. Similarly, it has been suggested that training on an auditory frequency discrimination task increases the representation of that frequency in primary auditory cortex. Not only these forms of implicit learning but associative learning as well causes changes in the receptive fields of cells in primary sensory cortex. When a tone is associated with an aversive stimulus, cells in auditory cortex tend to shift their critical frequency toward that of the tone. The reward component of the training may come from basal forebrain structures, involving the diffuse ascending cholinergic input to cortex.

The storage of more complex information has been identified in the inferior temporal cortex. There, animals trained to recognize complex forms, particularly in association with other forms, have cells that show elevated activity when a given form is presented. The acquisition of the trained information may depend on input from more medial structures, such as the perirhinal cortex.

A central focus for studies of neural plasticity is the hippocampus. As a consequence of neuropsychological findings showing that persons with medial temporal lesions suffer an inability to acquire and store recent memories, the hippocampus has been an active area of study for neural mechanisms of MEMORY. At the systems level, the hippocampus was shown by O’Keefe (1976) to play a role in spatial learning, with cells being tuned for an animal’s position in its external environment, known as a place field. At the synaptic level, the hippocampus has become the prime model for changes in synaptic weight, through the phenomenon of long-term potentiation originally described by Bliss and Lomo (1973), and long-term depression. While it has been presumed that these forms of synaptic plasticity account for the storage of complex information in the hippocampus, the linkage has not yet been established. It is clear, however, that cells in the hippocampus are capable of rapidly changing their place fields as the external environment is altered, and this alteration is associated with changes in effective connectivity between hippocampal neurons.

The functional properties of cells in cerebral cortex and in brain stem have been shown to be modifiable over much shorter time scales, potentially involving neural plasticity in the ongoing processing of sensory information and in sensorimotor integration.

One of the earliest and most active areas of investigation of adult plasticity is the vestibulo-ocular reflex, the compensatory movement of the eyes associated with rotation of the head or body to keep the visual field stabilized on the retina. Melville-Jones and Gonshor (1975) found that if prisms are put on the eyes to reduce the amount of retinal slip associated with a given amount of head rotation, the gain of the reflex is reduced, and eventually settles to a level where once again the world is stabilized on the eyes. Various brain structures have been suggested to be involved in this phenomenon, including the CEREBELLUM and pontine nuclei. Another revealing model of sensorimotor adaptation has been studied in the owl by Knudsen and Brainard (1995). In the tectum of the owl (analogous to the mammalian superior colliculus), there are superimposed maps of visual and auditory space. When prisms are placed over the owl’s eyes, shifting the visual map, there is a compensatory shift of the auditory map, so that once again there is a registration between the two maps for a given elevation and azimuth in the external world. This enables the owl to make accurate targeting movements for catching prey, as detected by both visual and sound cues.

Within sensory cortex, rapid changes in receptive field properties have been associated with perceptual fill-in. When an occluder, or artificial scotoma, is placed within a background of a uniform color or a textured pattern, and when this stimulus is stabilized on the retina, the occluder disappears over a few seconds, becoming filled with the surrounding pattern. It is supposed that this phenomenon may be a manifestation of the process of linkage and segmentation of the visual scene into contours and surfaces belonging to particular objects. At the cellular level, at several stages in the visual pathway, cells tend to change their response properties when their receptive fields are placed within the artificial scotoma. Assuming that each cell represents a line label signaling, when active, the presence of a feature centered within its receptive field, the response of a cell whose receptive field is centered within the artificial scotoma is interpreted by the visual system as a shift of stimulus features toward its center.

A growing body of evidence reveals a remarkable degree of mutability of function in primary sensory cortex that is not limited to the first months of life but extends throughout adulthood. It is becoming increasingly clear that rather than performing a fixed and stereotyped calculation on their input, cells in all cortical areas represent active filters. Various components of cortical circuitry have been implicated as the likely source of the changes, including intrinsic horizontal connections and feedback connections from higher-order cortical areas. Neural plasticity serves a wide variety of functional roles, and the extent to which it plays a role in the ongoing processing of sensory information depends on the rapidity with which cells can alter their response properties. To date, it has been shown that substantial changes can be induced within seconds of exposure to novel stimuli. It remains to be seen whether even shorter-term modifications of cortical circuits and receptive field properties may underlie the recognition of objects as the eyes move from saccade to saccade.

See also NEURAL DEVELOPMENT; PERCEPTUAL DEVELOPMENT; VISION AND LEARNING; VISUAL ANATOMY AND PHYSIOLOGY

Charles Gilbert

References

Bliss, T. V. P., and T. Lomo. (1973). Long-lasting potentiation of synaptic transmission in the dentate area of the anesthetized rabbit following stimulation of the perforant path. Journal of Physiology 232: 331–356.
Gibson, E. J. (1953). Psychol. Bull. 50: 401–431.
Gilbert, C. D. (1994). Early perceptual learning. Proc. Natl. Acad. Sci. U.S.A. 91: 1195–1197.
Gilbert, C. D., A. Das, M. Ito, M. Kapadia, and G. Westheimer. (1996). Spatial integration and cortical dynamics. Proc. Natl. Acad. Sci. U.S.A. 93: 615–622.
Hubel, D. H., and T. N. Wiesel. (1977). Functional architecture of macaque monkey visual cortex. Proc. R. Soc. Lond B. Biol. Sci. 198: 1–59.
Knudsen, E. I., and M. S. Brainard. (1995). Creating a unified representation of visual and auditory space in the brain. Annu. Rev. Neurosci. 18: 19–43.
Linsker, R. (1986). From basic network principles to neural architecture, III: Emergence of orientation columns. Proc. Natl. Acad. Sci. U.S.A. 83: 8779–8783.
Melville-Jones, G., and A. Gonshor. (1975). Goal-directed flexibility in the vestibulo-ocular reflex arc. In G. Lennerstrand and P. Bach-y-Rita, Eds., Basic Mechanisms of Ocular Motility and Their Clinical Implications. New York: Pergamon Press.
Merzenich, M. M., and K. Sameshima. (1993). Cortical plasticity and memory. Current Opinion in Neurobiology 3: 187–196.
O’Keefe, J. (1976). Place units in the hippocampus of the freely moving rat. Expt. Neurol. 51: 78–109.
Ramachandran, V. S. (1993). Filling in gaps in perception: Part 2. Scotomas and phantom limbs. Curr. Dir. Psychol. Sci. 2: 56–65.
Squire, L. (1994). Declarative and nondeclarative memory: Multiple brain systems supporting learning and memory. In D. L. Schacter and E. Tulving, Eds., Memory Systems. Cambridge, MA: MIT Press, pp. 203–231.
Weinberger, N. M. (1995). Dynamic regulation of receptive fields and maps in the adult sensory cortex. Annu. Rev. Neurosci. 18: 129–158.

Neural Synchrony

See BINDING BY NEURAL SYNCHRONY

Neuroendocrinology

Neuroendocrinology studies the relationships between the endocrine system and the brain. The endocrine system produces a variety of hormones, which are chemical messengers that signal changes that the body needs to make to adapt to new situations. The brain controls the endocrine system through the hypothalamus and pituitary gland, and the secretions of the gonads, adrenals, and thyroid act on tissues throughout the body, and on the brain and pituitary, to produce a wide variety of effects. Some hormone effects occur during development and are generally long lasting and even permanent for the life of the individual. Other hormone actions take place in the mature nervous system and are usually reversible. Still other hormone actions in adult life are related to permanent changes in brain function associated with disease processes or with aging.

Nerve cells in the hypothalamus produce hormones, called releasing factors, which are released into a portal blood supply and travel to the anterior pituitary gland where they trigger the release of trophic hormones such as adrenocorticotropic hormone (ACTH), thyroid-stimulating hormone (TSH), luteinizing hormone (LH), follicle-stimulating hormone (FSH), prolactin, and growth hormone. These hormones, in turn, regulate endocrine responses: for example, ACTH stimulates glucocorticoid secretion by the adrenal cortex; TSH, thyroid hormone secretion; and LH, sex hormone production. Other hypothalamic neurons produce the hormones vasopressin and oxytocin and release these hormones at nerve terminals located in the posterior lobe of the pituitary gland. Brain activity stimulates the secretion of these hormones; for example, oxytocin and prolactin release are stimulated by suckling, and the sight and sound of an infant can stimulate “milk letdown” in the mother; ACTH is driven by stressful experiences and by an internal clock in the brain that is entrained by the light-dark cycle; and LH and FSH secretion are influenced by season of the year in some animals.

Thyroid hormone and sex hormones act early in life to regulate development and differentiation of the brain, whereas the activity of the stress hormone axis is programmed by early experiences via mechanisms which may depend to some degree on the actions of glucocorticoid hormones.

For thyroid hormone, both excesses and deficiencies of thyroid hormone secretion are associated with altered brain development; extremes in thyroid hormone secretion lead to major deficiencies in cognitive function (e.g., cretinism), whereas smaller deviations in thyroid hormone secretion are linked to more subtle individual variations in brain function and cognitive activity.

For sex hormones, the story is more complicated in that testosterone secretion during midgestation in the human male and then again during the first two years of postnatal life alters brain development and affects cognitive function as well as reproductive function. There are comparable periods of testosterone production in early development in other mammals. Absence of testosterone in females leads to the female behavioral and body phenotype; and absence of androgen receptors or the lack of normal androgen secretion in genetic males leads to a feminine phenotype, whereas exposure of genetic females to androgens early in development produces a masculine phenotype. Sexual differentiation of the brain has been investigated in animals, and there are subtle sex differences in a variety of brain structures, ranging from the hypothalamus (which governs reproduction) to the HIPPOCAMPUS and CEREBRAL CORTEX (which subserve cognitive function). There are also indications for structural and functional sex differences in the human brain that are similar to those found in lower animals. For example, in both animals and humans, sex differences are found in the strategies used for spatial learning and memory, with males using the more global spatial cues and females relying upon local contextual cues.

For stress hormones, early experience has a profound role in shaping the reactivity of the stress hormone axis and the secretion not only of ACTH and glucocorticoids but also the activity of the autonomic nervous system. Prenatal stress and certain types of aversive experience in infant rodents (e.g., several hours of separation from the mother) increase reactivity of the stress hormone axis for the lifetime of the individual. In contrast, handling of newborn rat pups (a much briefer form of separation of the pup from the mother) produces a lifelong reduction in activity of the stress hormone axis. Actions of glucocorticoid and thyroid hormones play a role in these effects. There is growing evidence that, for rodents, elevated stress hormone activity over a lifetime increases the rate of brain aging, whereas a lifetime of reduced stress hormone activity reduces the rate of brain aging (see below).

Whereas the developmental actions of hormones on the brain are confined to windows of early development during fetal and neonatal life and the peripubertal period, these same hormones produce reversible effects on brain structure and function throughout the life of the mature nervous system. Sex hormones activate reproductive behaviors, including defense of territory, courtship, and mating, and they regulate neuroendocrine function to ensure successful reproduction; however, reflecting sexual differentiation of the brain and secondary sex characteristics of the body, the activational effects of sex hormones in adult life are often gender-specific.

Thyroid hormone actions maintain normal neuronal excitability and promote a normal range of nerve cell structure and function; excesses or insufficiencies of thyroid hormone have adverse effects on brain function and cognition, which are largely reversible. Among these effects are exacerbation of depressive illness.

There are two types of adrenal steroids, mineralocorticoids and glucocorticoids, which regulate salt intake and food intake, respectively, and also modulate metabolic and cognitive function during the diurnal cycle of activity and rest. Adrenal steroids act to maintain homeostasis and glucocorticoids do so in part by opposing, or containing, the actions of other neural systems that are activated by stress and also by promoting adaptation of the brain to repeatedly stressful experiences. Containment effects of glucocorticoids oppose stress-induced activity of the noradrenergic arousal system and the hypothalamic system that releases ACTH from the pituitary. Adaptational effects of stress hormones during prolonged or repeated stress increase or decrease neurochemicals related to brain excitability and neurotransmission and produce changes in neuronal structure. Adrenal steroids biphasically modulate LONG-TERM POTENTIATION (LTP) in the hippocampus, with high levels of stress hormones also promoting long-term depression (LTD). LTP and LTD may be involved in learning and memory mechanisms.

Primary targets of stress hormones are the hippocampal formation and also the AMYGDALA. Repeated stress causes atrophy of hippocampal pyramidal neurons and inhibits the replacement of neurons of the dentate gyrus by [...] cognitive function, enhancing episodic and declarative memory at low to moderate levels but inhibiting these same functions at high levels or after acute stress. Along with adrenal steroids, the sympathetic nervous system participates in creating the powerful memories associated with traumatic events, in which the amygdala plays an important role.

erogeneously distributed in the brain, with each hormone having a unique regional pattern of localization across brain regions. The hypothalamus and amygdala have receptors for sex hormones, with both sexes expressing receptors for androgens, estrogens, and progestins, although, because of sexual differentiation, there are somewhat different amounts of these receptors expressed in male and female brains. The hippocampus and amygdala have receptors for adrenal steroids, whereas thyroid hormone receptors are widely distributed throughout the nervous system, particularly in the forebrain and CEREBELLUM.

Effects mediated by intracellular receptors are generally slow in onset over minutes to hours and long-lasting because alterations in gene expression produce effects on cells that can last for hours and days, or longer. Steroid hormones also produce rapid effects on the membranes of many brain cells via cell surface receptors that are like the receptors for neurotransmitters. These actions are rapid in onset and short in duration. However, the precise nature of the receptors for these rapid effects is in most cases largely unknown.

Hormones participate in many disease processes, in some cases as protectors and in other cases as promoters of abnormal function. Adrenal steroids exacerbate neural damage from strokes and seizures and mediate damaging effects of severe and prolonged stress. Estrogens enhance normal declarative and episodic memory in estrogen-deficient women, and estrogen replacement therapy appears to reduce the risk of Alzheimer’s disease in postmenopausal women. Estrogens also have antidepressant effects; they modulate pain mechanisms; and they regulate the neural pathways involved in movement, with the result that estrogens enhance performance of fine motor skills and enhance reaction times in a driving simulation test in women. Androgen effects are less well studied in these regards.

Age-related decline of gonadal function reduces the beneficial and protective actions of these hormones on brain function. At the same time, age-related increases in adrenal steroid activity promote age-related changes in brain cells that can culminate in neuronal damage or cell death. Lifelong patterns of adrenocortical function, determined by early experience (see above), contribute to rates of brain aging, at least in experimental animals.

Hormones are mediators of change, acting in large part by modulating expression of the genetic code, and they provide an interface between experiences of the individual and the structure and function of the brain, as well as other organs of the body. Hormone action during development and in adult life participates in the processes that determine individual differences in physiology, behavior, and cognitive function.

See also AGING AND COGNITION; AGING, MEMORY, AND THE BRAIN; MOTOR CONTROL; NEUROTRANSMITTERS; STRESS

Bruce S. McEwen
Glucocorticoid hormones act in both the amygdala and hippocampus to Further Readings promote consolidation. Steroid hormones and thyroid hormone act on cells Adkins-Regan, E. (1981). Early organizational effects of hor- throughout the body via intracellular receptors that regulate mones. In N. T. Adler, Ed., Neuroendocrinology of Reproduc- gene expression. Such intracellular receptors are found het- tion. New York: Plenum Press, pp. 159–228. Neuron 603 Cahill, L., B. Prins, M. Weber, and J. L. McGaugh. (1994). Beta- but not central amygdala modulates memory storage. Neurobi- adrenergic activation and memory for emotional events. Nature ology of Learning and Memory 67:176–179. 347: 702–704. Sapolsky, R. (1992). Stress, the Aging Brain and the Mechanisms Conn, P. M., and S. Melmud. (1997). Endocrinology: Basic and of Neuron Death. Cambridge, MA: MIT Press. Clinical Principles. Totowa, NJ: Humana Press. Sapolsky, R. M., L. C. Krey, and B. S. McEwen. (1986). The neu- Kimura, D. (1992). Sex differences in the brain. Sci. Amer. 267: roendocrinology of stress and aging: The glucocorticoid cas- 119–125. cade hypothesis. Endocr. Rev. 7: 284–301. Kimura, D. (1995). Estrogen replacement therapy may protect Tallal, P., and B. S. McEwen., Eds. (1991). Special Issue: Neuroen- against intellectual decline in postmenopausal women. Horm. docrine Effects on Brain Development and Cognition. Psycho- Behav. 29: 312–321. neuroendocrinology 16: 1–3. Lupien, S., A. R. Lecours, I. Lussier, G. Schwartz, N. P. V. Nair, and M. J. Meaney. (1994). Basal cortisol levels and cognitive Neuroimaging deficits in human aging. J. Neurosci. 14: 2893–2903. McEwen, B. S. (1991). Non-genomic and genomic effects of ste- roids on neural activity. Trends in Pharmacological Sciences See INTRODUCTION: NEUROSCIENCES; MAGNETIC RESO- 12: 141–147. NANCE IMAGING; POSITRON EMISSION TOMOGRAPHY McEwen, B. S. (1992). Re-examination of the glucocorticoid hypothesis of stress and aging. In D. Swaab, M. Hofman, M. 
Neuron

The neuron is the main type of cell in the nervous system that, in association with neuroglial cells, mediates the information processing that underlies nervous function. As the main building block of the brain, the nerve cell is fundamental to the neural basis of cognitive abilities. Much of the research in contemporary neuroscience focuses on neuronal structure, function, and pharmacology, using modern techniques of molecular and cell biology. Based on these results, computational models of neurons and neuronal circuits are being constructed to provide increasingly powerful insights into how the brain mediates cognitive functions.

In their speculations on the mind, the ancients knew nothing of cells or neurons, nor did DESCARTES, Locke, or KANT, or any other scientist or natural philosopher until the nineteenth century. The first step toward this understanding was the cell theory of Schwann, in 1839, which stated that all body organs and tissues are composed of individual cells. A nerve cell was recognized to consist of three main parts: the cell body (soma), containing the nucleus; short processes (dendrites); and a single long process (axon) that connects to the dendrites or somata of other nerve cells within the region or of cells in other regions. However, the branches of dendrites and axons could not be clearly visualized, leading to the belief among some workers that nerve cells are different from other cells in that their finest branches form a continuous functional network (called the reticular theory). It was also recognized that numerous non-neuronal cells, called neuroglia, surround the neurons and contribute to their functions.

Led by Ramón y CAJAL, neuroanatomists in the 1880s and 1890s, using the GOLGI stain, showed that most regions of the brain contain several distinctive types of nerve cell with specific axonal and dendritic branching patterns. Dendrites and axons were found not to be continuous, so that the nerve cell belongs under the cell theory, as summarized in the neuron doctrine. How then are signals transferred between neurons? Sherrington in 1897 suggested that this occurs by means of specialized junctions which he termed synapses. Electron microscopists in the 1950s showed that such contacts between neurons exist. Since that time, neuroanatomists have elucidated the ultrastructure of the synapse as well as the patterns of synaptic connections between the different types of neurons. The patterns of connections within a region are called canonical circuits, mediating the main types of information processing within each region. Neurons and their canonical circuits are organized at the next higher level into neural modules, such as the columns found in the CEREBRAL CORTEX (see also VISUAL CORTEX). At a still higher organizational level, the patterns of neuronal connections between regions are called distributed circuits, which constitute the pathways and systems underlying behavior (e.g., see VISUAL ANATOMY AND PHYSIOLOGY).

Physiological studies of neuron properties have paralleled these anatomical developments. The axon generates a nerve impulse (action potential), a wave of depolarization of the surface membrane, which propagates rapidly along the membrane from its site of initiation (the axon hillock) to the axon terminals. Already in 1850 a finite rate of propagation (approximately 100 m per second in the largest axons) was established; this overturned the historical assumption of a mysterious instantaneous “nervous force” underlying the mind.

In the 1950s a slower potential (synaptic potential) was discovered by Katz at the synapse. It was found that an action potential invading a synapse causes it to secrete small vesicles containing a chemical neurotransmitter, which diffuses across the cleft from the presynaptic to the postsynaptic cell. There it acts on a membrane receptor to bring about an opening of membrane channels; this lets electrically charged ions flow across the membrane to change the membrane potential of the postsynaptic site. Membrane proteins that contain both receptor sites and ionic channels are called ionotropic receptors. Depending on which ions flow, the membrane may be depolarized (excitatory postsynaptic potential, EPSP) or hyperpolarized (inhibitory postsynaptic potential, IPSP). Among the cells of the body, chemical synapses are unique to neurons. Thus the study of synapses lies at the heart of the study of brain function at the neuronal level. In addition to these chemical synapses, neurons, like other cells, may be interconnected by gap junctions (electrical synapses), which permit electric currents and small molecules to pass directly between cells. These also connect neuroglial cells, and are especially prevalent during development.

Knowledge of NEUROTRANSMITTERS and how they generate EPSPs and IPSPs began with the study of the autonomic nervous system around 1900. The first neurotransmitter to be identified was acetylcholine, shown by Loewi in the 1920s to mediate the slow action of the vagus nerve in slowing the heart rate, and by Katz in the 1950s to mediate the rapid action of motor nerves in exciting skeletal muscles. Cannon in the 1920s established epinephrine (adrenaline) as the neurohormone mediating “flight-or-fight” responses, and other biogenic amines acting on autonomic organs were revealed in the 1930s. The introduction of the neuroleptics chlorpromazine and reserpine for the treatment of schizophrenia in the 1950s shifted interest in the pharmacology of neurotransmitters to the central nervous system. The resultant growth of the field of psychopharmacology has generated mechanistic hypotheses for all the major mental disorders. For example, a common action of neuroleptic (antipsychotic) drugs is a blockade of D2 receptors at dopaminergic synapses. And current research on antidepressant drugs is focused on their ability to act as selective serotonin reuptake inhibitors (SSRIs).

Many of the neurotransmitters can activate not only ionotropic receptors but also metabotropic receptors coupled to second messenger systems, which then phosphorylate target proteins or bring about other slower metabolic changes in the postsynaptic neuron (and can also act back on the presynaptic terminal). Acetylcholine acting on the heart and epinephrine acting as a hormone are examples, as are a wide range of neuropeptides. These include such molecules as hypothalamic factors (e.g., luteinizing hormone–releasing hormone, corticotropin), opioids (e.g., enkephalins and endorphins), and numerous types found also in the gut and other organs (vasoactive intestinal polypeptide, cholecystokinin, substance P, etc.; see NEUROENDOCRINOLOGY). These molecules, acting as neuromodulators, set behavioral states (e.g., the role of acetylcholine, norepinephrine, and serotonin in waking, sleeping, and levels of consciousness, or the role of neuropeptide Y in feeding behavior and anxiety and stress responses). Another role is in learning and memory; LONG-TERM POTENTIATION (LTP) and long-term depression (LTD) of synaptic responses involve second messengers, calcium, and cyclic nucleotides, and actions on the genome that may implement associative (Hebbian) learning. These latter actions overlap with the activation of receptors controlling growth and differentiation during development.

In summary, a better understanding of the effects of neurotransmitters and neuromodulators on different types of neurons and neuronal circuits is the necessary foundation for an understanding of normal cognition and changes underlying psychotic states.

Analysis of neuronal function has been aided enormously by the development of modern techniques. With the advent of DNA engineering in the 1970s, the molecular basis of neuronal structure and function is being increasingly elucidated. Modern physiological research employs a variety of methods in analyzing neuronal function. These include single- and multiple-neuron recordings in awake behaving animals (SINGLE-NEURON RECORDING); patch pipette recordings from neurons in slices taken from different brain regions (this includes slices of human cerebral cortex in tissue obtained in operations for relief of chronic epilepsy); patch recordings of single membrane channels in isolated cells grown in tissue culture; and different types of functional imaging (including movement of calcium ions, glucose uptake, and voltage-sensitive dyes). A long-term goal is to relate these changes at the neuronal level to changes in blood flow revealed by brain imaging methods (POSITRON EMISSION TOMOGRAPHY, functional MAGNETIC RESONANCE IMAGING, fMRI) at the systems level. This will enable an integrated view of neuronal function and the neural basis of cognition and cognitive disorders to begin to emerge. Databases to support this effort are becoming available on the World Wide Web. Examples are membrane receptors and channels (www.le.ac.uk/csn), canonical neurons and their compartmental models (http://senselab.med.yale.edu/neuron), and brain scans (human brain project).

Experimental approaches to the study of the neuron are being greatly aided by the development of computer models and the emergence of the new field of COMPUTATIONAL NEUROSCIENCE. This began in the 1950s with the pioneering model of the axonal action potential by Hodgkin and Huxley. In the 1960s Rall showed that complex dendritic trees could be modeled as chains of compartments incorporating properties representing action potentials and synaptic potentials. With the rise of powerful modern desktop computers it is now possible for neuroscientists to construct increasingly accurate models of different types of neurons to aid them in analyzing the neural basis of function of a given neuron in a given region. This work is reported in mainstream neuroscience journals as well as in new journals such as Neural Computation and the Journal of Computational Neuroscience. In contrast to this approach, connectionist networks reduce the soma and dendrite of a neuron to a single node, thereby excluding most of the interesting properties of neurons as summarized above. A merging of neuron-based compartmental models with NEURAL NETWORKS will therefore be welcome, because it will provide insights into how real brains actually carry out their system functions in mediating cognitive behavior. It is not unreasonable to expect that this will also provide the philosophical foundation for a more profound functional explanation of the relation between the brain and the mind.

See also CORTICAL LOCALIZATION, HISTORY OF; ELECTROPHYSIOLOGY, ELECTRIC AND MAGNETIC EVOKED FIELDS

—Gordon Shepherd

Neuropsychological Deficits

See MENTAL RETARDATION; MODELING NEUROPSYCHOLOGICAL DEFICITS

Neurotransmitters

Neurotransmitters are chemicals made by neurons and used by them to transmit signals to other neurons or non-neuronal cells (e.g., skeletal muscle, myocardium, pineal glandular cells) that they innervate. The neurotransmitters produce their effects by being released into synapses when their neuron of origin fires (i.e., becomes depolarized) and then attaching to receptors in the membrane of the postsynaptic cells. This causes changes in the fluxes of particular ions across that membrane, making cells more likely to become depolarized if the neurotransmitter happens to be excitatory, or less likely if it is inhibitory. Neurotransmitters can also produce their effects by modulating the production of other signal-transducing molecules (second messengers) in the postsynaptic cells (Cooper, Bloom, and Roth 1996). Nine compounds—belonging to three chemical families—are generally believed to function as neurotransmitters somewhere in the central nervous system (CNS) or periphery. In addition, certain other body chemicals, for example, adenosine, histamine, enkephalins, endorphins, and epinephrine, have neurotransmitter-like properties, and many additional true neurotransmitters may await discovery.

The first of these families, and the group about which most is known, are the amine neurotransmitters, a group of compounds containing a nitrogen atom that is not part of a ring structure. Among the amine neurotransmitters are acetylcholine, norepinephrine, dopamine, and serotonin. Acetylcholine is possibly the most widely used neurotransmitter in the body, and all axons that leave the CNS (e.g., those running to skeletal muscle, or to sympathetic or parasympathetic ganglia) use acetylcholine as their neurotransmitter. Within the brain, acetylcholine is the transmitter of, among other neurons, those generating the tracts that run from the septum to the HIPPOCAMPUS, and from the nucleus basalis to the CEREBRAL CORTEX—both of which seem to be needed to sustain memory and learning. It is also the neurotransmitter released by short-axon interneurons of the BASAL GANGLIA. Norepinephrine is the neurotransmitter released by sympathetic nerves (e.g., those innervating the heart and blood vessels) and, within the brain, those of the locus ceruleus, a nucleus activated in the process of focusing attention. Dopamine and serotonin apparently are neurotransmitters only within the CNS. Some dopaminergic (i.e., dopamine-releasing) neurons run from the substantia nigra to the corpus striatum; their loss gives rise to the clinical manifestations of Parkinson’s disease (Korczyn 1994); others, involved in the rewarding effects of drugs and natural stimuli, run from the mesencephalon to the nucleus accumbens. Dopaminergic neurons involved in the actions of most antipsychotic drugs (which antagonize the effects of dopamine on its receptors) run from the brain stem to limbic cortical structures in the frontal region, while the dopamine released from hypothalamic cells travels via a private blood supply, the pituitary portal vascular system, to the anterior pituitary gland, where it tonically suppresses release of the hormone prolactin. (Drugs that interfere with the release or actions of this dopamine can cause lactation as a side effect, even in men.)

The cell bodies, or perikarya, of serotoninergic (serotonin-releasing) neurons reside in the brain stem; their axons can descend in the spinal cord (where they “gate” incoming sensory inputs and also decrease sympathetic nervous outflow, thus lowering blood pressure) or ascend to other parts of the brain. Within the brain such serotoninergic nerve terminals are found in virtually all regions, enabling this transmitter to modulate a wide variety of behavioral and nonbehavioral functions, including, among others, mood, sleep, total food intake and macronutrient (carbohydrate vs. protein) selection (Wurtman and Wurtman 1989), aggressive behaviors, and PAIN sensitivity (Frazer, Molinoff, and Winokur 1994). Brains of women produce only about two thirds as much serotonin as those of men (Nishizawa et al. 1997); this may explain their greater vulnerability to serotonin-related diseases like depression and obesity. Within the pineal gland serotonin is also the precursor of the sleep-inducing hormone melatonin (Dollins et al. 1994).

The second neurotransmitter family includes amino acids, compounds that contain both an amino group (NH2) and a carboxylic acid group (COOH) and which are also the building blocks of peptides and proteins. The amino acids known to serve as neurotransmitters are glycine, and glutamic and aspartic acids, present in all proteins, and γ-aminobutyric acid (GABA), produced only in brain neurons. Glutamic acid and GABA are the most abundant neurotransmitters within the CNS, particularly in the cerebral cortex; glutamic acid tends to be excitatory and GABA inhibitory. Aspartic acid and glycine subserve these functions in the spinal cord (Cooper et al. 1996).

The third neurotransmitter family is composed of peptides, compounds that contain at least two and sometimes as many as 100 amino acids. Peptide neurotransmitters are poorly understood: evidence that they are, in fact, transmitters tends to be incomplete and restricted to their location within nerve terminals and to the physiological effects produced when they are applied to neurons. Probably the best understood peptide neurotransmitter is substance P, a compound that transmits signals generated by pain.

In general each neuron uses only a single compound as its neurotransmitter. However, some neurons contain both an amine and a peptide and may release both into synapses. Moreover, many neurons release adenosine, an inhibitory compound, along with their “true” transmitter, for instance, norepinephrine or acetylcholine. The stimulant effect of caffeine results from its ability to block receptors for this adenosine.

Neurotransmitters are manufactured from circulating precursor compounds like amino acids, glucose, and the dietary amine choline. Neurons modify the structure of these precursor compounds through a series of enzymatic reactions that often are limited not by the amount of enzyme present but by the concentration of the precursor, which can change, for example, as a consequence of eating (Wurtman 1988). Neurotransmitters that come from amino acids include serotonin, which is derived from tryptophan; dopamine and norepinephrine, which are derived from tyrosine; and glycine, which is derived from threonine. Among the neurotransmitters made from glucose are glutamate, aspartate, and GABA. Choline serves as the precursor of acetylcholine.

Once released into the synapse, each neurotransmitter combines chemically with one or more highly specific receptors; these are protein molecules which are embedded in the postsynaptic membrane. As noted above, this interaction can affect the electrical properties of the postsynaptic cell, its chemical properties, or both. When a NEURON is in its resting state, it sustains a voltage of about –70 mV as the consequence of differences between the concentrations of certain ions at the internal and external sides of its bounding membrane. Excitatory neurotransmitters either open protein-lined channels in this membrane, allowing extracellular ions, like sodium, to move into the cell, or close channels for potassium. This raises the neuron’s voltage toward zero, and makes it more likely that—if enough such receptors are occupied—the cell will become depolarized. If the postsynaptic cell happens also to be a neuron (i.e., as opposed to a muscle cell), this depolarization will cause it to release its own neurotransmitter from its terminals. Inhibitory neurotransmitters like GABA activate receptors that cause other ions—usually chloride—to pass through the membrane; this usually hyperpolarizes the postsynaptic cell, and decreases the likelihood that it will become depolarized. (The neurotransmitter glutamic acid, acting via its N-methyl-D-aspartate (NMDA) receptor, can also open channels for calcium ions. Some investigators believe that excessive activation of these receptors in neurological diseases can cause toxic quantities of calcium to enter the cells and kill them.)

If the postsynaptic cell is a muscle cell rather than a neuron, an excitatory neurotransmitter will cause the muscle to contract. If the postsynaptic cell is a glandular cell, an excitatory neurotransmitter will cause the cell to secrete its contents.

While most neurotransmitters interact with their receptors to change the voltage of postsynaptic cells, some neurotransmitter interactions, involving a different type of receptor, modify the chemical composition of the postsynaptic cell by either causing or blocking the formation of second messenger molecules. These second messengers regulate many of the postsynaptic cell’s biochemical processes, including gene expression; they generally produce their effects by activating enzymes that add high-energy phosphate groups to specific cellular proteins. Examples of second messengers formed within the postsynaptic cell include cyclic adenosine monophosphate, diacylglycerol, and inositol phosphates. Once neurotransmitters have been secreted into synapses and have acted on their receptors, they are cleared from the synapse either by enzymatic breakdown—for example, acetylcholine, which is converted by the enzyme acetylcholinesterase to choline and acetate, neither of which has neurotransmitter activity—or, for neurotransmitters like dopamine, serotonin, and GABA, by a physical process called reuptake. In reuptake, a protein in the presynaptic membrane acts as a sort of sponge, causing the neurotransmitter molecules to reenter their neuron of origin, where they can be broken down by other enzymes (e.g., monoamine oxidase, in dopaminergic, serotoninergic, or noradrenergic neurons) or repackaged for reuse.

As indicated above, particular neurotransmitters are now known to be involved in many neurological and behavioral disorders. For example, in Alzheimer’s disease, whose victims exhibit loss of intellectual capacity (particularly short-term memory), disintegration of personality, mental confusion, hallucinations, and aggressive—even violent—behaviors, many families of neurons, utilizing many neurotransmitters, die (Wurtman et al. 1996). However, the most heavily damaged family seems to be the long-axon acetylcholine-releasing neurons, originating in the septum and the nucleus basalis, which innervate the hippocampal and cerebral cortices. Acetylcholinesterase inhibitors, which increase brain levels of acetylcholine, can improve short-term memory, albeit transiently, in some Alzheimer’s disease patients.

Most drugs—therapeutic or recreational—that affect brain and behavior do so by acting at synapses to affect the production, release, effects on receptors, or inactivation of neurotransmitter molecules (Bernstein 1988). Such drugs can also constitute important and specific probes for understanding cognition and other brain functions.

See also MEMORY STORAGE, MODULATION OF; NEUROENDOCRINOLOGY; WORKING MEMORY, NEURAL BASIS OF

—Richard J. Wurtman

References

Bernstein, J. G. (1988). Drug Therapy in Psychiatry, 2nd ed. Littleton, MA: PSG Publishing.
Cooper, J. R., F. E. Bloom, and R. H. Roth. (1996). The Biochemical Basis of Neuropharmacology. 7th ed. New York: Oxford University Press.
Dollins, A. B., I. V. Zhdanova, R. J. Wurtman, H. J. Lynch, and M. H. Deng. (1994). Effect of inducing nocturnal serum melatonin concentrations in daytime on sleep, mood, body temperature, and performance. Proc. Natl. Acad. Sci. U.S.A. 91: 1824–1828.
Frazer, A., P. Molinoff, and A. Winokur. (1994). Biological Bases of Brain Function and Disease. New York: Raven Press.
Korczyn, A. D. (1994). Parkinson’s disease. In F. E. Bloom and D. J. Kupfer, Eds., Psychopharmacology: The Fourth Generation of Progress. New York: Raven Press, pp. 1479–1484.
Nishizawa, S., C. Benkelfat, S. N. Young, M. Leyton, S. Mzengeza, C. de Montigny, P. Blier, and M. Diksic. (1997). Differences between males and females in rates of serotonin synthesis in human brain. Proc. Natl. Acad. Sci. U.S.A. 94: 5308–5313.
Wurtman, R. J. (1988). Effects of their nutrient precursors on the synthesis and release of serotonin, the catecholamines, and acetylcholine: Implications for behavioral disorders. Clin. Neuropharmacol. 11 (Suppl. 1): S187–193.
Wurtman, R. J., and J. J. Wurtman. (January, 1989). Carbohydrates and depression. Scientific American 260: 50–57.
Wurtman, R. J., S. Corkin, J. H. Growdon, and R. M. Nitsch. (1996). The neurobiology of Alzheimer’s disease. Annals of the New York Academy of Sciences 777.9.

Newell, Allen

Allen Newell (1927–1992), cognitive psychologist and computer scientist, made profound contributions to fields ranging from computer architecture and programming software to artificial intelligence, cognitive science, and psychology. He was one of the founding fathers of the new domains of artificial intelligence and cognitive science, and his work continues to exercise a major influence on these developing fields.

Newell was born on March 19, 1927, in San Francisco, the son of Dr. Robert R. Newell, a distinguished radiologist on the faculty of the Stanford Medical School, and Jeanette LeValley Newell. He attended San Francisco public schools, and served in the Navy after World War II, assisting in mapping radiation intensities at the Eniwetok A-bomb tests, an experience that awoke his interest in science. In 1949, he received a B.S. degree in Physics at Stanford, then spent a postgraduate year studying mathematics at Princeton University. A desire to learn more about applications domains led him to a position studying logistics and air defense organization at Rand in Santa Monica, a “think tank” supported by the U.S. Air Force, and gave him early contact with the then emerging electronic digital computers.

computers might solve problems, non-numerical or numerical, by selective HEURISTIC SEARCH, as people do. Needing programming languages that would provide flexible memory structures, they invented list processing languages (or Information Processing Languages, IPLs) in 1956. Today, list processing is an indispensable tool for artificial intelligence and computer science, central to such widely used languages as LISP and OPS5, and providing a basis for structured programming.

The research continued at Carnegie Institute of Technology (after 1965, Carnegie Mellon University), where Newell enrolled in 1955 to pursue a Ph.D. in Industrial Administration, with a thesis—probably the first—in artificial intelligence. Over the next few years, Newell and his associates at Rand and Carnegie used the IPLs to create the first artificial intelligence programs, including the Logic Theorist (1956), the General Problem Solver (1959), and the NSS chess program (1958), introducing fundamental ideas that are still at the core of PROBLEM SOLVING theory, including means-ends analysis and PLANNING. To test how well these simulations accounted for human problem solving, the group used thinking-aloud protocols. Newell received his doctorate in 1958, joined the faculty of Carnegie Tech in 1961 as a full professor, and retained this position for the remaining three decades of his life.

In 1972, Newell and Simon summarized their psychological research, which employed verbal protocols and computer simulation, in their book Human Problem Solving. Recognizing the potential of PRODUCTION SYSTEMS (programs consisting of condition-action statements, employed in most AI programs and expert systems), in 1981, Newell designed the language OPS5.

To generalize psychological simulations and endow them with a more realistic control structure, Newell’s research focused increasingly on devising a powerful and veridical cognitive architecture that would provide a framework for general cognitive theories. A major product of this work was the Soar system, developed with Paul Rosenbloom and John Laird, a substantial extension of GPS that operates in multiple problem spaces and has powerful learning capabilities. Dozens of investigators are now using Soar as the architecture for intelligent systems, both simulations of human thinking and expert systems for AI. Soar and the future of unified theories were the subject of Newell’s last book, Unified Theories of Cognition (1990), based on his William James Lectures at Harvard.

Apart from the Soar research, much of Newell’s productive effort went into what he called his “diversions,” which almost all produced important contributions to cognitive science. These included investigations with Gordon Bell of computer architectures, reported in Computer Structures (1971), and participation in a team at CMU designing parallel computer architectures. He also served as chair of the committee that monitored the research in computer SPEECH RECOGNITION sponsored by the Defense Department’s Advanced Research Projects Agency (ARPA). Yet another “diversion” was research with Card and Moran (The Psy-
chology of Human-Computer Interaction, 1983) that rein- At almost the beginning of the computer era, Newell, col- vigorated human factors studies, extending them to complex laborating with J. C. Shaw and H. A. Simon, conceived that cognitive processes. 608 Nonmonotonic Logics In addition to his scientific work, Newell provided lead- standard logics can easily represent the argument: ership in such organizations as the American Association All men are mortal for Artificial Intelligence, and the Cognitive Science Soci- Socrates is a man ety (serving as president of both), and he provided advice to ——————— agencies of the national government. He played a leading Therefore, Socrates is mortal role in creating and developing the School of Computer Sci- (∀x)(Man(x) ⇒Mortal(x)); Man(Socrates) ence at Carnegie Mellon University and the innovations in ——————— computing and electronic networking of its campus. Mortal(Socrates) For his scientific and professional contributions, Newell received numerous honors and awards, including the U.S. but cannot represent reasoning such as: National Medal of Science, the Lifetime Contributions Birds typically fly Award of the International Joint Conference on Artificial Tweety is a bird Intelligence, the Distinguished Scientific Contributions —————— Award of the American Psychological Association, and the Therefore, Tweety (presumably) flies A. M. Turing Award of the Association for Computing Machinery, and honorary degrees from the Universities of Such arguments are characteristic of commonsense reason- Groningen (Netherlands) and Pennsylvania. He was elected ing, where default rules and the absence of complete infor- to both the National Academy of Engineering and the mation are prevalent. National Academy of Sciences. The most salient feature of nonmonotonic reasoning is See also COGNITIVE MODELING, SYMBOLIC; HUMAN-COM- that the conclusion of a nonmonotonic argument may not be PUTER INTERACTION; INTELLIGENT AGENT ARCHITECTURE correct. 
For example, if Tweety is a penguin, it is incorrect to conclude that Tweety flies. Nonmonotonic reasoning —Herbert A. Simon often requires jumping to a conclusion and subsequently retracting that conclusion as further information becomes References available. Thus, as the set of assumptions grows, the set of Bell, C. G., and A. Newell. (1971). Computer Structures: Readings conclusions (theorems) may shrink. This reasoning is called and Examples. New York: McGraw-Hill. nonmonotonic in contrast to standard logic, which is mono- Brownston, L., Farrell, R., and E. Kent. (1985). Programming tonic: as one’s set of assumptions grows, one’s set of theo- Expert Systems in OPS5: An Introduction to Rule-Based Pro- rems grows as well. (Formally, a system is monotonic if for gramming. Reading, MA: Addison-Wesley. any two theories A and B, whenever A is a subset of B, the Card, S., T. P. Moran, and A. Newell. (1983). The Psychology of theorems of A are a subset of the theorems of B.) Human-Computer Interaction. Hillsdale, NJ: Erlbaum. All systems of nonmonotonic reasoning are fundamen- Forgy, C. L. (1979). On the Efficient Implementation of Production tally concerned with the issue of consistency: ensuring that Systems. Ph.D. diss., Department of Computer Science, Carne- conclusions drawn are consistent with one another and with gie Mellon University. Newell, A. (1990). Unified Theories of Cognition. Cambridge, the assumptions. MA: Harvard University Press. Newell, A., and J. C. Shaw. (1957). Programming the logic theory Major Systems of Nonmonotonic Reasoning machine. Proceedings of the 1957 Western Joint Computer Conference. New York: Institute of Radio Engineers. pp. 230– Default Logic (Reiter 1980) introduces a default rule, an 240. inference rule consisting of an assumption, an appeal to the Newell, A., J. C. Shaw, and H. A. Simon. (1959). Report on a gen- consistency of some formula, and a conclusion. For exam- eral problem-solving program. 
Proceedings of the International ple, the rule Birds typically fly could be written as Conference on Information Processing. Paris, pp. 256–264. Newell, A., J. C. Shaw, and H. A. Simon. (1958b). Chess-playing Bird(x):Fly(x) programs and the problem of complexity. IBM Journal of —————— Research and Development 2: 320–335. Fly(x) Newell, A., J. C. Shaw, and H. A. Simon. (1960). Report on a gen- which reads: if x is a bird, and it is consistent that x flies, eral problem-solving program for a computer. Proceedings of the International Conference on Information Processing. Paris: then conclude that x flies. UNESCO, pp. 256–264. Default rules must be applied with care, since conflicting Newell, A., and H. A. Simon. (1956). The logic theory machine: a default rules could cause inconsistency if used together. For complex information processing system. IRE Transactions on example, the default Quakers are usually pacifists conflicts Information Theory IT-2, no. 3: 61–79. with the default Republicans are usually nonpacifists in the Newell, A., and H. A. Simon. (1972). Human Problem Solving. case of Richard Nixon, who was both a Quaker and a Englewood Cliffs, NJ: Prentice-Hall. Republican. Applying the first default yields the conclusion that Nixon was a pacifist; applying the second default yields Nonmonotonic Logics the conclusion that Nixon was a nonpacifist; applying both yields inconsistency. One generates extensions of a default Nonmonotonic logics are used to formalize plausible rea- theory by applying as many default rules as possible. Multi- soning. They allow more general reasoning than standard ple extensions, or their equivalent, arise in all nonmonotonic logics, which deal with universal statements. For example, logics. The existence of multiple extensions may be seen as Nonmonotonic Logics 609 a feature or a problem. 
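The Nixon example can be made concrete with a small sketch. The Python below is a simplified illustration, not Reiter's general definition: it assumes a theory made only of ground literals, normal defaults of the form prerequisite : conclusion / conclusion, and a hypothetical string encoding in which "~" marks negation. The function names `negate` and `extensions` are illustrative choices, not taken from the source. Trying the defaults in every order, and applying each one whose prerequisite is believed and whose conclusion is still consistent, gives one candidate extension per order.

```python
from itertools import permutations


def negate(literal):
    """Return the complementary literal ('~' is an assumed negation marker)."""
    return literal[1:] if literal.startswith("~") else "~" + literal


def extensions(facts, defaults):
    """Enumerate belief sets reachable by applying normal defaults.

    facts    -- set of ground literals (strings)
    defaults -- list of (prerequisite, conclusion) pairs, read as
                'prerequisite : conclusion / conclusion'
    Simplifying assumption: no inference beyond the defaults themselves.
    """
    found = set()
    for order in permutations(defaults):
        beliefs = set(facts)
        changed = True
        while changed:
            changed = False
            for prerequisite, conclusion in order:
                applicable = (
                    prerequisite in beliefs
                    and conclusion not in beliefs
                    # consistency check: the opposite must not be believed
                    and negate(conclusion) not in beliefs
                )
                if applicable:
                    beliefs.add(conclusion)
                    changed = True
        found.add(frozenset(beliefs))
    return found


# Nixon: two conflicting defaults, hence two extensions.
nixon_facts = {"Quaker(nixon)", "Republican(nixon)"}
nixon_defaults = [
    ("Quaker(nixon)", "Pacifist(nixon)"),
    ("Republican(nixon)", "~Pacifist(nixon)"),
]
nixon_extensions = extensions(nixon_facts, nixon_defaults)

# Tweety: adding an assumption removes a conclusion (nonmonotonicity).
bird_default = [("Bird(tweety)", "Fly(tweety)")]
tweety_before = extensions({"Bird(tweety)"}, bird_default)
tweety_after = extensions({"Bird(tweety)", "~Fly(tweety)"}, bird_default)

if __name__ == "__main__":
    for ext in sorted(nixon_extensions, key=sorted):
        print(sorted(ext))
```

Under these assumptions the Nixon theory has two belief sets, one containing Pacifist(nixon) and one containing ~Pacifist(nixon), and neither contains both. In the Tweety theory, adding the fact ~Fly(tweety) blocks the bird default, so the conclusion Fly(tweety) is no longer drawn even though the set of assumptions has grown.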
On the one hand, they allow the expression of reasonable but conflicting arguments within one system, and thus model well commonsense reasoning and discourse. However, conflicting defaults may give rise to unexpected extensions, corresponding to arguments that seem odd or unreasonable. An example of problematic multiple extensions is the Yale shooting problem, discussed below.

Autoepistemic Logic (Moore 1985) formalizes nonmonotonicity using sentences of a MODAL LOGIC of belief with belief operator L. Autoepistemic Logic focuses on stable sets of sentences (Stalnaker 1980)—sets of sentences that can be viewed as the beliefs of a rational agent—and the stable expansions of a premise set. Properties of stable sets include consistency and a version of negative introspection: if a sentence P does not belong to a belief set, then the sentence ¬LP belongs to the belief set. This corresponds to the principle that if an agent does not believe a particular fact, he believes that he does not believe it. To formalize the Tweety example, one represents the rule that birds typically fly as an appeal to an agent’s beliefs: L(Bird(x)) ∧ ¬L(¬Fly(x)) ⇒ Fly(x). If I believe that x is a bird and I don’t believe that x cannot fly, then (I will conclude that) x flies. Any stable expansion of the premise set consisting of this premise and the premise that Tweety is a bird will contain the conclusion that Tweety flies.

Circumscription (McCarthy 1980, 1986) seeks to formalize nonmonotonic reasoning within classical logic by circumscribing, or limiting the extension of, certain predicates. The logic limits the objects in a particular class to those that must be in the class. For example, consider the theory containing assumptions that typical birds fly, atypical (usually called abnormal) birds do not fly, penguins are atypical, Opus is a penguin, and Tweety is a bird. Opus must be in the class of atypical, nonflying birds, but there is no reason for Tweety to be in that class; thus we conclude that Tweety can fly. The circumscription of a theory is achieved by adding a second-order axiom (or, in a first-order theory, an axiom schema), limiting the extension of certain predicates, to a set of axioms.

The systems above describe different ways of determining the nonmonotonic consequences of a set of assumptions. Entailment Relations (Kraus, Lehmann, and Magidor 1990) generalize these approaches by considering a nonmonotonic entailment operator |~, where P |~ Q means that Q is a nonmonotonic consequence of P, and by formulating general principles characterizing the behavior of |~. These principles specify how |~ relates to the standard entailment operator |- of classical logic, and how meta-statements referring to the entailment operator can be combined.

Belief Revision (Alchourron, Gardenfors, and Makinson 1985) studies nonmonotonic reasoning from the dynamic point of view, focusing on how old beliefs are retracted as new beliefs are added to a knowledge base. There are four interconnected operators of interest: contraction, withdrawal, expansion, and revision. In general, revising a knowledge base follows the principle of minimal change: one conserves as much information as possible.

Integrating Nonmonotonic Reasoning with Other Theories

Nonmonotonic reasoning systems are useful only if they can be successfully integrated with other theories of commonsense reasoning. Attempts at integration are often surprisingly difficult. For example, the Yale shooting problem (Hanks and McDermott 1987) showed that integrating nonmonotonic reasoning with TEMPORAL REASONING was complicated by the multiple extension problem. The Yale shooting problem consists of determining what happens to a turkey when we know that

1. a gun is loaded at 1:00 and fired at 3:00
2. firing a loaded gun at a turkey results in the turkey’s immediate death
3. guns typically stay loaded (default rule D1)
4. turkeys typically stay alive (default rule D2)

Two extensions arise, only one of which is expected. In the expected extension, D1 is applied, the gun remains loaded at 3:00, and the turkey dies. In the unexpected extension, D2 is applied and the turkey is therefore alive after 3:00. This entails that the gun mysteriously becomes unloaded between 1:00 and 3:00. The problem of formalizing a system of temporal reasoning so that these unexpected extensions do not arise has become a central topic in nonmonotonic research. In nonmonotonic temporal reasoning, it has resulted in the development of theories of CAUSATION and EXPLANATION (Morgenstern 1996 and Shanahan 1997 give summaries and analyses).

Integrating nonmonotonic logics with MULTIAGENT SYSTEMS is also difficult. The major problem is modeling nested nonmonotonic reasoning: agents must reason about other agents’ nonmonotonic reasoning processes. Most nonmonotonic formalisms are not expressive enough to model such reasoning. Moreover, nested nonmonotonic reasoning requires that agents know what other agents do not believe, a difficult requirement to satisfy (Morgenstern and Guerreiro 1993).

In general, integration may require extending both the nonmonotonic formalism and the particular theory of commonsense reasoning.

Implementations and Applications

Applications of nonmonotonic systems are scarce. Implementors run into several difficulties. First, most nonmonotonic logics explicitly refer to the notion of consistency of a set of sentences. Determining consistency is in general undecidable for first-order theories; thus predicate nonmonotonic logic is undecidable. Determining inconsistency is decidable but intractable for propositional logic; thus performing propositional nonmonotonic reasoning takes exponential time. This precludes the development of general efficient nonmonotonic reasoning systems (Selman and Levesque 1993).

However, efficient systems have been developed for limited cases. LOGIC PROGRAMMING, the technique of programming using a set of logical sentences in clausal form, uses a nonmonotonic technique known as “negation as failure”: a literal, consisting of an atomic formula preceded by a nonclassical negation operator, is considered true if the atomic formula cannot be proven. Although logic programs cannot express all types of nonmonotonic reasoning, they can be very efficient (Gottlob 1992). Likewise, inheritance with exceptions is an efficient, although limited, form of nonmonotonic reasoning (Horty, Thomason, and Touretzky 1990; Stein 1992). These limited cases handle many common types of nonmonotonic reasoning.

Due to its efficiency, logic programming has been used for many applications, ranging from railway control to medical diagnosis (Proceedings of the Conference on Practical Applications of Prolog 1996, 1997), although few applications exploit logic programming’s nonmonotonic reasoning abilities. Aside from logic programming, nonmonotonic logic is still rarely used in the commercial world. This may be because the nonmonotonic reasoning community and the commercial world focus on different problems, and because there are few industrial-strength nonmonotonic tools (Morgenstern 1998).

There are similarities between nonmonotonic logics and other areas of research that seek to formalize reasoning under UNCERTAINTY, such as FUZZY LOGIC and PROBABILISTIC REASONING (especially BAYESIAN NETWORKS). These fields are united in their attempt to represent and reason with incomplete knowledge. The character of nonmonotonic reasoning is different in that it uses a qualitative rather than a quantitative approach to uncertainty. Attempts to investigate the connections between these areas include the work of Goldszmidt and Pearl (1992, 1996).

See also COMPUTATIONAL COMPLEXITY; FRAME PROBLEM; KNOWLEDGE REPRESENTATION; MULTIAGENT SYSTEMS; PROBABILITY, FOUNDATIONS OF

—Leora Morgenstern

References

Alchourron, C. E., P. Gardenfors, and D. Makinson. (1985). On the logic of theory change: Partial meet functions for contraction and revision. Journal of Symbolic Logic 50: 510–530.
Goldszmidt, M., and J. Pearl. (1992). Rank-based systems: A simple approach to belief revision, belief update, and reasoning about evidence and actions. Third International Conference on Principles of Knowledge Representation and Reasoning: (KR-92). San Mateo, CA: Morgan Kaufmann, pp. 661–672.
Goldszmidt, M., and J. Pearl. (1996). Qualitative probabilities for default reasoning, belief revision, and causal modeling. Artificial Intelligence 84: 57–112.
Gottlob, G. (1992). Complexity results for nonmonotonic logics. Journal of Logic and Computation 2 (3): 397–425.
Hanks, S., and D. McDermott. (1987). Nonmonotonic logic and temporal projection. Artificial Intelligence 33 (3): 379–412.
Horty, J., R. Thomason, and D. Touretzky. (1990). A skeptical theory of inheritance in nonmonotonic semantic networks. Artificial Intelligence 42: 311–349.
Kraus, S., D. Lehmann, and M. Magidor. (1990). Nonmonotonic reasoning, preferential models, and cumulative logics. Artificial Intelligence 44: 167–207.
McCarthy, J. (1980). Circumscription—a form of nonmonotonic reasoning. Artificial Intelligence 13: 27–39.
McCarthy, J. (1986). Applications of circumscription to formalizing common-sense knowledge. Artificial Intelligence 28: 86–116.
Moore, R. (1985). Semantical considerations on nonmonotonic logic. Artificial Intelligence 25 (1): 75–94.
Morgenstern, L. (1996). The problem with solutions to the frame problem. In K. Ford and Z. Pylyshyn, Eds., The Robot’s Dilemma Revisited. Norwood: Ablex, pp. 99–133.
Morgenstern, L. (1998). Inheritance comes of age: Applying nonmonotonic techniques to problems in industry. Artificial Intelligence (103): 237–271. Shorter version in Proceedings of the Fifteenth International Joint Conference on Artificial Intelligence (IJCAI-97). Los Altos: Morgan Kaufmann, pp. 1613–1621.
Morgenstern, L., and R. Guerreiro. (1993). Epistemic logics for multiple agent nonmonotonic reasoning I. Proceedings of the Second Symposium on Logical Formalizations of Commonsense Reasoning (CS-93). Austin, TX, pp. 147–156.
Reiter, R. (1980). A logic for default reasoning. Artificial Intelligence 13: 81–132.
Selman, B., and H. Levesque. (1993). The complexity of path-based defeasible inheritance. Artificial Intelligence 62 (2): 303–340.
Shanahan, M. (1997). Solving the Frame Problem. Cambridge, MA: MIT Press.
Stalnaker, R. (1980). A note on nonmonotonic modal logic. Ithaca, NY: Department of Philosophy, Cornell University.
Stein, L. (1992). Resolving ambiguity in nonmonotonic inheritance hierarchies. Artificial Intelligence 55: 259–310.

Further Readings

Antoniou, G. (1997). Nonmonotonic Reasoning. Cambridge, MA: MIT Press.
Besnard, P. (1989). An Introduction to Default Logics. Berlin: Springer.
Brewka, G. (1991). Nonmonotonic Reasoning: Logical Foundations of Commonsense. Cambridge: Cambridge University Press.
Dix, J., U. Furbach, and A. Nerode, Eds. (1997). Logic Programming and Nonmonotonic Reasoning. Berlin: Springer.
Etherington, D. (1988). Reasoning with Incomplete Information. London: Pitman.
Gabbay, D., C. J. Hogger, and J. A. Robinson, Eds. (1994). Handbook of Logic in Artificial Intelligence and Logic Programming, vol. 3: Nonmonotonic Reasoning and Uncertain Reasoning. Oxford: Clarendon Press.
Gardenfors, P. (1988). Knowledge in Flux: Modeling the Dynamics of Epistemic States. Cambridge, MA: MIT Press.
Gardenfors, P., Ed. (1992). Belief Revision. Cambridge: Cambridge University Press.
Geffner, H. (1992). Default Reasoning: Causal and Conditional Theories. Cambridge: MIT Press.
Genesereth, M., and N. Nilsson. (1987). Foundations of Artificial Intelligence. San Mateo: Morgan Kaufmann.
Ginsberg, M., Ed. (1987). Readings in Nonmonotonic Reasoning. San Mateo: Morgan Kaufmann.
Konolige, K. (1988). On the relation between default and autoepistemic logic. Artificial Intelligence 35: 343–382.
Lloyd, J. (1987). Foundations of Logic Programming. 2nd ed. Springer.
Marek, W., and M. Truszczynski. (1993). Nonmonotonic Logic. Springer.
McDermott, D. (1982). Non-monotonic logic II: Non-monotonic modal theories. Journal of the Association for Computing Machinery 29: 33–57.
McDermott, D., and J. Doyle (1980). Non-monotonic logic I. Artificial Intelligence 25: 41–72.
Pearl, J. (1990). Probabilistic semantics for nonmonotonic reasoning: A survey. In G. Shafer and J. Pearl, Eds., Readings in Uncertain Reasoning. San Mateo: Morgan Kaufmann, pp. 699–710.
Poole, D. (1988). A logical framework for default reasoning. Artificial Intelligence 36: 27–47.
Shoham, Y. (1988). Reasoning about Change: Time and Causation from the Standpoint of Artificial Intelligence. Cambridge, MA: MIT Press.
Williams, M. (1995). Iterated theory base change: A computational model. Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence (IJCAI-95). Los Altos: Morgan Kaufmann, pp. 1541–1547.

Numeracy and Culture

Numeracy is a term that has been used in a variety of ways. It encompasses formal and informal mathematics, cultural practices with mathematical content, and behavior mediated by mathematical properties even when these properties are not verbally accessible. The study of numeracy explores a broad range of mathematical competencies across species, cultures, and the human lifespan. Human numeracy has universal characteristics based on biological mechanisms and developmental trajectories as well as culturally variable representation systems, practices, and values.

Studies with diverse species show that animals are sensitive to number (see Gallistel 1990 for a review). This literature suggests that there are innate capabilities that have evolved to support numeracy in humans (Gallistel and Gelman 1992). A variety of sources of evidence point to early numerical abilities in human infants (see INFANT COGNITION). For example, infants as early as the first week of life have been shown to discriminate between different small numbers (Antell and Keating 1983). The possibility that this discrimination is carried out by a cognitive mechanism encompassing several modalities has been debated (Starkey, Spelke, and Gelman 1990). In addition, studies have shown that infants have some knowledge of the effects of numerical transformations such as addition and subtraction (Wynn 1992).

Numerical competencies evident in the human infant are strong candidates for universal aspects of human numeracy. However, this does not necessarily discount the role of culture in developing human numeracy. Within the framework of EVOLUTIONARY PSYCHOLOGY, it is argued that universal characteristics of numeracy provide innate starting points for numeracy development, which, in turn, is influenced by culturally specific systems of knowledge (Cosmides and Tooby 1994). Similarly, within the neo-Piagetian framework, children are seen not simply as passing through a universal set of stages, but also as setting out on a unique cognitive journey that is guided by cultural practices (Case and Okamoto 1996). In this sense, numeracy is viewed as a cultural practice that builds on innate mechanisms for understanding quantities. The result is a conceptual structure for numeracy that reflects both universal and culture-sensitive characteristics. These conceptual structures are relatively similar across cultures that provide similar problem-solving experiences in terms of schooling and everyday life. On the other hand, mastery levels of particular tasks or skills may differ from one culture to another depending on the degree to which they are valued in each culture (Okamoto et al. 1996). In addition, cultures influence mathematical practices through the belief systems associated with numeracy, as well as through tools and artifacts (e.g., symbol systems) that support numeracy. A broad array of human activities to which mathematical thinking is applied are interwoven with cultural artifacts, social conventions, and social interactions (Nunes, Schliemann, and Carraher 1993; Saxe 1991).

Cultures have developed systems of signs that provide ways of thinking about quantitative information (see LANGUAGE AND CULTURE). Different systems shed light on different aspects of knowing. That is, they provide a means to extend the ability to deal with numbers; at the same time, they constrain numerical activities. For example, the Oksapmin of Papua New Guinea have a counting system using body parts, with no base structure, that only goes up to 27 (Saxe 1982). This way of quantifying is fully adequate for the numerical tasks of traditional life. It does not, however, facilitate easy computation or the counting of objects beyond 27. In contrast, the perfectly regular base-10 system of many Asian languages appears to make the mastery of base-10 concepts easier for children beginning school than the less regular base-10 systems of many European languages, including English (Miura et al. 1993). These various representational systems are culture-specific tools to deal with counting and computing, and all cultures seem to have them. Other culture-specific representation systems have been identified for locating (geometry, navigation), measuring, designing (form, shape, pattern), playing (rules, strategies), and explaining (Bishop 1991).

Further cultural variations in mathematical behavior are manifest in the ways that people use mathematical representations in the context of everyday activities (see SITUATED COGNITION AND LEARNING). Although they use the same counting numbers for different activities, child street vendors in Brazil were observed to use different computational strategies when selling than when doing school-like problems (Nunes, Schliemann, and Carraher 1993). While selling, they chose to use oral computation and strategies such as decomposition and repeated groupings. On school-like problems, they chose to use paper and pencil with standard algorithms and showed a markedly higher rate of error. Research in other domains, for example measurement (Gay and Cole 1967) and proportional reasoning (Nunes, Schliemann, and Carraher 1993), further confirms that informal mathematics can be effective and does not depend upon schooling for its development.

One characteristic of ethnomathematics, as informal mathematics or NAIVE MATHEMATICS is commonly called, is that the mathematics is used in pursuit of other goals rather than solely for the sake of the mathematics as in school or among professional mathematicians. As new goals arise, representational systems and practices develop to address the emergent goals. For instance, as the Oksapmin became more involved with the money economy, their “body” counting system began to change toward a base system (Saxe 1982). Although the differences between informal and school mathematics are often stressed (Bishop 1991), skills developed in the informal domain can be used to address new goals and practices in the school setting. In Liberian schools it was found that the most successful elementary students were those who combined the strategies from their indigenous mathematics with the algorithms taught in school (Brenner 1985). Similarly, Oksapmin and Brazilian children benefit from using their informal mathematics to learn school mathematics (Saxe 1985, 1991). Because conflicts between informal and school mathematics frequently arise, a number of authors have argued for building bridges between these different cultures of mathematics (Bishop 1991; Gay and Cole 1967; Gerdes 1988).

In addition to the overt mathematical practices already described, Gerdes (1988) has described frozen mathematics as the mathematics embodied in the products of a culture such as baskets, toys, and houses. Although the history of these objects has typically been lost, the original designers of these cultural artifacts employed mathematical principles in their design, according to Gerdes. Mathematical traditions embodied in these artifacts can provide interesting mathematical investigations that help children understand their own cultural heritage as well as contemporary school mathematics.

The study of numeracy and culture draws from diverse disciplines within the cognitive sciences including psychology, linguistics, biology, and anthropology. The strengths of each discipline should be utilized to provide a more coherent view of what numeracy is and how it interacts with culture. Much future work remains to be done to better understand the universal and culture-specific aspects of numeracy.

See also COGNITIVE ARTIFACTS; COGNITIVE DEVELOPMENT; CULTURAL VARIATION; NATIVISM

—Yukari Okamoto, Mary E. Brenner, and Reagan Curtis

References

Antell, S., and D. Keating. (1983). Perception of numerical invariance in neonates. Child Development 54: 695–701.
Bishop, A. (1991). Mathematical Enculturation: A Cultural Perspective on Mathematics Education. Dordrecht: Kluwer.
Brenner, M. E. (1985). The practice of arithmetic in Liberian schools. Anthropology and Education Quarterly 16: 177–186.
Case, R., and Y. Okamoto. (1996). The role of central conceptual structures in the development of children’s thought. Monographs of the Society for Research in Child Development 61 (1–2, serial no. 246).
Cosmides, L., and J. Tooby. (1994). Origins of domain specificity: The evolution of functional organization. In L. A. Hirschfeld and S. A. Gelman, Eds., Mapping the Mind: Domain Specificity in Cognition and Culture. Cambridge: Cambridge University Press, pp. 85–116.
Gallistel, C. R. (1990). The Organization of Learning. Cambridge, MA: MIT Press.
Gallistel, C. R., and R. Gelman. (1992). Preverbal and verbal counting and computation. Cognition 44: 43–74.
Gay, J., and M. Cole. (1967). The New Mathematics and an Old Culture. New York: Holt, Rinehart and Winston.
Gerdes, P. (1988). On culture, geometrical thinking and mathematics education. Educational Studies in Mathematics 19: 137–162.
Miura, I. T., Y. Okamoto, C. C. Kim, M. Steere, and M. Fayol. (1993). First graders’ cognitive representation of number and understanding of place value: Cross-national comparisons—France, Japan, Korea, Sweden, and the United States. Journal of Educational Psychology 85: 24–30.
Nunes, T., A. D. Schliemann, and D. W. Carraher. (1993). Street Mathematics and School Mathematics. New York: Cambridge University Press.
Okamoto, Y., R. Case, C. Bleiker, and B. Henderson. (1996). Cross cultural investigations. In R. Case and Y. Okamoto, Eds., The Role of Central Conceptual Structures in the Development of Children’s Thought. Monographs of the Society for Research in Child Development 61 (1–2, serial no. 246), pp. 131–155.
Saxe, G. B. (1982). Developing forms of arithmetic operations among the Oksapmin of Papua New Guinea. Developmental Psychology 18: 583–594.
Saxe, G. B. (1985). The effects of schooling on arithmetical understandings: Studies with Oksapmin children in Papua New Guinea. Journal of Educational Psychology 77: 503–513.
Saxe, G. B. (1991). Culture and cognitive development: Studies in mathematical understanding. Hillsdale, NJ: Erlbaum.
Starkey, P., E. S. Spelke, and R. Gelman. (1990). Numerical abstraction by human infants. Cognition 36: 97–128.
Wynn, K. (1992). Addition and subtraction by human infants. Nature 358: 749–750.

Further Readings

Barkow, J. H., L. Cosmides, and J. Tooby, Eds. (1992). The Adapted Mind: Evolutionary Psychology and the Generation of Culture. New York: Oxford University Press.
Crump, T. (1990). The Anthropology of Numbers. New York: Cambridge University Press.
Ginsburg, H. P., J. K. Posner, and R. L. Russell. (1981). The development of mental addition as a function of schooling and culture. Journal of Cross-cultural Psychology 12: 163–178.
Hatano, G., S. Amaiwa, and K. Shimizu. (1987). Formation of a mental abacus for computation and its use as a memory device for digits: A developmental study. Developmental Psychology 23: 832–838.
Lancy, D. F. (1983). Cross-Cultural Studies in Cognition and Mathematics. New York: Academic Press.
Miller, K. F., and J. W. Stigler. (1987). Counting in Chinese: Cultural variation in a basic cognitive skill. Cognitive Development 2: 279–305.
Moore, D., J. Beneson, J. S. Reznick, P. Peterson, and J. Kagan. (1987). Effect of auditory numerical information on infants’ looking behavior: Contradictory evidence. Developmental Psychology 23: 665–670.
Nunes, T. (1992). Ethnomathematics and everyday cognition. In D. Grouws, Ed., Handbook of Research on Mathematics Teaching and Learning. New York: Macmillan, pp. 557–574.
Reed, H. J., and J. Lave. (1981). Arithmetic as a tool for investigating relations between culture and cognition. In R. W. Casson, Ed., Language, Culture and Cognition: Anthropological Perspectives. New York: Macmillan, pp. 437–455.
Saxe, G. B., and J. K. Posner. (1983). The development of numerical cognition: Cross-cultural perspectives. In H. P. Ginsburg, Ed., The Development of Mathematical Thinking. Rochester, NY: Academic Press, pp. 291–317.
Song, M. J., and H. P. Ginsburg. (1987). The development of informal and formal mathematics thinking in Korean and U. S. children. Child Development 58: 1286–1296.
Sophian, C., and N. Adams. (1987). Infants’ understanding of numerical transformations. British Journal of Developmental Psychology 5: 257–264.
Starkey, P., and R. G. Cooper, Jr. (1980). Perception of numbers by human infants. Science 210: 1033–1035.
Stevenson, H. W., T. Parker, A. Wilkinson, B. Bonnevaux, and M. Gonzalez. (1978). Schooling, environment and cognitive development: A cross-cultural study. Monographs of the Society for Research in Child Development 43 (3, serial no. 175).
Strauss, M. S., and L. E. Curtis. (1981). Infant perception of numerosity. Child Development 52: 1146–1152.

Object Recognition, Animal Studies

One of the major problems which must be solved by a visual system used for object recognition is the building of a representation of visual information which allows recognition to occur relatively independently of size, contrast, spatial frequency, position on the RETINA, and angle of view, etc. It is important that invariance in the visual system is made explicit in the neuronal responses, for this simplifies greatly the output of the visual system to memory systems such as the HIPPOCAMPUS and AMYGDALA,

view, as confirmed by information-theoretic analysis of the neuronal responses.

The representation of objects or faces provided by these neurons is distributed, in that each NEURON does not, in general, respond to only one object or face, but instead responds to a subset of the faces or objects. They thus showed ensemble, sparsely distributed, encoding (Rolls and Tovee 1995; Rolls et al. 1997). One advantage of this encoding is that it allows receiving neurons to generalize to somewhat similar exemplars of the stimuli, because effectively it is the activity of the population vector of neuronal firing which can be read out by receiving neurons (Rolls and Treves 1998).
A which can then remember or form associations about second advantage is that the information available from objects (Rolls 1999). The function of these memory sys- such a population about which face or object was seen tems would be almost impossible if there were no consis- increases approximately linearly with the number of neu- tent output from the visual system about objects rons in the sample (Abbott, Rolls, and Tovee 1996; Rolls et (including faces), for then the memory systems would al. 1997). This means that the number of stimuli that can be need to learn about all possible sizes, positions, etc. of represented increases exponentially with the number of cells each object, and there would be no easy generalization in the sample (because information is a logarithmic mea- from one size or position of an object to that object when sure). This has major implications for brain operation, for it seen with another retinal size, position, or view (see Rolls means that a receiving neuron or neurons can receive a great and Treves 1998). deal of information from a sending population if each The primate inferior temporal visual cortex is implicated receiving neuron receives only a limited number of afferents by lesion evidence in providing invariance. For example, (100–1000) from a sending population. Weiskrantz and Saunders (1984; see also Weiskrantz 1990) A way in which artificial vision systems might encode showed that macaques with inferior temporal cortex lesions information about objects is to store the relative coordinates performed especially poorly in visual discrimination tasks in 3-D object-based space of parts of objects in a database, when one of the objects was shown in a different size or in and to use general-purpose algorithms on the inputs to per- different lighting. 
form transforms such as translation, rotation, and scale Using the population of neurons in the cortex in the supe- change in 3-D space to see if there is any match to a stored rior temporal sulcus and inferior temporal cortex with 3-D representation (e.g., Marr 1982). One problem (see also responses selective for faces, it has been found that the Rolls and Treves 1998) with implementing such a scheme in responses are relatively invariant with respect to size and the brain is that a detailed syntactical description of the rela- contrast (Rolls and Baylis 1986); spatial frequency (Rolls, tions between the parts of the 3-D object is required, for Baylis, and Leonard, 1985; Rolls, Baylis, and Hasselmo, example, body > thigh > shin > foot > toes. Such syntactical 1987) and retinal translation, that is, position in the visual networks are difficult to implement in neuronal networks, field (Tovee, Rolls, and Azzopardi 1994; cf. earlier work by because if the representations of all the features just men- Gross 1973; Gross et al. 1985). Some of these neurons even tioned were active simultaneously, how would the spatial have relatively view-invariant responses, responding to dif- relations between the features also be encoded? (How ferent views of the same face but not of other faces (Has- would it be apparent just from the firing of neurons that the selmo et al. 1989; see FACE RECOGNITION). toes were linked to the rest of foot but not to the body?) To investigate whether view-invariant representations of Another more recent suggestion for a syntactically linked objects are also encoded by some neurons in the inferior set of descriptors is that of Biederman (1987; see also Hum- temporal cortex (area TE) of the rhesus macaque, the activ- mel and Biederman 1992). 
ity of single neurons was recorded while monkeys were An alternative, more biologically plausible scheme is shown very different views of ten objects (Booth and Rolls that the brain might store a few associated 2-D views of 1998). The stimuli were presented for 0.5 sec on a color objects, with generalization within each 2-D view, in order video monitor while the monkey performed a visual fixation to perform invariant object and face recognition (Koen- task. The stimuli were images of ten real plastic objects derink and Van Doorn 1979; Poggio and Edelman 1990; which had been in the monkey’s cage for several weeks to Rolls 1992, 1994; Logothetis et al. 1994; Wallis and Rolls enable him to build view-invariant representations of the 1997). The way in which the brain could learn and access objects. Control stimuli were views of objects which had such representations is described next. never been seen as real objects. The neurons analyzed were Cortical visual processing for object recognition is con- in the TE cortex in and close to the ventral lip of the anterior sidered to be organized as a set of hierarchically connected part of the superior temporal sulcus. Many neurons were cortical regions consisting at least of V1, V2, V4, posterior found that responded to some views of some objects. How- inferior temporal cortex (TEO), inferior temporal cortex ever, for a smaller number of neurons, the responses (e.g., TE3, TEa, and TEm), and anterior temporal cortical occurred only to a subset of the objects, irrespective of the areas (e.g., TE2 and TE1). There is convergence from each viewing angle. 
These latter neurons thus conveyed informa- small part of a region to the succeeding region (or layer in tion about which object had been seen, independently of the hierarchy) in such a way that the receptive field sizes of 614 Object Recognition, Animal Studies neurons (e.g., one degree near the fovea in V1) become larger by a factor of approximately 2.5 with each succeeding stage (and the typical parafoveal receptive field sizes found would not be inconsistent with the calculated approxima- tions of, for example, eight degrees in V4, twenty degrees in TEO, and fifty degrees in inferior temporal cortex; Bous- saoud, Desimone, and Ungerleider 1991; see figure 1). Such zones of convergence would overlap continuously with each other. This connectivity would be part of the architecture by which translation-invariant representations are computed. Each layer is considered to act partly as a set of local self- organizing competitive neuronal networks with overlapping inputs. These competitive nets (described, e.g, by Rolls and Treves 1998) operate to detect correlations between the activity of the input neurons, and to allocate output neurons to respond to each cluster of such correlated inputs. These networks thus act as categorizers, and help to build feature Figure 1. Schematic diagram showing convergence achieved by the analyzers. In relation to visual information processing, they forward projections in the visual system, and the types of would remove redundancy from the input representation. representation that may be built by competitive networks operating at each stage of the system, from the primary visual cortex (V1) to Translation invariance would be computed in such a sys- the inferior temporal visual cortex (area TE; see text). LGN— tem by utilizing competitive learning to detect statistical lateral geniculate nucleus. Area TEO forms the posterior inferior regularities in inputs when real objects are translated in the temporal cortex. 
The receptive fields in the inferior temporal visual physical world. The hypothesis is that because objects have cortex (e.g., in the TE areas) cross the vertical midline (not shown). continuous properties in space and time in the world, an object at one place on the retina might activate feature ana- lyzers at the next stage of cortical processing, and when the object was translated to a nearby position, because this View-independent representations could be formed by the would occur in a short period (e.g., 0.5 sec), the membrane same type of computation, operating to combine a limited set of the postsynaptic neuron would still be in its “Hebb- of views of objects. Consistent with the suggestion that the modifiable” state (caused for example by calcium entry as a view-independent representations are formed by combining result of the voltage-dependent activation of N-methyl-d- view-dependent representations in the primate visual system aspartate receptors), and the presynaptic afferents activated is the fact that in the temporal cortical areas, neurons with with the object in its new position would thus become view-independent representations of faces are present in the strengthened on the still-activated postsynaptic neuron. It is same cortical areas as neurons with view-dependent repre- proposed that the short temporal window (e.g., 0.5 sec) of sentations (from which the view-independent neurons could Hebb modifiability helps neurons to learn the statistics of receive inputs; Hasselmo et al. 1989; Perrett, Mistlin, and objects moving in the physical world, and at the same time Chitty 1987). to form different representations of different feature combi- This hypothesis about the computation of invariant repre- nations or objects, as these are physically discontinuous sentations has been implemented in a computational model and present less regular statistical correlations to the visual by Wallis and Rolls (1997), and a related model with a trace system. 
Foldiak (1991) has proposed computing an average version of the Hebb rule implemented in recurrent collateral activation of the postsynaptic neuron to assist with the connections has been analyzed using the methods of statisti- same problem. The idea here is that the temporal properties cal physics (Parga and Rolls 1998). of the biologically implemented learning mechanism are Another suggestion for the computation of translation such that it is well suited to detecting the relevant continu- invariance is that the image of an object is translated to stan- ities in the world of real objects. Rolls (1992, 1994) has dard coordinates using a circuit in V1 that has connections also suggested that other invariances, for example, size, for every possible translation, and switching on in a multi- spatial frequency, and rotation invariance, could be learned plication operation just the correct set of connections by a comparable process. (Early processing in V1, which (Olshausen, Anderson, and Van Essen 1993). This scheme enables different neurons to represent inputs at different does not appear to be fully plausible biologically, in that all spatial scales, would allow combinations of the outputs of possible sets of connections do not appear to be present (in such neurons to be formed at later stages. Scale invariance the brain), the required multiplier inputs and multiplication would then result from detecting at a later stage which neu- synapses do not appear to be present; and such a scheme rons are almost conjunctively active as the size of an object could perform translation-invariant mapping in one stage, alters.) 
It is proposed that this process takes place at each whereas in the brain it takes place gradually over the whole stage of the multiple-layer cortical-processing hierarchy, so series of visual cortical areas V1, V2, V4, posterior inferior that invariances are learned first over small regions of temporal, and anterior inferior temporal, with an expansion space, and then over successively larger regions. This limits of the receptive field size (and thus of translation invariance) the size of the connection space within which correlations of approximately 2.5 at each stage (see figure 1 and Rolls must be sought. 1992, 1994; Wallis and Rolls 1997; Rolls and Treves 1998). Object Recognition, Human Neuropsychology 615 See also HIGH-LEVEL VISION; MEMORY, ANIMAL STUDIES; Rolls, E. T., G. C. Baylis, and M. E. Hasselmo. (1987). The responses of neurons in the cortex in the superior temporal sul- MID-LEVEL VISION; OBJECT RECOGNITION, HUMAN NEUROP- cus of the monkey to band-pass spatial frequency filtered faces. SYCHOLOGY; VISUAL OBJECT RECOGNITION, AI; VISUAL Vision Research 27: 311–326. PROCESSING STREAMS Rolls, E. T., G. C. Baylis, and C. M. Leonard. (1985). Role of low —Edmund T. Rolls and high spatial frequencies in the face-selective responses of neurons in the cortex in the superior temporal sulcus. Vision Research 25: 1021–1035. References Rolls, E. T., and M. J. Tovee. (1995). Sparseness of the neuronal Abbott, L. A., E. T. Rolls, and M. J. Tovee. (1996). Representa- representation of stimuli in the primate temporal visual cortex. tional capacity of face coding in monkeys. Cerebral Cortex 6: Journal of Neurophysiology 73: 713–726. 498–505. Rolls, E. T., and A. Treves. (1998). Neural Networks and Brain Biederman, I. (1987). Recognition-by-components: A theory of Function. Oxford: Oxford University Press. human image understanding. Psychological Review 94: 115–147. Rolls, E. T., A. Treves, M. Tovee, and S. Panzeri. (1997). Informa- Booth, M. C. A., and E. T. Rolls. 
(1998). View-invariant represen- tion in the neuronal representation of individual stimuli in the tations of familiar objects by neurons in the inferior temporal primate temporal visual cortex. Journal of Computational Neu- visual cortex. Cerebral Cortex 8: 510–523. roscience 4: 309–333. Boussaoud, D., R. Desimone, and L. G. Ungerleider. (1991). Tovee, M. J., E. T. Rolls, and P. Azzopardi. (1994). Translation Visual topography of area TEO in the macaque. Journal of invariance and the responses of neurons in the temporal visual Comparative Neurology 306: 554–575. cortical areas of primates. Journal of Neurophysiology 72: Foldiak, P. (1991). Learning invariance from transformation 1049–1060. sequences. Neural Computation 3: 193–199. Wallis, G., and E. T. Rolls. (1997). Invariant face and object recog- Gross, C. G. (1973). Inferotemporal cortex and vision. Progress in nition in the visual system. Progress in Neurobiology 51: 167– Psychobiology and Physiological Psychology. 5: 77–123. 194. Gross, C. G., R. Desimone, T. D. Albright, and E. L. Schwartz. Weiskrantz, L. (1990). Visual prototypes, memory and the infer- (1985). Inferior temporal cortex and pattern recognition. Exper- otemporal cortex. In E. Iwai and M. Mishkin, Eds., Vision, imental Brain Research 11 (Suppl.) 179–201. Memory and the Temporal Lobe. New York: Elsevier, pp. 13–28. Hasselmo, M. E., E. T. Rolls, G. C. Baylis, and V. Nalwa. (1989). Weiskrantz, L., and R. C. Saunders. (1984). Impairments of visual Object-centered encoding by face-selective neurons in the cor- object transforms in monkeys. Brain 107: 1033–1072. tex in the superior temporal sulcus of the monkey. Experimen- tal Brain Research 75: 417–429. Further Readings Hummel, J. E., and I. Biederman. (1992). Dynamic binding in a Ullman, S. (1996). High-Level Vision: Object Recognition and neural network for shape recognition. Psychological Review 99: Visual Cognition. Cambridge, MA: MIT Press/Bradford Books. 480–517. Koenderink, J. J., and A. J. 
Van Doorn. (1979). The internal repre- sentation of solid shape with respect to vision. Biological Object Recognition, Human Cybernetics 32: 211–216. Neuropsychology Logothetis, N. K., J. Pauls, H. H. Bülthoff, and T. Poggio. (1994). View-dependent object recognition by monkeys. Current Biol- ogy 4: 401–414. Most of what we know about the neural mechanisms of Marr, D. (1982). Vision. San Francisco: W. H. Freeman. object recognition in humans has come from the study of Olshausen, B. A., C. H. Anderson, and D. C. Van Essen. (1993). A agnosia, or impaired object recognition following brain neurobiological model of visual attention and invariant pattern damage. In addition, in recent years, functional neuroimag- recognition based on dynamic routing of information. Journal of Neuroscience 13: 4700–4719. ing in normal humans has begun to offer insights into object Parga, N., and E. T. Rolls. (1998). Transform invariant recognition recognition. This article reviews both literatures, with by association in a recurrent network. Neural Computation 10: greater emphasis accorded to agnosia because of its cur- 1507–1525. rently greater contribution to our understanding of human Perrett, D. I., A. J. Mistlin, and A. J. Chitty. (1987). Visual neurons object recognition. responsive to faces. Trends in Neuroscience 10: 358–364. To be considered agnosia, an object recognition impair- Poggio, T., and S. Edelman. (1990). A network that learns to rec- ment must be selective in the sense of not being attributable to ognize three-dimensional objects. Nature 343: 263–266. impaired elementary perceptual function or general intellec- Rolls, E. T. (1992). Neurophysiological mechanisms underlying tual decline. It must also be a true impairment of recognition, face processing within and beyond the temporal cortical visual as opposed to an impairment of naming. Agnosias are gener- areas. Philosophical Transactions of the Royal Society of Lon- don, Series B: Biological Sciences 335: 11–21. 
ally confined to a single perceptual modality, such as auditory Rolls, E. T. (1994). Brain mechanisms for invariant visual recogni- (Vignolo 1969), tactile (Reed, Caselli, and Farah 1996), or tion and learning. Behavioural Processes 33: 113–138. visual (Farah 1990), suggesting that for each perceptual Rolls, E. T. (1995). Learning mechanisms in the temporal lobe modality there is a stage of processing beyond elementary visual cortex. Behavioural Brain Research 66: 177–185. perceptual processes that is nevertheless modality-specific, Rolls, E. T. (1999). The Brain and Emotion. Oxford: Oxford Uni- and that represents learned information about objects’ sounds, versity Press. tactile qualities, and visual appearances. In the case of visual Rolls, E. T., and G. C. Baylis. (1986). Size and contrast have only agnosia, which is the focus of this article, this stage presum- small effects on the responses to faces of neurons in the cortex ably corresponds to the inferior temporal regions that have of the superior temporal sulcus of the monkey. Experimental been studied using physiological techniques in the monkey Brain Research 65: 38–48. 616 Object Recognition, Human Neuropsychology (see also OBJECT RECOGNITION, ANIMAL STUDIES). The differ- ent types of visual agnosia provide insights into the organiza- tion of high-level visual object representations in humans, by showing us the “fracture lines” of the system. Lissauer (1890) introduced a fundamental distinction between two broad classes of agnosia: those in which per- ception seemed clearly at fault, which he termed appercep- tive, and those in which perception seemed at least roughly intact, which he termed associative. Lissauer hypothesized that the latter type of patient suffered from an inability to associate percepts with meaning. 
Although the theory behind this classification system is now widely questioned, the classification itself—that is, the separation of patients with obvious perceptual disorders from patients without obvious perceptual disorders—has proved useful. The apperceptive agnosias have received less attention than the associative agnosias, perhaps because they are less surprising or counterintuitive. Although such elementary visual functions as acuity and color perception are roughly intact, higher levels of perception such as visual grouping appear to be disrupted, and the object recognition impair- ment is secondary to these perceptual impairments. In this article I focus on the associative agnosias because they are the most directly relevant to object recognition per se. Read- ers may consult chapters 2 and 3 of Farah (1990) for further information on the apperceptive agnosias. In contrast to the apperceptive agnosias, perception seems roughly normal in associative agnosia, and yet patients can- Figure 1. Three drawings that an associative agnosic patient could not recognize much of what they see. Most often, associative not recognize (left) and the good-quality copies that he was agnosia follows bilateral inferior occipitotemporal lesions, nevertheless able to produce (right). although unilateral lesions of either the left or right hemi- sphere are sometimes sufficient (see the final section). The in which associative agnosic patients copy and match is classic case of Rubens and Benson (1971) shows all of the consistent with the use of lower-level visual representations cardinal signs of associative agnosia, including preserved in which objects per se are not explicitly represented: they recognition of objects through modalities other than vision, copy line by line, and match feature by feature, unlike nor- failure to indicate visual recognition verbally and nonver- mal subjects who organize their copying and matching of bally, and apparently good visual perception. 
these local elements into more global, object-based units The patient could not identify common objects presented vi- (see Farah 1990, chaps. 4 and 5 for a review). The left side sually, and did not know what was on his plate until he tasted of figure 1 shows three drawings that an associative agnosic it. He identified objects immediately on touching them. patient was unable to recognize. When I asked him to copy When shown a stethoscope, he described it as “a long cord the drawings, he produced the very adequate copies shown with a round thing at the end,” and asked if it could be a on the right, but only after a laborious, line-by-line process. watch. He identified a can opener as “could be a key. . . .” He was never able to describe or demonstrate the use of an ob- Studies using POSITRON EMISSION TOMOGRAPHY (PET) ject if he could not name it. . . . He could match identical ob- and functional MAGNETIC RESONANCE IMAGING (fMRI) jects but not group objects by categories (clothing, food). He have confirmed the most basic conclusion to be drawn from could draw the outlines of objects which he could not identi- the agnosia literature, namely, that there are visual modal- fy. . . . Remarkably, he could make excellent copies of line ity-specific brain regions in inferior temporo-occipital cor- drawings and still fail to name the subject. tex whose function is object perception. Relative to baselines involving the viewing of gratings, random lines, or The modality-specific failure of object recognition, in the disconnected object pieces, the viewing of objects is gener- context of normal intellect, is what one would expect fol- ally associated with temporal or occipital activation, or lowing destruction of the kinds of object representations both, in both hemispheres (e.g., see Menard et al. 1996; found at higher levels of the primate visual system, and Kanwisher et al. 
1996, 1997; Sergent, Ohta, and MacDonald indeed the most common lesion locations are roughly con- 1992; see Aguirre and Farah 1998 for a review). Further- sistent with this hypothesis (the human lesions are perhaps a more, this localization held both for studies that required bit more posterior). Although the good copies and success- subjects to perform active information retrieval (e.g., is the ful matching performance of associative agnosic patients depicted object living or nonliving?) and for others that might seem inconsistent with the hypothesis of a visual per- required only passive viewing, suggesting that the critical ceptual impairment, these perceptual tasks do not require determinant of the region’s activation is object perception, the use of object representations per se. Indeed, the manner Object Recognition, Human Neuropsychology 617 abstract shapes (Farah and Wallace 1991; Kinsbourne and rather than the association of stored memory knowledge Warrington 1962; Levine and Calvanio 1978; Sekuler and with a percept. Indeed, Kanwisher et al. (1997) found no Behrmann 1996; see Farah and Wallace 1991 for a discus- greater activation for unfamiliar objects than for familiar sion of some apparently conflicting data). Although clinical objects, which have a preexisting memory representation. descriptions suggest that in some cases orthographic stimuli Agnosia does not always affect all types of stimuli may be disproportionately affected, for example, relative to equally. The scope of the deficit varies from case to case, numerical stimuli, this can be understood in terms of segre- with recognition of faces, objects, and printed words all gation of representations for orthographic stimuli within a pairwise dissociable. These dissociations provide us with visual area dedicated to rapid encoding of multiple shapes insights into the internal organization of high-level visual in general (Farah in press). Polk and Farah (1995) describe object representation. 
Similarly, neuroimaging studies have sometimes found differing patterns of activation for different stimulus types (Aguirre and Farah 1998).

When agnosia is confined to faces or is disproportionately severe for faces, it is prosopagnosia. There are many cases in the literature of profound FACE RECOGNITION impairment with little or no evident object agnosia. Pallis (1955) provides a detailed case study of a patient whose impairment of face recognition was so severe that he mistook his own reflection in a mirror for a rude stranger staring at him.

Are faces really disproportionately impaired in prosopagnosia, consistent with a distinct subsystem for face recognition, or does the appearance of a selective deficit result from the need for exceedingly fine discrimination among visually similar members of a single category? Recent evidence suggests that there is specialization within the visual system for faces. McNeil and Warrington (1993) showed that a prosopagnosic patient was better able to recognize individual sheep faces than individual human faces, even though normal subjects find the human faces easier to recognize. Farah, Klein, and Levinson (1995) showed that a prosopagnosic patient was disproportionately impaired at face recognition relative to common object recognition, taking into account the difficulty of the stimulus sets for normal subjects. This was true even when the common objects were all eyeglass frames, a large and visually homogeneous category. Farah, Wilson, Drain, and Tanaka (1995) showed that the same subject was impaired at upright face perception relative to inverted face perception, even though normal subjects find the latter harder. The existence of patients who are more impaired with objects than with faces also supports the independence of prosopagnosia and object agnosia. Feinberg et al. (1994) documented impaired object recognition in a series of patients with preserved face recognition.

Attempts to dissociate face and object perception using neuroimaging have produced variable results, although in at least some studies the patterns of activation were different (Sergent et al. 1992; Kanwisher et al. 1996).

Orthography-specific processing systems? So-called pure alexics typically read words letter by letter, in a slow and generally error-prone manner. Their impairment is called “pure” because they are able to comprehend spoken words, they have no problem writing words, and their recognition of objects and faces seems normal. Although pure alexia is generally discussed in the context of language and reading disorders, it is clearly also an impairment of visual recognition affecting printed words. Furthermore, in all the cases so far examined, the visual recognition impairment is not confined to words, but also affects the processing of nonorthographic stimuli whenever rapid processing of multiple shapes is required, be they letters in words or sets of

and test a mechanism by which such segregation could occur in a self-organizing network, based on the statistics of co-occurrence among letter and nonletter stimuli in the environment.

Just as pure alexia is an impairment of printed word recognition in the absence of obvious impairments of single-object recognition, there are cases of object recognition impairment with preserved reading. For example, the patient described by Gomori and Hawryluk (1984) was impaired at recognizing a variety of objects and the faces of his friends and family. He nevertheless continued to read with ease, even when interfering lines were drawn across the words. Thus, like prosopagnosia and object agnosia, pure alexia and object agnosia are doubly dissociable.

Only one neuroimaging study has directly compared the patterns of activation evoked by printed words and objects, and it did find a degree of separation (Menard et al. 1996).

Localization of high-level object representation in the human visual system: Ironically, although the primary goal of neuroimaging is localization, and agnosia research is subject to the vagaries of naturally occurring lesions, the clearest evidence concerning the anatomy of object recognition comes from patient research. In general, the intrahemispheric location of damage is occipitotemporal, involving both gray and white matter. In order to understand the laterality of visual recognition processes, it is crucial to distinguish between subtypes of agnosia. Cases of associative agnosia have been reported following unilateral right hemisphere lesions, unilateral left hemisphere lesions, and bilateral lesions. The dual systems hypothesis presented above helps reduce the variability in lesion site. Agnosic patients presumed to have an impairment of just the first ability (in mild form affecting just faces, in more severe form affecting faces and objects but not words) usually have bilateral inferior lesions, although occasionally unilateral right hemisphere lesions are reported (Farah 1991). Agnosic patients presumed to have an impairment of just the second ability (in mild form affecting just words, in more severe form affecting words and objects but not faces) generally have unilateral left inferior lesions (Farah 1991; Feinberg et al. 1994). Agnosic patients presumed to have an impairment in both abilities (affecting faces, objects, and words) generally have bilateral lesions.

Aside from confirming the generalization that face, object, and VISUAL WORD RECOGNITION tasks involve posterior cortices, the neuroimaging literature tells us little about the localization of different subtypes of object recognition (Aguirre and Farah 1998). The precise locations of areas responsive to faces, nonface objects, and words differ widely from study to study within posterior association cortex.
Whether this reflects individual variability in brain organization, problems with normalization and statistical procedures for analyzing images of brain activity, or the difference between localizing areas that are activated (as revealed by neuroimaging) vs. areas that are necessary (as revealed by lesions) for visual recognition remains to be discovered.

See also AMYGDALA, PRIMATE; HIGH-LEVEL VISION; MODELING NEUROPSYCHOLOGICAL DEFICITS; SELF-ORGANIZING SYSTEMS; SHAPE PERCEPTION; VISUAL OBJECT RECOGNITION, AI

—Martha J. Farah

References

Aguirre, G. K., and M. J. Farah. (1998). Imaging visual recognition. Trends in Cognitive Sciences, to appear.
Farah, M. J. (1990). Visual Agnosia: Disorders of Object Recognition and What They Tell Us About Normal Vision. Cambridge, MA: MIT Press/Bradford Books.
Farah, M. J. (1991). Patterns of co-occurrence among the associative agnosias: Implications for visual object representation. Cognitive Neuropsychology 8: 1–19.
Farah, M. J. (Forthcoming). Are there orthography-specific brain regions? Neuropsychological and computational investigations. In R. M. Klein and P. A. McMullen, Eds., Converging Methods for Understanding Reading and Dyslexia. Cambridge, MA: MIT Press.
Farah, M. J., K. L. Klein, and K. L. Levinson. (1995). Face perception and within-category discrimination in prosopagnosia. Neuropsychologia 33: 661–674.
Farah, M. J., and M. A. Wallace. (1991). Pure alexia as a visual impairment: A reconsideration. Cognitive Neuropsychology 8: 313–334.
Farah, M. J., K. D. Wilson, H. M. Drain, and J. R. Tanaka. (1995). The inverted inversion effect in prosopagnosia: Evidence for mandatory, face-specific perceptual mechanisms. Vision Research 35: 2089–2093.
Feinberg, T. E., R. J. Schindler, E. Ochoa, P. C. Kwan, and M. J. Farah. (1994). Associative visual agnosia and alexia without prosopagnosia. Cortex 30: 395–411.
Gomori, A. J., and G. A. Hawryluk. (1984). Visual agnosia without alexia. Neurology 34: 947–950.
Kanwisher, N., M. M. Chun, J. McDermott, and P. J. Ledden. (1996). Functional imaging of human visual recognition. Cognitive Brain Research 5: 55–67.
Kanwisher, N., R. Woods, M. Iacoboni, and J. Mazziotta. (1997). A locus in human extrastriate cortex for visual shape analysis. Journal of Cognitive Neuroscience 9: 133–142.
Kinsbourne, M., and E. K. Warrington. (1962). A disorder of simultaneous form perception. Brain 85: 461–486.
Levine, D. N., and R. Calvanio. (1978). A study of the visual defect in verbal alexia-simultanagnosia. Brain 101: 65–81.
Lissauer, H. (1890). Ein Fall von Seelenblindheit nebst einem Beitrag zur Theorie derselben. Archiv für Psychiatrie und Nervenkrankheiten 21: 222–270.
McNeil, J. E., and E. K. Warrington. (1993). Prosopagnosia: A face-specific disorder. Quarterly Journal of Experimental Psychology A. Human Experimental Psychology 46: 1–10.
Menard, M. T., S. M. Kosslyn, W. L. Thompson, N. M. Alpert, and S. L. Rauch. (1996). Encoding words and pictures: A positron emission tomography study. Neuropsychologia 34: 185–194.
Pallis, C. A. (1955). Impaired identification of faces and places with agnosia for colors. Journal of Neurology, Neurosurgery and Psychiatry 18: 218–224.
Polk, T. A., and M. J. Farah. (1995). Brain localization for arbitrary stimulus categories: A simple account based on Hebbian learning. Proceedings of the National Academy of Sciences 92: 12370–12373.
Reed, C. L., R. Caselli, and M. J. Farah. (1996). Tactile agnosia: Underlying impairment and implications for normal tactile object recognition. Brain 119: 875–888.
Rubens, A. B., and D. F. Benson. (1971). Associative visual agnosia. Archives of Neurology 24: 305–316.
Sekuler, E., and M. Behrmann. (1996). Perceptual cues in pure alexia. Cognitive Neuropsychology 13: 941–974.
Sergent, J., S. Ohta, and B. MacDonald. (1992). Functional neuroanatomy of face and object processing. Brain 115: 15–36.
Vignolo, L. A. (1969). Auditory agnosia. In A. L. Benton, Ed., Contributions to Clinical Neuropsychology. Chicago: Aldine.

Oculomotor Control

Eye movements fall into two broad classes. Gaze-stabilization movements shift the lines of sight of the two eyes to precisely compensate for an animal’s self-motion, stabilizing the visual world on the RETINA. Gaze-aligning movements point a portion of the retina specialized for high resolution (the fovea in primates) at objects of interest in the visual world.

In mammals, gaze-stabilization movements are accomplished by two partially independent brain systems. The vestibulo-ocular system employs the inertial velocity sensors attached to the skull (the semicircular canals) to determine how quickly and in what direction the head is moving and then rotates the eyes an equal and opposite amount to keep the visual world stable on the retina. The optokinetic system extracts information from the visual signals of the retina to determine how quickly and in what direction to rotate the eyes to stabilize the visual world.

Gaze-aligning movements also fall into two broad classes: saccades and smooth pursuit movements. Saccadic eye movements rapidly shift the lines of sight of the two eyes, with regard to the head, from one place in the visual world to another at rotational velocities up to 1000°/sec. Smooth pursuit eye movements rotate the eyes at a velocity and in a direction identical to those of a moving visual target, stabilizing that moving image on the retina. In humans and other binocular animals, a third class of gaze-shifting movements, vergence movements, operates to shift the lines of sight of the two eyes with regard to each other so that both eyes can remain fixated on a visual stimulus at different distances from the head.

In humans, all eye movements are rotations accomplished by just six muscles operating in three antagonistic pairs. One pair of muscles located on either side of each eyeball controls the horizontal orientation of each eye. A second pair controls vertical orientation, and a third pair controls rotations of the eye around the line of sight (torsional movements). These torsional movements are actually quite common, though usually less than 10° in amplitude.

These six muscles are controlled by three brain stem nuclei. These nuclei contain the cell bodies for all of the motor neurons that innervate the oculomotor muscles and thus serve as a final common path through which all eye movement control must be accomplished. Engineering models of the eye and its muscles indicate that motor neurons must generate two classes of muscle forces to accomplish any eye rotation: a pulsatile burst of force that regulates the velocity of an eye movement, and a long-lasting increment or decrement in maintained force that, after the movement is complete, holds the eye stationary by resisting the elasticity of the muscles, which would otherwise slowly draw the eye back to a straight-ahead position (Robinson 1964). Physiological experiments have demonstrated that all motor neurons participate in the generation of both of these types of forces.

These two forces, in turn, appear to be generated by separable neural circuits. In the 1960s it was suggested that changes to the long-lasting force required after each eye rotation could be computed from the pulse, or velocity, signal by the mathematical operation of integration. In the 1980s the lesion of a discrete brain area, the nucleus prepositus hypoglossi, was shown to eliminate from the motor neurons the long-lasting force change required for leftward and rightward movements without affecting eye velocity during these movements (Cannon and Robinson 1987). This, in turn, suggested that most or all eye movements are specified as velocity commands and that brain stem circuits involving the nucleus prepositus hypoglossi compute, by integration, the long-lasting force required by a particular velocity command. More recently, a similar circuit has been identified that appears to generate the holding force required for upward, downward, and torsional movements.

The saccadic system, in order to achieve a precise gaze shift, must supply these brain stem circuits with a command that controls the amplitude and direction of a movement. Considerable research now focuses on how this signal is generated. Current evidence indicates that this command can originate in either of two brain structures: the superior colliculus of the midbrain or the frontal eye fields of the neocortex. Both of these structures contain laminar sheets of neurons that code all possible saccadic amplitudes and directions in a topographic maplike organization (Robinson 1972; Wurtz and Goldberg 1972; Bruce and Goldberg 1985). Activation of neurons at a particular location in these maps is associated with a particular saccade, and activation of neurons adjacent to that location is associated with saccades having adjacent coordinates. Lesion experiments indicate that either of these structures can be removed without permanently preventing the generation of saccades. How these signals that topographically encode the amplitude and direction of a saccade are translated into a form appropriate for the control of the oculomotor brain stem is not known. One group of theories proposes that these signals govern a brain stem feedback loop which accelerates the eye to a high velocity and keeps the eye in motion until the desired eye movement is complete (cf. Robinson 1975). Other theories place this feedback loop outside the brain stem or generate saccadic commands without the explicit use of a feedback loop. In any case, it seems clear that the superior colliculus and frontal eye fields are important sources of these signals because if both of these structures are removed, no further saccades are possible (Schiller, True, and Conway 1980). The superior colliculus and frontal eye fields, in turn, receive input from many areas within the VISUAL PROCESSING STREAMS, including the VISUAL CORTEX, as well as the BASAL GANGLIA and brain structures involved in audition and somatosensation. These areas are presumed to participate in the processes that must precede the decision to make a saccade, processes like ATTENTION.

In the smooth pursuit system, signals carrying information about target MOTION are extracted by motion-processing areas in visual cortex and then passed to the dorsolateral pontine nucleus of the brain stem. There, neurons have been identified which code either the direction and velocity of pursuit eye movements, the direction and velocity of visual target motion, or both. These signals proceed to the cerebellum, where neurons have been shown to specifically encode the velocity of pursuit eye movements (Suzuki and Keller 1984). These neurons, in turn, make connections with cells known to be upstream of the nucleus prepositus hypoglossi (the integrator of the oculomotor system described above). As in the saccadic system, the brain stem integrator appears to compute the long-term holding force from this signal and then to pass the sum of these signals to the motor neurons.

All eye movement control signals must pass through the ocular motor neurons, which serve as a final common path. In all cases these neurons carry signals associated both with the instantaneous velocity of the eye and with the holding force required at the end of the movement. Eye movement systems must provide control signals of this type, presumably by first specifying a velocity command from which changes in holding force can be computed. In the case of saccades, this command is produced by brain structures that topographically map all permissible saccades in amplitude and direction coordinates. In the case of pursuit, the brain appears to extract target motion and to use this signal as the oculomotor control input. Together these systems allow humans to redirect the lines of sight to stimuli of interest and to stabilize moving objects on the retina for maximum acuity.

See also ATTENTION IN THE HUMAN BRAIN; EYE MOVEMENTS AND VISUAL ATTENTION

—Paul W. Glimcher

References

Bruce, C. J., and M. E. Goldberg. (1985). Primate frontal eye fields I. Single neurons discharging before saccades. Journal of Neurophysiology 53: 603–635.
Cannon, S. C., and D. A. Robinson. (1987). Loss of the neural integrator of the oculomotor system from brainstem lesions in the monkey. Journal of Neurophysiology 57: 1383–1409.
Robinson, D. A. (1964). The mechanics of human saccadic eye movements. Journal of Physiology 174: 245–264.
Robinson, D. A. (1972). Eye movements evoked by collicular stimulation in the alert monkey. Vision Research 12: 1795–1808.
Robinson, D. A. (1975). Oculomotor control signals. In G. Lennerstrand and P. Bach-y-Rita, Eds., Basic Mechanisms of Ocular Motility and Their Clinical Implications. Oxford: Pergamon Press, pp. 337–374.
Schiller, P. H., S. D. True, and J. L. Conway. (1980). Deficits in eye movements following frontal eye field and superior colliculus ablations. Journal of Neurophysiology 44: 1175–1189.
Suzuki, D. A., and E. L. Keller. (1984). Visual signals in the dorsolateral pontine nucleus of the monkey: Their relationship to smooth pursuit eye movements. Experimental Brain Research 53: 473–478.
Wurtz, R. H., and M. E. Goldberg. (1972). Activity of superior colliculus in the behaving monkey. 3. Cells discharging before eye movements. Journal of Neurophysiology 35: 575–586.

Olfaction

See SMELL

Ontology

See CONCEPTS; KNOWLEDGE REPRESENTATION; MIND-BODY PROBLEM; NATURAL KINDS
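The pulse-step account given in the Oculomotor Control entry (a velocity command whose running integral supplies the holding force) can be illustrated with a minimal numerical sketch. Everything specific below is an assumption made for illustration and is not taken from the entry: the first-order model of the eye, the constants K, R, and DT, the rectangular pulse, and all function names. The integrator_intact flag is only a crude stand-in for the lesion result reported by Cannon and Robinson (1987).

```python
"""Toy pulse-step sketch: a velocity command is integrated to give the holding force.

Assumed for illustration (not from the entry): a first-order eye plant
R * d(theta)/dt + K * theta = drive, the constants below, and all names.
"""

K = 1.0     # assumed elastic stiffness of the plant (force units per degree)
R = 0.15    # assumed viscous resistance; R / K gives a 0.15 s time constant
DT = 0.001  # simulation step in seconds


def velocity_command(pulse_deg_per_s=500.0, pulse_s=0.02, hold_s=0.5):
    """Rectangular velocity pulse followed by zero velocity (one value per step)."""
    n_pulse = round(pulse_s / DT)
    n_hold = round(hold_s / DT)
    return [pulse_deg_per_s] * n_pulse + [0.0] * n_hold


def neural_integrator(velocity):
    """Running integral of the velocity command, sampled before each step."""
    total, out = 0.0, []
    for v in velocity:
        out.append(total)
        total += v * DT
    return out


def motor_neuron_drive(velocity, integrator_intact=True):
    """Pulse term (R * velocity) plus step term (K * integrated velocity)."""
    if integrator_intact:
        step = neural_integrator(velocity)
    else:
        step = [0.0] * len(velocity)  # stand-in for a lesioned integrator
    return [R * v + K * s for v, s in zip(velocity, step)]


def simulate_eye(drive):
    """Euler simulation of the assumed plant; returns eye position in degrees."""
    theta, trace = 0.0, []
    for f in drive:
        theta += DT * (f - K * theta) / R
        trace.append(theta)
    return trace


if __name__ == "__main__":
    cmd = velocity_command()
    intact = simulate_eye(motor_neuron_drive(cmd, integrator_intact=True))
    lesioned = simulate_eye(motor_neuron_drive(cmd, integrator_intact=False))
    print("final position, integrator intact:  ", round(intact[-1], 2))
    print("final position, integrator lesioned:", round(lesioned[-1], 2))
```

In this toy model the eye holds its new position when the step term is present, and drifts back toward straight ahead when it is removed, while the movement during the pulse itself is largely preserved. That qualitative pattern is the only point of the sketch; the numbers carry no physiological authority.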
Optimality Theory

Optimality Theory (“OT,” Prince and Smolensky 1991, 1993) is a theory of LINGUISTIC UNIVERSALS AND UNIVERSAL GRAMMAR. According to OT, the grammars of all human languages share a set of constraints, denoted Con. These constraints are sufficiently simple and general that they conflict in many contexts: they cannot all be satisfied simultaneously. The grammar of an individual language resolves these conflicts: it ranks the universal constraints of Con into a constraint hierarchy, conflicts being resolved in favor of higher-ranked constraints, with each constraint having absolute priority over all lower-ranked constraints. Grammars may differ only in how they rank the universal constraints; the TYPOLOGY of all possible human languages may be computed as the result of all possible rankings of these constraints. An OT analysis explains why some grammatical patterns are possible while others are not. (That a particular language happens to have a particular constraint ranking is not considered a fact to be explained within grammatical theory proper.)

Consider, for example, the difference between the simple English sentence it rains and its Italian counterpart piove—literally, “rains.” What do these sentences reveal about the commonalities and differences between the two grammars? According to the OT analysis of Grimshaw and Samek-Lodovici (1995, 1998), at issue here is a conflict between two constraints—SUBJECT: “Every sentence has a subject,” and FULL-INT(ERPRETATION): “Every element of a linguistic expression contributes to its interpretation.” In English, the conflict is resolved in favor of SUBJECT: to provide a subject, it must appear, even though it has no referent and contributes nothing to the interpretation of the sentence, violating FULL-INT. In Italian, the conflict is resolved the other way: no meaningless subject may appear, and FULL-INT prevails over SUBJECT.

In many other contexts, SUBJECT and FULL-INT do not conflict, and both constraints must be satisfied in both languages. Both constraints are parts of the grammars of both languages, but they do not have equal status: in English, SUBJECT has priority, or dominates; we write: SUBJECT >> FULL-INT. In Italian, the reverse constraint ranking holds. The lower-ranked constraint in each language must be obeyed, except in contexts in which doing so would violate the higher-ranked constraint; in this sense, constraints in OT are minimally violable. OT thus differs from earlier grammatical theories employing inviolable constraints, where any violation of a constraint renders a structure ungrammatical (e.g., RELATIONAL GRAMMAR, LEXICAL FUNCTIONAL GRAMMAR, HEAD-DRIVEN PHRASE STRUCTURE GRAMMAR).

To sketch the broad outline of the OT picture of cross-linguistic variation, we fix attention on a universal constraint C (e.g., FULL-INT). In some languages, C is very highly ranked (e.g., Italian); the effect is that those linguistic structures (e.g., meaningless it) that violate the constraint—those that are marked by it—are altogether banned from the language. In other languages, C is ranked somewhat lower, so that the structures it marks (e.g., it) now appear—but only in those highly restricted contexts in which the marked element is needed to satisfy one of the few constraints more highly ranked than C (e.g., SUBJECT in English). Looking across still more languages, C is ranked lower and lower, so that the structures it marks appear in more and more contexts, as more and more other constraints force violations of C because they outrank it. The OT literature documents many specific cases of this general cross-linguistic pattern, which can be captured entirely by the simple statement: C ∈ Con. Once this has been stated, the rest of the pattern follows from the formal structure of OT: languages differ in how they rank C, and depending on this ranking, those structures marked by C will be either banned altogether (highest ranking), allowed but only in a highly restricted set of contexts, or allowed in a wide range of contexts (lowest ranking).

Each universal constraint C defines a class of dispreferred or marked structures: those that violate it. Through the single mechanism of constraint ranking, such marked elements are banned in some languages, and restricted in their distribution in all languages. OT thus builds on the notion of markedness developed in the 1930s by N. S. Trubetzkoy, Roman JAKOBSON, and others of the Prague Linguistics Circle; OT provides a formal, general markedness-based calculus within the tradition of GENERATIVE GRAMMAR. OT’s formalization of markedness computation brings into sharp focus a number of issues otherwise obscure.

Competition

To say that a linguistic structure S is grammatical in a language L because it optimally satisfies L’s constraint hierarchy is to exploit a comparative property: even though S might not satisfy all the universal constraints, every alternative incurs more serious violations of L’s hierarchy than does S. Specifying an OT grammar includes specifying the candidate sets of linguistic structures that compete for optimality. This specification must be universal, for in OT, only constraint ranking varies across grammars.

Aggregation of Multiple Dimensions of Markedness

What defines optimality when the constraints defining different dimensions of markedness disagree on which candidate is preferred? OT’s answer is constraint ranking. S is optimal if and only if it is more harmonic than all other members S' of its candidate set, written S ≻ S': this means that, of the constraints differentiating the markedness of S and S', S is favored by the highest ranked. It is perhaps surprising that within such a simple mechanism, reranking can succeed in accounting for such a diversity of observed grammatical patterns.

Faithfulness to Targets

Why is it rains optimal, when its violation of FULL-INT could be avoided by selecting another candidate with an interpreted subject, say, John smiles? Implicit thus far in the competition for optimality is the target proposition to which it rains, but not John smiles, is faithful. In OT, each candidate is evaluated relative to a target, faithfulness to which is demanded by constraints in Con collectively called FAITHFULNESS. John smiles is indeed optimal, but for a different target. The multiplicity of grammatical—optimal—structures in a single language arises from the multiplicity of possible targets. In PHONOLOGY, the target is a sequence of phones, an underlying form such as /bat + d/ for the past tense of to bat. Optimal for this target is [batId] “batted”; this includes a vowel ([I]) not present in the target, so it violates a FAITHFULNESS constraint, F. This minimally unfaithful candidate is optimal because of a universal constraint against certain word-final consonant clusters, including td; this constraint is higher ranked than F in the phonological component of the English grammar. That a morpheme (like past-tense /d/) receives different (but closely related) pronunciations, depending on its context, follows in OT from a fixed underlying form for the morpheme, FAITHFULNESS to which is (minimally) violated in many optimal forms, forced by higher-ranked well-formedness constraints governing phones in various contexts. Violability of FAITHFULNESS plays a less obvious role in SYNTAX; Legendre, Smolensky, and Wilson (1998) use it to explain why some syntactic targets have no grammatical expression in a particular language: for such an ineffable target, every faithful candidate violates sufficiently high-ranking constraints that an unfaithful candidate, with a different interpretation, is optimal.

The candidates competing for a target I form a set written Gen(I); I is often called the input, and sometimes the index, of this candidate set. The set of targets and the candidate-generating function Gen are universal.

Implications

A framework employing a novel type of grammatical computation, optimization, OT has cognitive implications for the classic questions of generative grammar that concern the nature of knowledge of language, its use, its acquisition, and its neural realization.

Violable constraints profoundly alter the analytic options in syntactic theory. When a grammatical sentence S appears to violate a putative simple, general, universal constraint C, it becomes possible to simply say that it actually does; with inviolable constraints, it is typically necessary to posit invisible structures that allow S to covertly satisfy C, or to complicate C, often via language-particular parameters, so that it is no longer violated by S. Topics of OT syntactic analyses include grammatical voice alternations (GRAMMATICAL RELATIONS and THEMATIC ROLES), case, ANAPHORA, HEAD MOVEMENT, subject distribution, wh-questions (WH-MOVEMENT), scrambling, and clitic inventories and placement.

In phonological theory, the shift from serial, process-oriented frameworks (PHONOLOGICAL RULES AND PROCESSES) to OT’s parallel, violable constraint optimization has enabled explanation of typological variation in a number of areas: segmental inventories, syllable structure, STRESS, TONE, vowel harmony, reduplicative and templatic MORPHOLOGY, phonology-morphology relations, the phonology-PHONETICS interface, and many others. (For an extensive bibliography and on-line database of OT papers and software, see the Rutgers Optimality Archive ROA at http://ruccs.rutgers.edu/roa.html.)

A unified grammatical framework for syntax and phonology, OT also provides results that span both these modules, including the relation of general to more specific constraints, the compatibility among related grammatical processes, and the computation and learnability of grammars. Formal results on the latter topics address algorithms for learning constraint rankings from positive examples, algorithms for computing optimal forms, and the complexity of formal languages specified by OT grammars. Empirical findings on the course of acquisition of PHONOLOGY in children, and on real-time SENTENCE PROCESSING, have been analyzed within OT. While detailed OT proposals for the neural basis of language and the neural basis of phonology do not currently exist, theoretical connections between optimization in OT and in NEURAL NETWORK models have proved fruitful for the continuing development of both OT and the theory of complex symbol processing in neural networks (Prince and Smolensky 1997).

See also CONNECTIONIST APPROACHES TO LANGUAGE; LANGUAGE, NEURAL BASIS OF; PHONOLOGY, NEURAL BASIS OF

—Paul Smolensky

References

Grimshaw, J., and V. Samek-Lodovici. (1995). Optimal subjects. In J. Beckman, L. Walsh-Dickey, and S. Urbanczyk, Eds., University of Massachusetts Occasional Papers in Linguistics 18: Papers in Optimality Theory. Amherst, MA: GLSA, University of Massachusetts, pp. 589–605.
Grimshaw, J., and V. Samek-Lodovici. (1998). Optimal subjects and subject universals. In P. Barbosa, D. Fox, P. Hagstrom, M. McGinnis, and D. Pesetsky, Eds., Is the Best Good Enough? Papers from the Workshop on Optimality in Syntax. Cambridge, MA: MIT Press and MIT Working Papers in Linguistics.
Legendre, G., P. Smolensky, and C. Wilson. (1998). When is less more? Faithfulness and minimal links in wh-chains. In P. Barbosa, D. Fox, P. Hagstrom, M. McGinnis, and D. Pesetsky, Eds., Is the Best Good Enough? Papers from the Workshop on Optimality in Syntax. Cambridge, MA: MIT Press and MIT Working Papers in Linguistics.
Prince, A., and P. Smolensky. (1991). Notes on Connectionism and Harmony Theory in Linguistics. Technical Report CU-CS-533-91. Boulder, CO: Department of Computer Science, University of Colorado.
Prince, A., and P. Smolensky. (1993). Optimality Theory: Constraint Interaction in Generative Grammar. RuCCS Technical Report 2. Piscataway, NJ: Rutgers Center for Cognitive Science, Rutgers University, and Boulder, CO: Department of Computer Science, University of Colorado.
Prince, A., and P. Smolensky. (1997). Optimality: From neural networks to universal grammar. Science 275: 1604–1610.

Origins of Intelligence

See COGNITIVE ARCHAEOLOGY; EVOLUTIONARY PSYCHOLOGY; INTELLIGENCE; MACHIAVELLIAN INTELLIGENCE HYPOTHESIS

PAC Learning

See COMPUTATIONAL LEARNING THEORY; MACHINE LEARNING
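The ranking mechanism described in the Optimality Theory entry (the optimal candidate is the one favored by the highest-ranked constraint that distinguishes it from each competitor) amounts to a lexicographic comparison of violation profiles, and can be sketched in a few lines. This is a toy illustration under stated assumptions, not an implementation from the OT literature: the two-candidate gen function, the dictionary encoding of candidates, the violation counts, and every function name are hypothetical, and FAITHFULNESS is left out because both toy candidates are treated as equally faithful to the target.

```python
from itertools import permutations


def subject(cand):
    """SUBJECT: one violation if the sentence has no subject."""
    return 0 if cand["has_subject"] else 1


def full_int(cand):
    """FULL-INT: one violation per element that contributes no interpretation."""
    return cand["uninterpreted"]


# Hypothetical stand-in for Con, restricted to the two constraints in the example.
CON = {"SUBJECT": subject, "FULL-INT": full_int}


def gen(verb):
    """Hypothetical candidate set for a weather-verb target: with or without
    a meaningless (expletive) subject."""
    return [
        {"form": "EXPL " + verb, "has_subject": True, "uninterpreted": 1},
        {"form": verb, "has_subject": False, "uninterpreted": 0},
    ]


def profile(cand, ranking):
    """Violation counts ordered from highest-ranked to lowest-ranked constraint."""
    return tuple(CON[name](cand) for name in ranking)


def optimal(verb, ranking):
    """Most harmonic candidate: tuples compare lexicographically, so the
    highest-ranked constraint that distinguishes two candidates decides."""
    return min(gen(verb), key=lambda cand: profile(cand, ranking))


def typology(verb):
    """Optimal form under every possible ranking of the constraints."""
    return {r: optimal(verb, r)["form"] for r in permutations(CON)}


if __name__ == "__main__":
    for ranking, form in typology("rains").items():
        print(" >> ".join(ranking), "->", form)
```

With SUBJECT ranked above FULL-INT the expletive candidate wins, as in English it rains; with the reverse ranking the subjectless candidate wins, as in Italian piove. Enumerating all rankings in typology mirrors, on a very small scale, the idea that the space of possible grammars is the set of possible rankings.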
RuCCS Technical Pain 623 in the perception of pain. For any individual, changes in pain is followed after a substantial delay by second or burn- mood or expectation are similarly important for judging and ing pain, a poorly localized, agonizing pain carried by C reacting to pain. Moreover, thresholds for a stimulus per- fibers. ceived as painful vary across the body surface, as every per- Pain afferents are segregated from other somatosensory son can attest to by comparing how he or she reacts to one afferents that carry discriminative information of touch and grain of sand under the eyelid vs. thousands of grains under body position. They enter the spinal cord in the dorsal root the feet. These variations in the perception of pain and its as a more lateral bundle and synapse directly upon neurons strong affective component make it difficult to study clini- of the cord’s dorsal and intermediate horns. Convergence of cally and experimentally. inputs from many pain afferents at this level produces a situ- The anatomy and physiology of pain (nociception) begin ation amplified at higher levels in which painful stimuli are with two types of specialized receptors in the skin, muscles, localized very poorly when unaccompanied by information and viscera. One responds only to very forceful mechanical from other cutaneous afferents. Further synaptic conver- energy and the other to noxious stimuli of many kinds. The gence at higher levels between cutaneous nociceptors and first of these is the mechanical nociceptor, a type of afferent visceral nociceptors leads to the misplacement of pain that responds only to physical force intense enough to pro- occurring in viscera to sites that are more peripheral. This duce tissue damage. Far more general is the response of referred pain is a common part of abnormal situations, such polymodal nociceptors, as seen by comparing the response as those that occur during heart attacks. 
of the two receptor types to heat: mechanical nociceptors Many spinal neurons driven by nociceptive afferents send have a very high threshold for the initial application of heat, their axons across the spinal cord, where they ascend to vari- whereas polymodal nociceptors begin responding to stimuli ous locations in the brain stem and THALAMUS. That immedi- of 40° C and show a linear increase in response to stimuli up ate crossing in the cord of pain information contrasts with the to 60° C. Both receptor types are notable for the fact that, delayed crossing of fine touch and proprioceptive axons, unlike all other somatosensory receptors, they are left which occurs in the medulla. Such a wide difference in the uncovered by specialized cells or connective tissue sheaths site of crossing produces a situation in which hemisection of and are thus unprotected from the diffusion of chemical the cord leads to a loss of pain sensation on the contralateral agents released by surrounding cells. These agents include a side of the body but loss of discriminative sensation on the variety of small molecules such as amines and peptides that ipsilateral side. Other neurons directly driven by nociceptive can produce or change activity in nociceptors over a dis- afferents have intraspinal axons that end on spinal motor neu- tance of several millimeters. rons. These synapses are part of a rapid reflex that produces Whereas other somatosensory receptors adapt to repeated withdrawal of a limb from the location of a painful stimulus. stimulation by becoming less sensitive and less responsive to Much of the ascending pain information reaches a region each subsequent stimulus, nociceptors participate in a of the midbrain around the cerebral aqueduct of the ventric- heightened response to repeated noxious stimulation, ular system. This periaqueductal gray (PAG) region is a referred to as hyperalgesia. 
Both neural and non-neuronal principal component of a descending system, stimulation of mechanisms appear to participate in this phenomenon by which relieves pain. Axons from the PAG innervate a col- which application of noxious thermal or mechanical stimuli lection of serotonin-synthesizing and -secreting neurons of produces a lower threshold for and a greater response to the medulla, the nucleus raphe magnus. These neurons, in other noxious stimuli. Primary hyperalgesia occurs at the turn, send long, descending axons into the dorsal horn of the site of injury through the local release of chemical agents, spinal cord, where they form synapses with interneurons such as bradykinin, which directly stimulate nociceptors to that use opiate-like peptides, the enkephalins, as NEU- become active. Other chemical agents, including prostaglan- ROTRANSMITTERS. By modulating the activity of nociceptor din E2, also play a role in primary hyperalgesia, not by afferents and of spinal pain neurons, the enkephalinergic directly driving nociceptors, but by making them much more interneurons control the perception of the intensity of a nox- responsive to subsequent non-noxious mechanical stimuli. It ious stimulus. Opiates and other pharmacological agents is through inhibition of these chemical agents that aspirin that mimic the effects of enkephalins are effective as analge- and ibuprofen work as analgesics. A second type of hyperal- sics in part because of their action at these spinal synapses. gesia is of strictly neural origin and probably includes a A fraction of spinal neurons that respond to nociceptive component that originates in the spinal cord rather than in inputs send their axons to the contralateral thalamus. Part of the periphery. 
that spinothalamic system reaches nuclei of the intralaminar The two nociceptor types send their responses into the group, which provides the great mass of the cerebral cortex central nervous system (CNS) by way of different kinds of with a diffuse innervation. That system and the projection of peripheral axons. Larger, lightly myelinated (Aδ) axons end spinal nociceptive neurons to the brain stem reticular forma- as mechanical nociceptors, whereas the smallest, unmyeli- tion are the anatomical substrates for the generally arousing nated (C) axons end as polymodal nociceptors. These differ- and motivating qualities of pain. Most spinothalamic axons, ences in axon diameter and level of myelination necessarily however, synapse on clusters of small cells in the ventral translate into differences of conduction velocity and thus in posterior lateral (VPL) nucleus, which together make up a the time over which signals from the two nociceptor types complete body representation of pain. A comparable group reach the CNS. Pricking pain, carried by Aδ fibers, is the of cells in the ventral posterior medial (VPM) nucleus is more rapidly transmitted, better localized, and more easily innervated by neurons in the pars caudalis of the spinal tolerated component of pain. The perception of pricking trigeminal system and includes a nociceptive representation 624 Parallelism in AI of the face. Both VPL and VPM send axons to the first Parameter-Setting Approaches to Acquisi- somatosensory area of the cerebral cortex, found in the post- tion, Creolization, and Diachrony central gyrus of monkeys, apes, and humans. By this route and by a separate innervation of the second somatosensory area, nociceptive information reaches the CEREBRAL CORTEX How is knowledge of one’s idiolect—I(nternal)-language, in a way that it can be compared with other somatosensory in Noam Chomsky’s (1986) terminology—represented in information and localized with some precision to particular the mind/brain? 
How is such knowledge acquired by chil- sites along the body. Neurological studies of soldiers suffer- dren? Answers to these questions are intricately and con- ing head wounds in the two world wars clearly demonstrate structively related. In the principles and parameters/ that injuries confined to the postcentral gyrus produce per- minimalist approach (Chomsky 1981, 1986, 1995; see SYN- manent analgesia along the contralateral body surface. TAX, ACQUISITION OF and MINIMALISM), linguistic knowl- Where the affective quality to pain arises is poorly under- edge, in addition to a (language-specific) LEXICON (see stood. Those few studies to have addressed the question WORD MEANING, ACQUISITION OF and COMPUTATIONAL have focused on areas of temporal and orbitofrontal cortex LEXICONS), consists of a computational system that is sub- in humans and nonhuman primates. Perhaps the best current ject to an innate set of formal constraints, partitioned into guess is that more than one area of cerebral cortex is principles and parameters. The principles are argued to be involved in the agony that accompanies extreme pain such universal; they formalize constraints obeyed by all lan- as that produced by solid tumors or burns. These extreme guages (see LINGUISTIC UNIVERSALS). Alongside these prin- cases of pain and the need to control them and other lesser ciples—and perhaps within some of these principles (e.g., or more acute nociceptive events often raise questions of the Rizzi 1982)—what allows for diversity in TYPOLOGY (possi- advantage conferred by painful affect. Rare clinical cases of bly, in addition to the lexicon proper) are the parameters. patients who perceive a painful event as differing from an These parameters constitute an innate and finite set of innocuous stimulus but who experience no affect accompa- “switches,” each with a fixed range of settings. These nying that event are test cases for such a question. 
Most of switches give the learner a restricted number of options in these patients die at an early age, victims of numerous determining the complete shape of the attained I-language. destructive wounds and crippling conditions of joints. In such a framework, syntax acquisition reduces to fixing Apparently the failure of these patients to avoid or discon- the values of parameters on the basis of primary linguistic tinue actions that are painful significantly shortens their data (PLD) (cf. LANGUAGE ACQUISITION). Taken together, lives despite intensive training in detecting and responding principles and parameters bring a solution to the “logical to painful stimuli. From these cases, then, it can be con- problem of language acquisition”: cluded that both precise localization and emotional reaction (1) UG / S0 to pain are parts of successful strategies for survival. (universal principles cum UNSET parameters) See also EMOTION AND THE ANIMAL BRAIN; EPIPHENOM- + ENALISM; PHANTOM LIMB; WHAT-IT'S-LIKE PLD / “triggers” — Stewart Hendry = Further Readings Idiolect-Specific Grammar / Sf (universal principles cum SET parameters) Basbaum, A. I., and H. L. Fields. (1984). Endogenous pain control systems: Brainstem spinal pathways and endorphin circuitry. Per (1), language acquisition is the process in which Annual Review of Neuroscience 7: 309–338. exposure to PLD transforms our innately specified faculté Dubner, R., and C. J. Bennett. (1983). Spinal and trigeminal mech- de langage (from an initial state S0) into a language-specific anisms of nociception. Annual Review of Neuroscience 6: 381– grammar (at the final state Sf) by assigning values (settings) 418. to an array of (initially unset) parameters (see Chomsky Light, A. R. (1992). The Initial Processing of Pain and Its Descending Control: Spinal and Trigeminal Systems. Basel: 1981, HISTORICAL LINGUISTICS, and INNATENESS OF LAN- Karger. GUAGE). Perl, E. R. (1984). Pain and nociception. 
In The Handbook Of The schema just sketched delineates a fascinating and Physiology, sec. 1: The Nervous System, vol. 3. Sensory Pro- productive research program. Yet our understanding is still cesses, part 2. Bethesda, MD: American Physiological Society. very incomplete as to how (which aspects of) the PLD “lead” Price, D. D. (1988). Psychological and Neural Mechanisms of the learner to adopt (what) settings for (what) parameters. Pain. New York: Raven Press. What are the major questions raised by this program? In Willis, W. D. (1985). The Pain System: The Neural Basis of Noci- order to flesh out the structure in (1), generativists are ceptive Transmission in the Mammalian Nervous System. advancing on three complementary theoretical fronts toward: Basel: Karger. 1. A characterization of parameters. For example, are Parallelism in AI parameters distributed across various grammatical prin- ciples (cf. Rizzi 1982; Chomsky 1986) or are parameters restricted to “inflectional systems” (Borer 1983; Chom- See sky 1995: ch 2), to the inventory and properties of func- COGNITIVE MODELING, CONNECTIONIST; NEURAL NET- tional heads (Ouhalla 1991; cf. SYNTAX and HEAD WORKS Parameter-Setting Approaches to Acquisition, Creolization, and Diachrony 625 or to (“weak” vs. “strong”) morphological thus, pidgin grammatical morphologies tend to be impover- MOVEMENT), features of functional heads (see Chomsky 1995: ch. 3)? ished; (2) the privileged role of children in creole genesis: in 2. A theory that would delineate what (kinds and amounts acquiring their native language with pidgin PLD, such chil- of) forms from the PLD are used by the learner in what dren appear to stabilize and expand inconsistent, unstable, contexts and at what stages as triggers or cues for assign- and restricted patterns in their PLD. ing particular settings to particular parameters (cf. 
Gib- Regarding the morphologies of contact languages, it was son and Wexler 1994; Lightfoot 1999; Roberts 1999; noted long ago (e.g., by Meillet 1919) that inflectional para- also see further readings below). digms are singularly susceptible to the vagaries of language 3. A proposal as to how (i.e., by what chains of deductions) learning in contact situations. Thus, in a theory where the triggering in (2) takes place. parameter settings are derived from the properties of inflec- Current approaches to 1–3 still remain controversial and tional morphemes (cf. Borer 1983; see e.g., Rohrbacher in need of refinements. Here I discuss one proposal with 1994), the question arises as to what settings are attainable some promise regarding certain (diachronic and ontoge- in an environment that causes attrition to the PLD’s inflec- netic) developmental data. Rohrbacher (1994), following tional morphemes. Bickerton’s proposal is that, in the insights from works by, among others, Borer, Pollock, absence of the relevant (morphological) triggers, certain Platzack and Holmberg, Roberts, and Chomsky, proposes parameters are assigned default settings by UG. Although that “rich” verbal inflections constitute a trigger for verb various creole structures are inherited from the parent lan- displacement (V-raising) because such MORPHOLOGY is guages, contra Bickerton (see e.g., Chaudenson 1992; Lumsden 1999), Bickerton’s intuition seems partly con- listed in the lexicon and inserted in the syntax outside of the firmed in certain domains of creole syntax, for example, for verbal phrase (VP): as affixal heads, such verbal inflection Haitian Creole, in the domains of nonverbal predication induces the verb to raise outside VP via head movement in (sans verbal copulas; DeGraff 1992), and of verb and order for the verb stem to become bound with its affixes. object-pronoun syntax (with adverb-verb-pronoun order; Such verb raising is diagnosed by, for example, placement DeGraff 1994, 1997). 
These are among patterns that distin- of the finite verb to the left of certain adverbs, as in French. guish Haitian Creole both from its European ancestor In languages with “poor” verbal inflection, affixes (if any) (French) and from its West-African ancestors (e.g., Kwa). generally do not behave as independent syntactic elements Much recent work in this vein (see further readings below) that induce V-raising; they are introduced post-syntactically is preliminary and exploratory, but appears to hold promise in the morpho-phonological component. Such proposals for toward understanding the mechanics of parameter-setting the morphology-syntax interface in parameter-setting are and of creole genesis. not unproblematic. Yet they make interesting predictions Beyond creolization, Bickerton’s proposal, once embed- that can (in principle, if not always in practice) be tested ded in a parameter-setting framework, also has ramifica- against phenomena in language acquisition, creolization, tions for the relationship between acquisition and language and language change. In turn, research in these three areas change. Given (1) and some implementation thereof, the has brought forward data and insights that may prove useful language acquisition device with the appropriate parameter- in elucidating parameter setting. setting algorithm is one locus of confluence for creolization Starting with CREOLES, there has been much recent work and language-change data (cf. DeGraff 1999b; Lightfoot by young creolists proceeding from the hunch that paths of 1999). 
As language learners and field and historical lin- creolization may provide much needed hints toward under- guists often experience (the learners more successfully than standing: the mechanics of parameter-setting in acquisition the linguists), “language data do not wear their grammars on and through language change; and whether parameter set- their sleeves,” and parameter values must be fixed anew tings are hierarchically organized along a cline of each time an I-language is attained, that is, at each instantia- (un)markedness (with unmarked settings being those that tion of (1). This is in keeping with Meillet’s (1929: 74) and are attained with few or no triggers of the appropriate type). others’ classic idea that the transmission of language is The hunch that creolization phenomena should shed light on inherently discontinuous: the grammar of each speaker is an parameter setting essentially goes back to Derek Bickerton’s individual re-creation, based on limited evidence. Further- language bioprogram hypothesis. In Bickerton’s hypothesis, more, parameter-setting takes the learner through paramet- creoles by and large manifest the sort of properties a lan- ric configurations distinct from the target grammar(s) (see guage learner would attain in the absence of reliable PLD, e.g., the papers in Roeper and Williams 1987). One must for example, in the presence of the overly impoverished and also remark that PLD sets—as uttered by the learner’s vari- unstable patterns that are typical of speakers in (early) pid- ous model speakers (caretakers, older peers, etc.)—are gin/interlanguage stages. Thus creole settings tend to indi- themselves determined by parameter-setting arrays that, cate values accessible with few or no triggers in the PLD. although overlapping, are in most cases (subtly) distinct Why should this state of affairs hold? from one another idiolect-wise. 
Thus, even in unexceptional For Bickerton, structural similarities across creoles arise instances of acquisition, target grammars, and the final as creole grammars tend to instantiate genetically specified grammars attained by the learners ineluctably diverge, if default values of parameter settings due to: (1) the restricted only along relatively few parameters. Yet such localized nature of the PLD: the source of the PLD that gave rise to parametric shifts are noticeable via the innovative structural the creole is a pidgin that is itself the outcome of adult lan- patterns they give rise to. In any case, it has been claimed guage acquisition under duress, that is, with restricted input (e.g., in DeGraff 1999b) that such innovation is of the same in contact situations that are unfriendly toward the learner; 626 Parameter-Setting Approaches to Acquisition, Creolization, and Diachrony character as that found in creolization and in the early stages Chomsky, N. (1981). Principles and parameters in syntactic theory. In N. Hornstein and D. Lightfoot, Eds., Explanation in Linguis- of language acquisition (modulo the degree of divergence); tics. London: Longman, pp. 123–146. they all are rooted in (1), which is a modern rendering of Chomsky, N. (1986). Knowledge of Language. New York: Praeger. Meillet’s observation about the discontinuity of language Chomsky, N. (1995). The Minimalist Program. Cambridge, MA: transmission. As for the more radical nature of the changes MIT Press. observed in creolization, this stems from the unusual nature DeGraff, M. (1992). The syntax of predication in Haitian. In Pro- of the PLD. ceedings of the 22nd Annual Meeting of the North Eastern Lin- Thus, it should not come as surprise that creolization pat- guistics Society. University of Massachusetts at Amherst: terns (e.g., in Haitian Creole’s verbal syntax and morphol- Graduate Linguistics Students Association. ogy; see DeGraff 1997) present uncanny parallels with: (1) DeGraff, M. (1994). 
To move or not to move? Placement of verbs patterns in language acquisition, as with children who, in and object pronouns in Haitian Creole and in French. In K. Beals, J. Denton, R. Knippen, L. Melnar, H. Suzuki, and E. the initial stages of acquiring V-raising languages like Zeinfeld, Eds., Papers from the 30th Meeting of the Chicago French, (optionally) use noninflected unraised verbs in con- Linguistic Society. Chicago: Chicago Linguistics Society. texts where the target language requires inflected raised DeGraff, M. (1997). Verb syntax in creolization (and beyond). In verbs (Pierce 1992); and (2) patterns in language change, as L. Haegeman, Ed., The New Comparative Syntax. London: for example in the history of English where V-raising in Longman, pp. 64–94. Middle English gave way to V-in-situ in Modern English DeGraff, M., Ed. (1999a). Language Creation and Language with a prior decrease in verbal inflections (Rohrbacher Change: Creolization, Diachrony and Development. Cam- 1994; Vikner 1997; Roberts 1999; Lightfoot 1999). bridge, MA: MIT Press. Results of this sort would then confirm the view that mor- DeGraff, M., Ed. (1999b). Creolization, language change and lan- phology is one major source of syntactic variations and that guage creolization: An epilogue. In M. DeGraff, Ed., Language Creation and Language Change: Creolization, Diachrony and functional categories and their associated morphemes are the Development. Cambridge, MA: MIT Press. locus for parameter-setting. In this view, the learner, unlike Gibson, E., and K. Wexler. (1994). Triggers. Linguistic Inquiry 25 the linguist, need not consult actual “constructions” in order (3): 407–454. to set parameters. Instead, inflectional paradigms (once their Lightfoot, D. (1999). Creoles and cues. In M. 
DeGraff, Ed., Lan- “richness” and frequencies exceed certain thresholds) serve guage Creation and Language Change: Creolization, Diach- as triggers for syntax-related settings such as V-raising vs V- rony and Development. Cambridge, MA: MIT Press. in-situ, possibly alongside syntactic triggers qua robust and Lumsden, J. (1999). Language acquisition and creolization. In M. localized distributional patterns (e.g., verb-negation/adverb DeGraff, Ed., Language Creation and Language Change: Cre- orders; see, e.g., Roberts 1999; Lightfoot 1999). (As noted olization, Diachrony and Development. Cambridge, MA: MIT by Rohrbacher 1994: 274, the inflectional paradigms may be Press. Meillet, A. (1919). Le genre grammaticale et l’élimination de la key because they must be learned anyway.) In absence of rel- flexion. Scientia. (Rivista di Scienza) 25, 86, 6. Reprinted in atively copious morphological (and syntactic) triggers, the Meillet (1958), Linguistique Historique et Linguistique learner initially falls back on default options (e.g., V-in-situ) Générale, tome 1, pp. 199–211. as in the earliest stages of acquisition and in the linguistic Meillet, A. (1929). Le développement des langues. In Continu et environments that produced Haitian Creole and Modern Discontinu. Reprinted in Meillet (1951), Linguistique His- English—and other languages that lost V-raising through torique et Linguistique Générale, tome 2, pp. 70–81. language contact. Ouhalla, J. (1991). Functional Categories and Parametric Varia- The hypothesis sketched above regarding parameter-set- tion. London: Routledge. ting (in verbal syntax) has been much debated; see, inter Pierce, A. (1992). Language Acquisition and Syntactic Theory: A alios, Vikner 1997 for counterexamples, and Thráinsson Comparative Analysis of French and English Child Grammars. Dordrecht: Kluwer. 1996 and Lightfoot 1999 for alternative proposals. Yet, to Platzack, C., and A. Holmberg. (1989). 
The role of Agr and finite- advance our understanding of parameter-setting within cur- ness. Working Papers in Scandinavian Syntax 43: 51–76. rent (provisional) assumptions in syntax (particularly mini- Pollock, J.-Y. (1989). Verb movement, Universal Grammar and the malism), one may ask whether morphological triggering (or structure of IP. Linguistic Inquiry 20: 365–424. any other triggering that relies on narrowly defined, easily Rizzi, L. (1982). Issues in Italian Syntax. Dordrecht: Foris. accessible paradigms) must be the null hypothesis in any the- Roberts, I. (1999). Verb movement and markedness. In M. ory that both assumes “constructions” to be epiphenomenal DeGraff, Ed., Language Creation and Language Change: Cre- (see Chomsky 1995) and potentially ambiguous parameter- olization, Diachrony and Development. Cambridge, MA: MIT wise (e.g., Gibson and Wexler 1994), and seeks to solve the Press. logical problem of language acquisition by keeping learning Roeper, T., and E. Williams, Eds. (1987). Parameter Setting. Dor- drecht: Reidel. and induction from PLD to a strict minimum (see POVERTY Rohrbacher, B. (1994). The Germanic VO Languages and the Full OF THE STIMULUS ARGUMENTS). Paradigm: A Theory of V to I raising. PhD. diss., University of Massachusetts. —Michel DeGraff Thráinsson, H. (1996). On the (non-)universality of functional cat- egories. In W. Abraham, S. Epstein, and H. Thráinson, Eds., References Minimal Ideas. Amsterdam: Benjamins, pp. 253–281. Vikner, S. (1997). V0-to-I0 movement and inflection for person in Borer, H. (1983). Parametric Syntax. Dordrecht: Foris. all tenses. In L. Haegeman, Ed., The New Comparative Syntax. Chaudenson, R. (1992). Des Iles, des Hommes, des Langues. Paris: London: Longman, pp. 189–213. L’Harmattan. Parsimony and Simplicity 627 Further Readings Weinreich, U. (1953). Languages in Contact. Publications of the Linguistic Circle of New York, no. 1. Reprint: The Hague: Adone, D. (1994). The Acquisition of Mauritian Creole. 
Amster- Mouton (1968). dam: Benjamins. Wekker, H., Ed. (1996). Creole Languages and Language Acquisi- Adone, D., and A. Vainikka. (1999). Acquisition of WH-Questions tion. Berlin: Mouton. in Mauritian Creole. In M. DeGraff, Ed., Language Creation Wexler, K. (1994). Finiteness and head movement in early child and Language Change: Creolization, Diachrony and Develop- grammars. In D. Lightfoot and N. Hornstein, Eds., Verb Move- ment. Cambridge, MA: MIT Press. ment. Cambridge: Cambridge University Press. Arends, J., P. Muysken, and N. Smith. (1994). Pidgins and Cre- oles: An Introduction. Amsterdam: Benjamins. Parsimony and Simplicity Baptista, M. (1997). The Morpho-Syntax of Nominal and Verbal Categories in Capeverdean Creole. Ph.D. diss., Harvard Uni- versity. The law of parsimony, or Ockham’s razor (also spelled Bobaljik, J. (1995). Morphosyntax: The Syntax of Verbal Inflec- “Occam”), is named after William of Ockham (1285–1347/ tion. Ph.D. diss., MIT. Distributed by MIT Working Papers in 49). His statement that “entities are not to be multiplied Linguistics. beyond necessity” is notoriously vague. What counts as an Bruyn, A., P. Muysken, and M. Verrips. (1999). Double object con- entity? For what purpose are they necessary? Most agree structions in the creole languages: Development and acquisi- that the aim of postulated entities is to represent reality, or to tion. In M. DeGraff, Ed., Language Creation and Language “get at the truth” in some sense. But when are entities postu- Change: Creolization, Diachrony and Development. Cam- lated beyond necessity? bridge, MA: MIT Press. The role of parsimony and simplicity is important in all Clark, R., and I. Roberts. (1993). A computational model of lan- forms of INDUCTION, LEARNING, STATISTICAL LEARNING guage learnability and language change. Linguistic Inquiry 24: 299–345. THEORY, and in the debate about RATIONALISM VS. EMPIRI- DeGraff, M. (1993). A riddle on negation in Haitian. Probus 5: 63– CISM. 
93.
DeGraff, M. (1995). On certain differences between Haitian and French predicative constructions. In J. Amastae, G. Goodall, M. Montalbetti, and M. Phinney, Eds., Contemporary Research in Romance Linguistics. Amsterdam: Benjamins, pp. 237–256.
DeGraff, M. (1996a). Creole languages and parameter setting: A case study using Haitian Creole and the pro-drop parameter. In H. Wekker, Ed., Creole Languages and Language Acquisition. Berlin: Mouton de Gruyter, pp. 65–105.
DeGraff, M. (1996b). UG and acquisition in pidginization and creolization. Open peer commentary on Epstein et al. (1996). Behavioral and Brain Sciences 19(4): 723–724.
DeGraff, M., Ed. (1999). Language Creation and Language Change: Creolization, Diachrony and Development. Cambridge, MA: MIT Press.
Déprez, V. (1999). The roots of negative concord in French and French-based creoles. In M. DeGraff, Ed., Language Creation and Language Change: Creolization, Diachrony and Development. Cambridge, MA: MIT Press.
Dresher, E. (Forthcoming). Charting the learning path: Cues to parameter setting. Linguistic Inquiry (to appear).
Fodor, J. (1995). Fewer but better triggers. CUNY Forum 19: 39–64.
Hymes, D., Ed. (1971). Pidginization and Creolization of Languages. Cambridge: Cambridge University Press.
Lightfoot, D. (1991). How to Set Parameters: Arguments from Language Change. Cambridge, MA: MIT Press.
Lightfoot, D. (1995). Why UG needs a learning theory: Triggering verb movement. In A. Battye and I. Roberts, Eds., Clause Structure and Language Change. New York: Oxford University Press, pp. 31–52.
Mufwene, S. (1996). The Founder Principle in creole genesis. Diachronica 13(1): 83–134.
Roberts, I. (1993). Verbs and Diachronic Syntax: A Comparative History of English and French. Dordrecht: Kluwer.
Veenstra, T. (1996). Serial Verbs in Saramaccan: Predication and Creole Genesis. Ph.D. diss., University of Amsterdam. Distributed by Holland Academic Graphics, The Hague.
Vrzić, Z. (1997). A minimalist approach to word order in Chinook Jargon and the theory of creole genesis. In B. Bruening, Ed., Proceedings of the Eighth Student Conference in Linguistics. Cambridge, MA: MIT Working Papers in Linguistics, 31.

However, Ockham’s razor is better known in scientific theorizing, where it has two aspects. First, there is the idea that one should not postulate entities that make no observable difference. For example, Gottfried Wilhelm Leibniz objected to Isaac Newton’s absolute space because one absolute velocity of the solar system would produce the same observable behavior as any other absolute velocity.

The second aspect of Ockham’s razor is that the number of postulated entities should be minimized. One of the earliest known examples came when Copernicus (1473–1543) argued in favor of his stationary-sun theory of planetary motion on the grounds that it endowed one cause (the motion of the earth around the sun) with many effects (the apparent motions of the planets). In contrast, his predecessor (Ptolemy) unwittingly duplicated the Earth’s motion many times in order to “explain” the same effects. Newton’s version of Ockham’s razor appeared in his first and second rules of philosophizing: “We are to admit no more causes of natural things than such as are both true and sufficient to explain their appearances. Therefore, to the same natural effects we must, as far as possible, assign the same causes.”

Both aspects arise in modern empirical science, including psychology. When data are modeled by an equation or set of equations, the equations are usually designed so that all theoretical parameters can be uniquely estimated from the data using standard statistical estimation techniques (like the method of least squares, or maximum likelihood estimation). When this condition is satisfied, the parameters are said to be identifiable. This practice ensures the satisfaction of Ockham’s razor in its first aspect. For example, suppose we have a set of data consisting of n pairs of (x, y) values: $\{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$. The model $y_i = a + b x_i$, for $i = 1, 2, \ldots, n$, is identifiable because the parameters a and b can be uniquely estimated from sufficiently varied data. But a model like $y_i = a + (b + c) x_i$ is not identifiable, because many pairs of values of b and c fit the data equally well. Different parameter values make no empirical difference.

Normally, this desideratum is so natural and commonsensical that nonidentifiable models are not used. B. F. Skinner resisted the introduction of intervening variables in BEHAVIORISM for this reason. However, they do arise in NEURAL NETWORKS (see also COGNITIVE MODELING, CONNECTIONIST). In the simplest possible two-layered network, one would have one input neuron, or node, with an activation x, one hidden node, with activation y, and an output node, with activation z, where the output is a function of the hidden node activation, which is in turn a function of the input activation. In a simple linear network, this would mean that $z = a y$ and $y = b x$, where a and b are the connection weights between the layers. The connection weights are the parameters of the model. But the hidden activations are not observed, and so the only testable consequence of the model is the input-output function, $z = (ab)x$. Different pairs of values of the parameters a and b lead to the same input-output function. Therefore, the model is not identifiable.

Perhaps the more difficult problem is to understand how to draw the line in cases in which extra parameters make a difference, but only a very small one. This is the second aspect of Ockham’s razor. For example, how do we select between competing models like $y = a + b x_1$ and $y = a + b x_1 + c x_2$, where a, b, and c are adjustable parameters that range over a set of possible values? Each equation represents a different model, which may be thought of as a family of curves. Under one common notion of simplicity, the first model is simpler than the second model because it has fewer adjustable parameters. Simplicity is measured by the size, or dimension, of the family of curves. (Note that, in this definition, models are of greater or lesser simplicity, but all curves are equally simple because their equations have zero adjustable parameters.)

How does one decide when an additional parameter makes “enough” of an empirical difference to justify its inclusion, or when an additional parameter is “beyond necessity”? If the choice is among models that fit the data equally well (where the fit of a model is given by the fit of its best case), then the answer is that simplicity should break the tie. But in practice, competing models do not fit equally well. For instance, when one model is a special case of, or nested in, another (as in the previous example), the more complex model will always fit better (if only because it is able to better fit the noise in the data).

So, the real question is: How much better must the complex model fit before we say that the extra parameter is necessary? Or, when should the better fit of the complex model be “explained away” as arising from the greater tendency of complex models to fit noise? How do we trade off fit with simplicity? That is the motivation for standard significance testing in statistics. Notice that significance testing does not always favor the simpler model. Nor is the practice motivated by any belief in the simplicity of nature. In fact, when enough data accumulate, the choice will eventually favor the complex model even if the added parameters have very small (but nonzero) values.

In recent years, there have been many new model selection criteria developed in statistics, all of which define simplicity in terms of the paucity of parameters, or the dimension of a model (see Forster and Sober 1994 for a nontechnical introduction). These include Akaike’s Information Criterion (AIC; Akaike 1974, 1985), the Bayesian Information Criterion (BIC; Schwarz 1978), and MINIMUM DESCRIPTION LENGTH (MDL; Rissanen 1989). They trade off simplicity and fit a little differently, but all of them address the same problem as significance testing: Which of the estimated “curves” from competing models best represents reality? This work has led to a clear understanding of why this form of simplicity is relevant to that question.

However, the paucity of parameters is a limited notion. It does not mark a difference in simplicity between a wiggly curve and a straight curve. Nor does it capture the idea of simpler theories having fewer fundamental principles or laws. Nor does it reward the repeated use of equations of the same form. A natural response is to insist that there must be other kinds of simplicity or unification that are relevant to theory choice. But there are well-known problems in defining these alternative notions of simplicity (e.g., Priest 1976; Kitcher 1976). Moreover, there are no precise proposals about how these notions of simplicity are traded off with fit. Nor are there any compelling ideas about why such properties should count in favor of one theory being closer to the truth than another.

See also EXPLANATION; JUSTIFICATION; SCIENTIFIC THINKING AND ITS DEVELOPMENT; SIMILARITY; UNITY OF SCIENCE

—Malcolm R. Forster

References

Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, vol. AC-19: 716–723.
Akaike, H. (1985). Prediction and entropy. In A. C. Atkinson and S. E. Fienberg, Eds., A Celebration of Statistics. New York: Springer, pp. 1–24.
Forster, M. R., and E. Sober. (1994). How to tell when simpler, more unified, or less ad hoc theories will provide more accurate predictions. British Journal for the Philosophy of Science 45: 1–35.
Kitcher, P. (1976). Explanation, conjunction and unification. Journal of Philosophy 73: 207–212.
Priest, G. (1976). Gruesome simplicity. Philosophy of Science 43: 432–437.
Rissanen, J. (1989). Stochastic Complexity in Statistical Inquiry. Singapore: World Books.
Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics 6: 461–465.

Further Readings

Forster, M. R. (1995). The curve-fitting problem. In R. Audi, Ed., The Cambridge Dictionary of Philosophy. Cambridge University Press.
Forster, M. R. (1999). The new science of simplicity. In H. A. Keuzenkamp, M. McAleer, and A. Zellner, Eds., Simplicity, Inference and Econometric Modelling. Cambridge: Cambridge University Press.
Gauch, H. G., Jr. (1993). Prediction, parsimony and noise. American Scientist 81: 468–478.
Geman, S., E. Bienenstock, and R. Doursat. (1992). Neural networks and the bias/variance dilemma. Neural Computation 4: 1–58.
Jefferys, W., and J. Berger. (1992). Ockham’s razor and Bayesian analysis. American Scientist 80: 64–72.
Popper, K. (1959). The Logic of Scientific Discovery. London: Hutchinson.
Sakamoto, Y., M. Ishiguro, and G. Kitagawa. (1986). Akaike Information Criterion Statistics. Dordrecht: Kluwer.
Sober, E. (1990). Let’s razor Ockham’s razor. In D. Knowles, Ed., Explanation and Its Limits. Royal Institute of Philosophy supp. vol. 27. Cambridge: Cambridge University Press, pp. 73–94.

Parsing

See PSYCHOLINGUISTICS; SENTENCE PROCESSING

Pattern Recognition and Feedforward Networks

A feedforward network can be viewed as a graphical representation of a parametric function which takes a set of input values and maps them to a corresponding set of output values (Bishop 1995). Figure 1 shows an example of a feedforward network of a kind that is widely used in practical applications.

Figure 1. A feedforward network having two layers of adaptive parameters.

Vertices in the graph represent either inputs, outputs, or “hidden” variables, while the edges of the graph correspond to the adaptive parameters. We can write down the analytic function corresponding to this network as follows. The output of the jth hidden node is obtained by first forming a weighted linear combination of the d input values $x_i$ to give

$$a_j = \sum_{i=1}^{d} u_{ji} x_i + b_j. \tag{1}$$

The value of hidden variable j is then obtained by transforming the linear sum in (1) using an activation function $g(\cdot)$ to give

$$z_j = g(a_j). \tag{2}$$

Finally, the outputs of the network are obtained by forming linear combinations of the hidden variables to give

$$a_k = \sum_{j=1}^{M} v_{kj} z_j + c_k. \tag{3}$$

The parameters $\{u_{ji}, v_{kj}\}$ are called weights while $\{b_j, c_k\}$ are called biases, and together they constitute the adaptive parameters in the network. There is a one-to-one correspondence between the variables and parameters in the analytic function and the nodes and edges, respectively, in the graph.

Historically, feedforward networks were introduced as models of biological neural networks (McCulloch and Pitts 1943), in which nodes corresponded to neurons and edges corresponded to synapses, and with an activation function $g(a)$ given by a simple threshold. The recent development of feedforward networks for pattern recognition applications has, however, proceeded largely independently of any biological modeling considerations.

The goal in pattern recognition is to use a set of example solutions to some problem to infer an underlying regularity which can subsequently be used to solve new instances of the problem. Examples include handwritten digit recognition, medical image screening, and fingerprint identification. In the case of feedforward networks, the set of example solutions (called a training set) comprises instances of input values together with corresponding desired output values. The training set is used to define an error function in terms of the discrepancy between the predictions of the network for given inputs and the desired values of the outputs given by the training set. A common example of an error function would be the squared difference between desired and actual output, summed over all outputs and summed over all patterns in the training set. The learning process then involves adjusting the values of the parameters to minimize the value of the error function. Once the network has been trained, that is, once suitable values for the parameters have been determined, new inputs can be applied and the corresponding predictions (i.e., network outputs) calculated.

The use of layered feedforward networks for pattern recognition was widely studied in the 1960s. However, effective learning algorithms were only known for the case of networks in which, at most, one of the layers comprised adaptive interconnections. Such networks were known variously as perceptrons (Rosenblatt 1962) and adalines (Widrow and Lehr 1990), and were seriously limited in their capabilities (Minsky and Papert 1969/1990). Research into artificial NEURAL NETWORKS was stimulated during the 1980s by the development of new algorithms capable of training networks with more than one layer of adaptive parameters (Rumelhart, Hinton, and Williams 1986). A key development involved the replacement of the nondifferentiable threshold activation function by a differentiable nonlinearity, which allows gradient-based optimization algorithms to be applied to the minimization of the error function. The second key step was to note that the derivatives could be calculated in a computationally efficient manner using a technique called backpropagation, so called because it has a graphical interpretation in terms of a propagation of error signals from the output nodes backward through the network. Originally these gradients were used in simple steepest-descent algorithms to minimize the error function. More recently, however, this has given way to the use of more sophisticated algorithms, such as conjugate gradients, borrowed from the field of nonlinear optimization (Gill, Murray, and Wright 1981).

During the late 1980s and early 1990s, research into feedforward networks emphasized their role as function approximators. For example, it was shown that a network consisting of two layers of adaptive parameters could approximate any continuous function from the inputs to the outputs with arbitrary accuracy provided the number of hidden units is sufficiently large and provided the network parameters are set appropriately (Hornik, Stinchcombe, and White 1989). More recently, however, feedforward networks have been studied from the much richer probabilistic perspective (see PROBABILITY, FOUNDATIONS OF), which sets neural networks firmly within the field of statistical pattern recognition (Fukunaga 1990). For instance, the outputs of the network can be given a probabilistic interpretation, and the role of network training is then to model the probability distribution of the target data, conditioned on the input variables. Similarly, the minimization of an error function can be motivated from the well-established principle of maximum likelihood that is widely used in statistics. An important advantage of this probabilistic viewpoint is that it provides a theoretical foundation for the study and application of feedforward networks (see STATISTICAL LEARNING THEORY), as well as motivating the development of new models and new learning algorithms.

A central issue in any pattern recognition application is that of generalization, in other words, the performance of the trained model when applied to previously unseen data. It should be emphasized that a small value of the error function for the training data set does not guarantee that future predictions will be similarly accurate. For example, a large network with many parameters may be capable of achieving a small error on the training set, and yet fail to model the underlying distribution of the data and hence achieve poor performance on new data (a phenomenon sometimes called “overfitting”). This problem can be approached by limiting the complexity of the model, thereby forcing it to extract regularities in the data rather than simply memorizing the training set. From a fully probabilistic viewpoint, learning in feedforward networks involves using the network to define a prior distribution over functions, which is converted to a posterior distribution once the training data have been observed. This view can be formalized through the framework of BAYESIAN LEARNING, or equivalently through the MINIMUM DESCRIPTION LENGTH approach (MacKay 1992; Neal 1996).

In practical applications of feedforward networks, attention must be paid to the representation used for the data. For example, it is common to perform some kind of preprocessing on the raw input data (perhaps in the form of “feature extraction”) before they are used as inputs to the network. Often this preprocessing takes into consideration any prior knowledge we might have about the desired properties of the solution. For instance, in the case of digit recognition we know that the identity of the digit should be invariant to the position of the digit within the input image.

Feedforward neural networks are now well established as an important technique for solving pattern recognition problems, and indeed there are already many commercial applications of feedforward neural networks in routine use.

See also COGNITIVE MODELING, CONNECTIONIST; CONNECTIONIST APPROACHES TO LANGUAGE; MCCULLOCH; NEURAL NETWORKS; PITTS; RECURRENT NETWORKS

—Christopher M. Bishop

References

Anderson, J. A., and E. Rosenfeld, Eds. (1988). Neurocomputing: Foundations of Research. Cambridge, MA: MIT Press.
Bishop, C. M. (1995). Neural Networks for Pattern Recognition. Oxford: Oxford University Press.
Fukunaga, K. (1990). Introduction to Statistical Pattern Recognition. 2nd ed. San Diego: Academic Press.
Gill, P. E., W. Murray, and M. H. Wright. (1981). Practical Optimization. London: Academic Press.
Hornik, K., M. Stinchcombe, and H. White. (1989). Multilayer feedforward networks are universal approximators. Neural Networks 2(5): 359–366.
MacKay, D. J. C. (1992). A practical Bayesian framework for back-propagation networks. Neural Computation 4(3): 448–472.
McCulloch, W. S., and W. Pitts. (1943). A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics 5: 115–133. Reprinted in Anderson and Rosenfeld (1988).
Minsky, M. L., and S. A. Papert. (1969/1990). Perceptrons. Cambridge, MA: MIT Press.
Neal, R. M. (1996). Bayesian Learning for Neural Networks. Lecture Notes in Statistics 118. Springer.
Rosenblatt, F. (1962). Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms. Washington, DC: Spartan.
Rumelhart, D. E., G. E. Hinton, and R. J. Williams. (1986). Learning internal representations by error propagation. In D. E. Rumelhart, J. L. McClelland, and the PDP Research Group, Eds., Parallel Distributed Processing: Explorations in the Microstructure of Cognition, vol. 1: Foundations. Cambridge, MA: MIT Press, pp. 318–362. Reprinted in Anderson and Rosenfeld (1988).
Widrow, B., and M. A. Lehr. (1990). 30 years of adaptive neural networks: Perceptron, madaline, and backpropagation. Proceedings of the IEEE 78(9): 1415–1442.

Further Readings

Anderson, J. A., A. Pellionisz, and E. Rosenfeld, Eds. (1990). Neurocomputing 2: Directions for Research. Cambridge, MA: MIT Press.
Arbib, M. A. (1987). Brains, Machines, and Mathematics. 2nd ed. New York: Springer-Verlag.
Bishop, C. M. (1994). Neural networks and their applications. Review of Scientific Instruments 65(6): 1803–1832.
Bishop, C. M., P. S. Haynes, M. E. U. Smith, T. N. Todd, and D. L. Trotman. (1995). Real-time control of a tokamak plasma using neural networks. Neural Computation 7: 206–217.
Block, H. D. (1962). The perceptron: A model for brain functioning. Reviews of Modern Physics 34(1): 123–135. Reprinted in Anderson and Rosenfeld (1988).
Broomhead, D. S., and D. Lowe. (1988). Multivariable functional interpolation and adaptive networks. Complex Systems 2: 321–355.
Duda, R. O., and P. E. Hart. (1973). Pattern Classification and Scene Analysis. New York: Wiley.
Geman, S., E. Bienenstock, and R. Doursat. (1992). Neural networks and the bias/variance dilemma. Neural Computation 4(1): 1–58.
Hertz, J., A. Krogh, and R. G. Palmer. (1991). Introduction to the Theory of Neural Computation. Redwood City, CA: Addison-Wesley.
Jacobs, R. A., M. I. Jordan, S. J. Nowlan, and G. E. Hinton. (1991). Adaptive mixtures of local experts. Neural Computation 3(1): 79–87.
Jordan, M. I., and C. M. Bishop. (1997). Neural networks. In A. B. Tucker, Ed., The Computer Science and Engineering Handbook. Boca Raton, FL: CRC Press, pp. 536–556.
Le Cun, Y., B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, and L. D. Jackel. (1989). Backpropagation applied to handwritten zip code recognition. Neural Computation 1(4): 541–551.
Lowe, D., and A. R. Webb. (1990). Exploiting prior knowledge in network optimization: An illustration from medical prognosis. Network: Computation in Neural Systems 1(3): 299–323.
MacKay, D. J. C. (1995). Bayesian non-linear modelling for the 1993 energy prediction competition. In G. Heidbreder, Ed., Maximum Entropy and Bayesian Methods, Santa Barbara 1993. Dordrecht: Kluwer.
MacKay, D. J. C. (1995). Probable networks and plausible predictions—a review of practical Bayesian methods for supervised neural networks. Network: Computation in Neural Systems 6(3): 469–505.

Penfield, Wilder

Our knowledge about the organization of the CEREBRAL CORTEX is derived, in part, from the search for a therapeutic intervention for a particular disease: epilepsy. Wilder Penfield (1891–1976), stimulated by his postgraduate work with Otfrid Foerster, a pioneer in the development of modern neurosurgical procedures to relieve seizures in epileptic patients, began a prolonged scientific study of the surgical treatment of epilepsy at McGill University in 1928. By 1934, Penfield had founded the Montreal Neurological Institute (MNI), which he served as director until his retirement in 1960. Penfield was soon joined by others, including Herbert Jasper, who introduced the EEG to the operating room, and by D. O. HEBB and Brenda Milner, who introduced the idea of systematic neuropsychological assessment of surgical patients. The foundation and the establishment of the endowment for the MNI, or “Neuro,” which has become an international center for training, research, and treatment related to the brain and diseases of the nervous system, may be Penfield’s most lasting legacy. The idea of a neurological hospital, integrated with a multidisciplinary brain research complex, providing a center where a multidisciplinary team of both scientists and physicians might study the brain, has served as a model for the establishment of similar units throughout the world.

By the mid-1930s, Penfield and his colleagues were employing electrical stimulation and systematic mapping techniques adapted from physiological work with animals. These procedures were employed to aid in excising those regions of brain tissue that served as the focus of epileptic activity in patients whose seizures were not adequately controlled by available drugs. Electrical brain stimulation provided the “gold standard” by which the functional properties of brain regions might be determined. This Montreal Procedure was used both to localize the epileptogenic tissue itself and to minimize surgical damage by first mapping critical motor, somatosensory, and language-related brain tissue by applying brief, low-voltage electrical current through thin wire electrodes to sites on the cortical surface of the brains of fully conscious human patients. It was then noted which parts of the body moved, or what bodily sensations were reported in response to each stimulus. By the late 1930s, Penfield and his coworkers had created the first systematic maps of both the human primary motor and somatosensory cortex. Their data indicated there was a point-to-point relation between parts of the body and these neocortical regions (i.e., that motor and somatosensory cortex were both somatotopically organized) and that these distributions of the body surface were distorted, leading to the construction of his famous sensory and motor homunculi: stylized cartoons of the body surface with the relative prominence of different body parts reflecting the extent of their representation in the cortex. A sensorimotor integrative conception of brain organization was also promoted by his finding that 25 percent of the stimulation points yielding sensory experiences were located in precentral motor cortical regions. Subsequent investigation of nonhuman subjects led to identification of analogous maps or representations of visual and auditory external worlds and, together with Penfield’s own mapping work, helped shape our view of cortical organization for decades. While his original work suggested that there were several somatosensory cortical representations of the external environment, it was not until the late 1970s that more refined anatomical and physiological techniques revealed dozens of maps in each modality, rather than just one or two.

Careful study of hundreds of patients by Penfield and his coworkers (and more recently by George Ojemann and his colleagues at the University of Washington) also provided clear evidence of cerebral asymmetry or HEMISPHERIC SPECIALIZATION. For example, the pooled data from many patients yielded the first direct confirmation of conclusions inferred from previous postmortem correlations, by establishing a map of language-related zones of the left hemisphere that included not only the traditional areas of Paul BROCA and Carl Wernicke, but also the supplementary speech zone. Stimulation in these regions of the left hemisphere usually arrested (or initiated) speech during the stimulation period or produced other forms of language interference such as misnaming and impaired word repetition, whereas stimulation of the right hemisphere seldom did. Penfield’s research also furnished other evidence that did not support traditional localizationist models of language. For example, stimulation of anterior and posterior speech zones had remarkably similar effects on speech function, and the extent of these cortical language zones varied considerably among patients.

Accounts of apparent awaking of long-lost childhood and other memories by temporal lobe epileptics during electrical stimulation of the region were recorded by Penfield in the 1930s. These data, together with evidence provided in the neuropsychological studies of such patients after surgery by Brenda Milner and others, made it clear to Penfield that the medial temporal region, including the HIPPOCAMPUS, was of special importance in respect to human MEMORY (and emotion).

Penfield’s early observations on seizures arising from deep midline portions of the brain also had an important impact on the development of ideas about the neural substrate of CONSCIOUSNESS. In 1938 he proposed a “centrencephalic” system that stressed the role of the upper brain stem in the integration of higher functions. In arguing that consciousness is more closely related to the brainstem than the cortex, he foreshadowed Moruzzi and Magoun’s (1949) conception about the role of the midbrain reticular formation. “Consciousness,” he later wrote, “exists only in association with the passage of impulses through ever-changing circuits between the brainstem and cortex. One can not say that consciousness is here or there. But certainly without centrencephalic integration, it is nonexistent.” Penfield’s lifelong search for a better understanding of the functional organization of the brain and its disorders during epileptic seizures is symbolized by this hypothesis of the central integrating mechanism. Never localized in any specific area of gray matter, but “in wider-ranging mechanisms,” it represented a conceptual bridge he envisaged between brain and mind (cf. MIND-BODY PROBLEM).

See also CONSCIOUSNESS, NEUROBIOLOGY OF; CORTICAL LOCALIZATION, HISTORY OF

—Richard C. Tees

References

Moruzzi, G., and H. W. Magoun. (1949). Brain stem reticular formation and activation of EEG. EEG and Clinical Neurophysiology 1: 455–473.
Penfield, W. (1938). The cerebral cortex in man: 1. The cerebral cortex and consciousness. Archives of Neurology and Psychiatry 40: 417–442.

Further Readings

Finger, S. (1994). The Origins of Neuroscience. New York: Oxford University Press.
Hebb, D. O. (1949). The Organization of Behavior: A Neuropsychological Theory. New York: Wiley.
Hebb, D. O., and W. Penfield. (1940). Human behavior after extensive bilateral removals from the frontal lobe. Archives of Neurology and Psychiatry 44: 421–438.
Penfield, W. (1958). The Excitable Cortex in Conscious Man. Springfield, IL: Charles C. Thomas.
Penfield, W., and E. Boldrey. (1937). Somatic motor and sensory representation in the cerebral cortex of man as studied by electrical stimulation. Brain 60: 389–443.
Penfield, W., and H. Jasper. (1954). Epilepsy and the Functional Anatomy of the Human Brain. Boston: Little Brown.
Penfield, W., and B. Milner. (1958). Memory deficit produced by bilateral lesions in the hippocampal zone. Archives of Neurology and Psychiatry 79: 475–498.
Penfield, W., and T. Rasmussen. (1950). The Cerebral Cortex of Man. New York: MacMillan.
Penfield, W., and L. Roberts. (1959). Speech and Brain Mechanisms. Princeton, NJ: Princeton University Press.

Perception

See HAPTIC PERCEPTION; HIGH-LEVEL VISION; MID-LEVEL VISION; PERCEPTUAL DEVELOPMENT

Perception of Motion

See MOTION, PERCEPTION OF

Perceptrons

See COMPUTING IN SINGLE NEURONS; NEURAL NETWORKS; PATTERN RECOGNITION AND FEEDFORWARD NETWORKS; RECURRENT NETWORKS

Perceptual Development

Just a century ago it was widely believed that the world perceived by newborn infants was, in the words of WILLIAM JAMES, a “blooming, buzzing confusion.” In the decades since then, developmental research has demonstrated dramatically that James’s view was erroneous. The shift in view was prompted by research from various domains. In the 1930s, Piaget’s detailed descriptions of his infant children and Gesell’s charting of infants’ motor milestones created a climate of interest in infants as research subjects and in developmental questions. The work of ethologists studying the behavior of animals in their natural habitats (COMPARATIVE PSYCHOLOGY) paved the way for careful observations of spontaneous activity in even the youngest animals. Observation of spontaneous activity ran counter to theories of stimulus-response (S-R) chaining and the radical BEHAVIORISM fashionable in the 1930s; at the same time, it inspired the design of new methods for studying infants, including methods for asking what infants perceive.

By the 1960s, methods for studying infant perception had multiplied as psychologists exploited infants’ natural exploratory behaviors, especially looking. Preferential looking at one of two displays, habituation to one display followed by a new display, and a paired comparison test of old and new displays were highly effective methods for studying visual discrimination of simple contrasting properties and even more complex patterns. Spontaneous exploratory behavior was also the basis for research methods, particularly operant conditioning of responses such as sucking, head turning, or moving a limb. Methods which provide infants with opportunities to control their environment (e.g., operant conditioning, infant-controlled habituation) were shown to be more effective than methods without consequences for changing behavior (Horowitz et al. 1972). Psychologists found that they could also investigate what is perceived utilizing natural actions in controlled experimental situations, such as reaching for objects varying in bulk or attainability, and locomotion across surfaces varying in rigidity, pitfalls, obstacles, and slope. Methods borrowed from physiological research, including heart rate and electrophysiological responses, have been used effectively in studying sensitivity to change in stimulus dimensions. These measures, along with psychophysical procedures, have revealed impressive discriminatory abilities in very young infants. (See VISION AND LEARNING; AUDITION; TASTE; and SMELL.) Researchers are now discovering the precursors of some of these competencies during the fetal period of prenatal development.

Major research topics include development of perception of events, the persistent properties of objects, and the larger layout of surfaces. Five key points that emerge are the following:

1. The perception of events is prospective or forward-looking. One example is infants’ differential response to approaching obstacles and apertures, the so-called looming studies (Yonas, Petterson, and Lockman 1979). Infants respond with defensive blinking and head retraction to approaching objects, but not to approaching apertures or to withdrawing objects. Studies of neonates reaching out to catch moving objects provide another compelling demonstration of anticipatory perception as skilled reaching develops (Hofsten 1983). Infants also anticipate occurrence of environmental happenings, for example, by looking toward the locus of a predictable event (Haith 1993).

2. Motion is important for revealing the persistent properties of events, objects, and layout of the world. A striking example is the perception of biological motion when visual information is minimized. When spots of light are placed on key joints (e.g., elbows, ankles, hips), and all other illumination is eliminated, observers immediately perceive a person engaging in a uniquely specified activity, such as walking, dancing, or lifting a heavy box, but only when the actor is moving. Infants differentiate these biological motion displays from inverted displays and from spots of light moving in a random fashion (Bertenthal 1993). Motion makes possible pickup of information about social (communicative) events, such as smiling and talking, at an early age. The role of motion is also critical in visual detection of constant properties of objects, such as unity, size, shape (SHAPE PERCEPTION), and substance. At four months of age, infants perceive the unity of an object despite partial occlusion by another object, provided that the occluded object is in motion (Kellman and Spelke 1983). Constant size of an object is given in changes in distance relative to self and background, and neonates appear to detect size as invariant (Slater, Mattock, and Brown 1990). Shape constancy, invariant over changes in orientation of an object, is perceived by five months (Gibson et al. 1979). Rigidity or elasticity of substance is differentiated via mouthing in neonates (Rochat 1983), and visually by five months (Gibson and Walker 1984). Methods of visual habituation and observation of grasping skills converge on evidence for perceiving these properties. Such convergence is not surprising, since exploration of object properties is naturally multimodal. Surface properties of objects, such as color (see COLOR VISION), are not necessarily dependent on motion, but texture of an object’s surface is accessed by haptic as well as visual information and is differentiated early in the first year (SURFACE PERCEPTION; see also HAPTIC PERCEPTION).

3. Perception is multimodally unified (MULTISENSORY INTEGRATION). From the earliest moments of life, infants orient to sounds, particularly human voices, and they engage in active visual exploration of faces and sounding objects (Gibson 1988). In fact, infants as young as five months can match the sounds and visible motions of faces and voices in a bimodal matching task (Kuhl and Meltzoff 1982; Walker 1982). Infants can also match the sounds and visible motions of object events (Spelke 1976), evidently perceiving a unified event. At one month, infants appear to detect and unify haptic and visual information for object substance (Gibson and Walker 1984).

4. Properties of the larger layout are made available multimodally as motor skills and new action patterns develop. Experience and practice play an important role in this development. How far away things are must be perceived in units of body scale by infants. Observation of the hands, in relation to surrounding objects, occurs spontaneously within the first month (van der Meer, van der Weel, and Lee 1995). When reaching for objects emerges as a skill, judging not only the distance of an object but its size improves rapidly. Information for the major properties of the layout is best accessed when babies begin locomotion. While recognition of obstacles, approaching objects, and surface properties is not unprepared, experience in traversing the ground surface brings new lessons. Crawling infants tend to avoid a steep drop in the surface of support (Gibson and Walk 1960). The affordance of falling is perceived early in locomotor history, but becomes more dependable with experience in locomotion. Properties of the surface of support that afford locomotion (its rigidity, smoothness, slope, etc.) are detected by experienced crawling infants. They learn to cope effectively with steep slopes by avoiding them or adopting safe methods of travel, but the same infants as novice upright walkers attempt dangerous slopes and must learn new strategies (Adolph 1997). Bipedal locomotion requires extensive adjustments of perceptual and locomotor skills, as infants learn a new balancing act, using multimodal information from ankles, joints, and visual cues provided by flow patterns created by their own movements. Novice walkers fall down in a “moving room,” despite a firm and stable unmoving ground surface (Lee and Aronson 1974). Flow patterns created by the room’s motion give false information that they are falling forward or backward. Perceiving the world entails coperception of the self; in this case, via visual information from perspective changes in the room’s walls in relation to vestibular information about one’s own upright posture.

5. Infants perceive the SELF as a unit distinct from the world (see SELF-KNOWLEDGE). By four to five months, infants watch their own legs moving currently on a television screen, contrasted with views of similarly clad legs of another infant or their own at an earlier moment (Bahrick and Watson 1985).
They reliably prefer to gaze at the novel display rather than their own ongoing movements. However, introduction of a target that can be kicked changes the preference to monitoring the ongoing self kicking at the target (Morgan and Rochat 1995). An opportunity for making contact with an object provides motivation for controlling the encounter. Considerable other research in a contingent reinforcement situation (e.g., kicking to rotate a mobile) confirms infants’ perception of a self in control. Disruption of control results in frustration and emotional disturbance (Lewis, Sullivan, and Brooks-Gunn 1985).

Early reaction to the explosion of knowledge about the perceptual abilities of young infants was a burst of astonished admiration (“Aren’t babies wonderful?”), and little concern was given to how development progresses, although previously popular Piagetian views were questioned. Three current views vary in their assumptions about processes involved in perceptual development. Two are construction theories: (1) The information processing view assumes that bare sensory input is subject to cognitive processing that constructs meaningful perception. (2) The nativist view assumes that rules about order governing events in the world are inherently given and used to interpret observed events. (3) The third view combines an ecological approach to perception and a systems view. Infants actively seek information that comes to specify identities, places, and affordances in the world. Processes that influence development are the progressive growth and use of action systems, and learning through experience. Perceptual learning is viewed as a selective process, beginning with exploratory activity, leading to observation of consequences, and to selection based on two criteria, an affordance fit and reduction of uncertainty, exemplified by detection of order and unity in what is perceived.

We know much less about perceptual development after the first two years. After infancy, perceptual development takes place mainly in complex tasks such as athletic skills, tool use, way-finding, steering vehicles, using language, and READING, all tasks in which experience and learning become more and more specialized (cf. COGNITIVE DEVELOPMENT). Theoretical applications to specialized tasks involving perceptual learning can be profitable (Abernathy 1993).

See also AFFORDANCES; ECOLOGICAL PSYCHOLOGY; IMITATION; INFANT COGNITION; NATIVISM; PIAGET

—Eleanor J. Gibson, Marion Eppler, and Karen Adolph

References

Abernathy, B. (1993). Searching for the minimal information for skilled perception and action. Psychological Research 55: 131–138.
Adolph, K. E. (1997). Learning in the Development of Infant Locomotion. Monographs of the Society for Research in Child Development. Serial No. 251, vol. 62, no. 3.
Bahrick, L. E., and J. S. Watson. (1985). Detection of intermodal proprioceptive-visual contingency as a potential basis of self-perception in infancy. Developmental Psychology 21: 963–973.
Bertenthal, B. (1993). Infants’ perception of biological motions. In C. Granrud, Ed., Visual Perception and Cognition in Infancy. Hillsdale, NJ: Erlbaum, pp. 175–214.
Gibson, E. J. (1988). Exploratory behavior in the development of perceiving, acting and the acquiring of knowledge. Annual Review of Psychology 39: 1–41.
Gibson, E. J., and R. D. Walk. (1960). The “visual cliff.” Scientific American 202: 64–71.
Gibson, E. J., and A. S. Walker. (1984). Development of knowledge of visual-tactual affordances of substance. Child Development 55: 453–460.
Gibson, E. J., C. J. Owsley, A. S. Walker, and J. S. Megaw-Nyce. (1979). Development of the perception of invariants: Substance and shape. Perception 5: 609–619.
Haith, M. M. (1993). Future-oriented processes in infancy: The case of visual expectations. In C. Granrud, Ed., Visual Perception and Cognition in Infancy. Hillsdale, NJ: Erlbaum, pp. 235–264.
Hofsten, C. von (1983). Catching skills in infancy. Journal of Experimental Psychology: Human Perception and Performance 2: 75–85.
Horowitz, F., L. Paden, W. Bhana, and P. Self. (1972). An “infant control” procedure for studying infant visual fixations. Developmental Psychology 7: 90.
Kellman, P. J., and E. Spelke. (1983). Perception of partly occluded objects in infancy. Cognitive Psychology 15: 483–524.
Kuhl, P. K., and A. N. Meltzoff. (1982). The bimodal perception of speech in infancy. Science 218: 1138–1141.
Lee, D. N., and E. Aronson. (1974). Visual proprioceptive control of standing in human infants. Perception and Psychophysics 15: 529–532.
Lewis, M., M. W. Sullivan, and J. Brooks-Gunn. (1985). Emotional behavior during the learning of a contingency in early infancy. British Journal of Developmental Psychology 3: 307–316.
Morgan, R., and P. Rochat. (1995). The perception of self-produced leg movements in self- vs. object-oriented contexts by 3- to 5-month-old infants. In B. C. Bardy, R. J. Bootsma, and Y. Guiard, Eds., Studies in Perception and Action 3. Hillsdale, NJ: Erlbaum.
Rochat, P. (1983). Oral touch in young infants: Response to variations of nipple characteristics in the first month of life. International Journal of Behavioral Development 6: 123–133.
Slater, A., A. Mattock, and E. Brown. (1990). Size constancy at birth: Newborn infants’ responses to retinal and real size. Journal of Experimental Child Psychology 49: 314–322.
Spelke, E. S. (1976). Infants’ intermodal perception of events. Cognitive Psychology 8: 533–560.
van der Meer, A. L. H., F. R. van der Weel, and D. N. Lee. (1995). The functional significance of arm movements in neonates. Science 267: 693–695.
Walker, A. S. (1982). Intermodal perception of expressive behavior by human infants. Journal of Experimental Child Psychology 23: 514–535.
Yonas, A., L. Petterson, and J. Lockman. (1979). Young infants’ sensitivity to optical information for collision. Canadian Journal of Psychology 33: 1285–1290.

Further Readings

Bertenthal, B., and R. K. Clifton. (1997). Perception and action. In D. Kuhn and R. Siegler, Eds., Handbook of Child Psychology, vol. 2: Cognition, Perception and Language. New York: Wiley, pp. 51–102.
Bertenthal, B. (1996). Origins and early development of perception, action, and representation. Annual Reviews of Psychology 47: 431–459.
Fantz, R. L. (1961). The origins of form perception. Scientific American 204: 66–72.
Gesell, A. (1928). Infancy and Human Growth. New York: Macmillan.
Gibson, E. J. (1969). Principles of Perceptual Learning and Development. New York: Appleton-Century-Crofts.
Gibson, E. J., and E. S. Spelke. (1983). The development of perception. In P. H. Mussen, Ed., Handbook of Child Psychology, 4th ed., vol. 3: Cognitive Development. New York: Wiley, pp. 1–76.
Johansson, G. (1973). Visual perception of biological motion and a model for its analysis. Perception and Psychophysics 14: 201–211.
Piaget, J. (1952). The Origins of Intelligence in Children. New York: International Universities Press.
Teller, D. Y. (1979). The forced-choice preferential looking procedure: A psychophysical technique for use with human infants. Infant Behavior and Development 2: 135–153.

Phantom Limb

Phantom limbs occur in 95 to 100 percent of amputees who lose an arm or leg. The phantom is usually described as having a tingling feeling and a definite shape that resembles the somatosensory experience of the physical limb before amputation. It is reported to move through space in much the same way as the normal limb would move when the person walks, sits down, or stretches out on a bed. At first, the phantom limb feels perfectly normal in size and shape, so much so that the amputee may reach out for objects with the phantom hand, or try to step onto the floor with the phantom leg. As time passes, however, the phantom limb begins to change shape. The arm or leg becomes less distinct and may fade away altogether, so that the phantom hand or foot seems to be hanging in midair. Sometimes, the limb is slowly “telescoped” into the stump until only the hand or foot remains at the stump tip.

Amputation is not essential to the occurrence of a phantom. After avulsion of the brachial plexus of the arm, without injury to the arm itself, most patients report a phantom arm that is usually extremely painful. Even nerve destruction is not necessary. About 95 percent of patients who receive an anesthetic block of the brachial plexus for surgery of the arm report a vivid phantom, usually at the side or over the chest, which is unrelated to the position of the real arm when the eyes are closed but “jumps” into it when the patient looks at the arm. Similarly, a spinal anesthetic block of the lower body produces reports of phantom legs in most patients, and total section of the spinal cord at thoracic levels leads to reports of a phantom body, including genitalia and many other body parts, in virtually all patients.

The most astonishing feature of the phantom limb is its “reality” to the amputee, which is enhanced by wearing an artificial arm or leg; the prosthesis feels real, “fleshed out.” Amputees in whom the phantom leg has begun to “telescope” into the stump, so that the foot is felt to be above floor level, report that the phantom fills the artificial leg when it is strapped on and the phantom foot now occupies the space of the artificial foot in its shoe. The reality of the phantom is reinforced by the experience of details of the limb before amputation. For example, the person may feel a painful bunion that had been on the foot or even a tight ring on a phantom finger.

Phantoms of other body parts feel just as real as limbs do. Heusner describes two men who underwent amputation of the penis. One of them, during a four-year period, was intermittently aware of a painless but always erect phantom penis. The other man had severe PAIN of the phantom penis. Phantom bladders and rectums have the same quality of reality. The bladder may feel so real that patients, after a bladder removal, sometimes complain of a full bladder and even report that they are urinating. Patients with a phantom rectum may actually feel that they are passing gas or feces. Menstrual cramps may continue to be felt after a hysterectomy. A painless phantom breast, in which the nipple is the most vivid part, is reported by about 25 percent of women after a mastectomy and 13 percent feel pain in the phantom.

The reality of the phantom body is evident in paraplegic patients who suffer a complete break of the spinal cord. Even though they have no somatic sensation or voluntary movement below the level of the break, they often report that they still feel their legs and lower body. The phantom appears to inhabit the body when the person’s eyes are open and usually moves coordinately with visually perceived movements of the body. Initially, patients may realize the dissociation between the two when they see their legs stretched out on the road after an accident yet feel them to be over the chest or head. Later, the phantom becomes coordinate with the body, and dissociation is rare.

Descriptions given by amputees and paraplegic patients indicate the range of qualities of experience of phantom body parts. Touch, pressure, warmth, cold, and many kinds of pain are common. There are also feelings of itch, tickle, wetness, sweatiness, and tactile texture. Even the experience of fatigue due to movement of the phantom limb is reported. Furthermore, male paraplegics with total spinal sections report feeling erections, and paraplegic women describe sexual sensations in the perineal area. Both describe feelings of pleasure, including orgasms.

A further striking feature of the phantom limb or any other body part, including half of the body in many paraplegics, is that it is perceived as an integral part of one’s SELF. Even when a phantom foot dangles “in midair” (without a connecting leg) a few inches below the stump, it still moves appropriately with the other limbs and is unmistakably felt to be part of one’s body-self. The fact that the experience of “self” is subserved by specific brain mechanisms is demonstrated by the converse of a phantom limb: the denial that a part of one’s body belongs to one’s self. Typically, the person, after a lesion of the right parietal lobe or any of several other brain areas, denies that a side of the body is part of himself.

There is convincing evidence that a substantial number of children who are born without all or part of a limb feel a vivid phantom of the missing part. The long-held belief that phantoms are experienced only when an amputation has occurred after the age of six or seven years is not true. Phantoms are experienced by about 20 percent of children who are born without all or part of a limb (congenital limb deficiency), and 20 percent of these children report pain in their phantom. Persons with congenital limb deficiency sometimes perceive a phantom for the first time after minor surgery or an injury of the stump when they are adults.

The innate neural substrate implied by these data does not mean that sensory experience is irrelevant. Learning obviously plays a role because persons’ phantoms often assume the shape of the prosthesis, and persons with a deformed leg or a painful corn may report, after amputation, that the phantom is deformed or has a corn. That is, sensory inputs play an important role in the experience of the phantom limb. Heredity and environment clearly act together to produce the phenomena of phantom limbs.

See also ILLUSIONS; SENSATIONS; WHAT-IT’S-LIKE

—Ronald Melzack

Further Readings

Bors, E. (1951). Phantom limbs of patients with spinal cord injury. Archives of Neurology and Psychiatry 66: 610–631.
Heusner, A. P. (1950). Phantom genitalia. Transactions of the American Neurological Association 75: 128–131.
Katz, J. (1993). The reality of phantom limbs. Motivation and Emotion 17: 147–179.
Katz, J., and R. Melzack. (1990). Pain “memories” in phantom limbs: Review and clinical observations. Pain 43: 319–336.
Lacroix, R., R. Melzack, D. Smith, and N. Mitchell. (1992). Multiple phantom limbs in a child. Cortex 28: 503–507.
Melzack, R. (1989). Phantom limbs, the self and the brain. The D. O. Hebb Memorial Lecture. Canadian Psychology 30: 1–16.
Melzack, R., and P. R. Bromage. (1973). Experimental phantom limbs. Experimental Neurology 39: 261–269.
Melzack, R., R. Israel, R. Lacroix, and G. Schultz. (1997). Phantom limbs in people with congenital limb deficiency or amputation in early childhood. Brain to appear.
Melzack, R., and J. D. Loeser. (1978). Phantom body pain in paraplegics: Evidence for a central “pattern generating mechanism for pain.” Pain 4: 195–210.
Mesulam, M.-M. (1981). A cortical network for directed attention and unilateral neglect. Annals of Neurology 10: 309–325.
Riddoch, G. (1941). Phantom limbs and body shape. Brain 64: 197–222.
Saadah, E. S. M., and R. Melzack. (1994). Phantom limb experiences in congenital limb-deficient adults. Cortex 30: 479–485.
Sherman, R. A. (1997). Phantom Pain. New York: Plenum Press.

Philosophical Issues in Linguistics

See LINGUISTICS, PHILOSOPHICAL ISSUES

Philosophy of Mind

See INTRODUCTION: PHILOSOPHY

Phonetics

Speech is the most common medium by which language is transmitted. Phonetics is the science or study of the physical aspects of speech events. It has a long history. For centuries, phonetic descriptions of particular languages have been undertaken to preserve or reconstruct their pronunciation. Phonetics developed rapidly in the nineteenth century in connection with spelling reform, language (pronunciation) teaching, speech training for the deaf, stenographic shorthands, and the study of historical sound changes. Today phonetics is an interdisciplinary field combining aspects of linguistics, psychology (including perception and motor control), computer science, and engineering. However, in the United States especially, “phonetics” is commonly considered a subfield of linguistics, and “speech science” or “speech” is the more general term.

A common question among linguists and nonlinguists alike is, “What is the difference between phonetics and phonology?” One answer is that phonetics is concerned with actual physical properties that are measured or described with some precision, whereas phonology is concerned with (symbolic) categories. For example, the phonology of a language might describe and explain the allowed sequences of consonants in the language, and the phonetics of the language would describe and explain the physical properties of a given consonant in these different allowed sequences. Nonetheless, even though phonetics deals with physical properties, it is just as much concerned with linguistic knowledge as with behavior. It is not the case that phonology can be identified with “competence” and phonetics with “performance,” for example.

Phonetics is usually divided into three areas: speech production, acoustics, and perception. Phoneticians want to understand not only how speech is produced, perceived, and acoustically structured, but how these mechanisms shape the sound systems of human languages. What is the range of speech sounds found in human languages? Why do languages prefer certain sounds and combinations of sounds? How does speech convey linguistic structure to listeners? These are some of the key questions in phonetics. Phoneticians stress that only if the array of sounds used across languages is studied can complete models of speech be developed.

Speech production is the basis of traditional phonetic transcription systems such as the International Phonetic Alphabet (IPA), as well as some phonological feature systems (Catford 1977; Ladefoged 1993; Laver 1994; see PHONOLOGY and DISTINCTIVE FEATURES). The components of speech production are the airstream mechanism (the process by which air flow for speech is initiated), phonation or voicing (production of a sound source by the vibrating vocal cords inside the larynx; this is the most important sound source in speech), ARTICULATION (modification of the phonation sound source, and introduction of additional sound sources, by the movements of articulators), and the oro-nasal process (modification of the sound source by the flow of air through the nose). Speech sounds are traditionally described as combinations of these components. On the other hand, PROSODY (the suprasegmental variation of loudness, length, and pitch that makes some segments, or larger groupings of segments, more prominent or otherwise set off from others) is traditionally described not in speech-production terms but more in terms of the dimensions of amplitude, duration, and frequency (Lehiste 1970; see STRESS, LINGUISTICS; TONE).

Speech acoustics concerns the properties of speech transmitted from speaker to hearer. Speech sounds are usually described in terms of their prominent frequency components and the durations of intervals within each sound. The source-filter theory of speech production and acoustics (Fant 1960; Stevens, forthcoming) describes speech as the result of modifying acoustic sources by vocal-tract filter functions. The acoustic sources in speech include phonation (described above), noise produced in the larynx (such as for aspiration and breathiness), and noise produced by air flowing through a constriction anywhere in the vocal tract (such as for a fricative sound or after the release of a stop). Every speech sound must involve one or more such sources. The source(s) is then modified by the filter function of the vocal tract. The most important aspect of this filtering is that the airways of the vocal tract have particular resonances, called formants, which serve to enhance any corresponding frequencies in a source. The resonance frequencies depend on the size and shape of the airway, which in turn depend on the positions of all the articulators; thus, as the articulators move during speech, the formant frequencies are varied.

At the same time, phonetics can be divided into practical, experimental, and theoretical approaches. Practical phonetics concerns skills of aural transcription and oral production of speech sounds, usually in conjunction with a descriptive system like the IPA. Different levels of phonetic transcription are traditionally recognized. In a phonemic transcription, the only phonetic symbols used are those representing the phonemes, or basic sounds, of the language being transcribed. In an allophonic transcription, additional symbols are used, to represent more detailed variants (or allophones) of the phonemes. The more such detail included, the narrower the allophonic transcription.

Experimental phonetics is based on the use of laboratory equipment. Laboratory techniques (see Hardcastle and Laver 1997) are generally needed to understand exactly how some sound is produced and to detail its acoustic and/or perceptually relevant properties. When experimental phonetic methods are used to answer questions of interest to phonologists, it is sometimes called “laboratory phonology.”

Certain acoustic measurements of speech sounds have become common, especially since the advent of the sound spectrograph, which produces visual displays (spectrograms) of frequency and intensity over time. The most common frequency measurements are the frequencies of vowel formants, of the formant transitions between consonants and vowels, and of the fundamental frequency of phonation. Changes in the source, and the durations of intervals with different sources, are also important in speech; for example, the duration between the release of a stop and the onset of voicing (an interval filled with aspiration) is called voice onset time (VOT). There is now a wide range of the world’s languages receiving detailed experimental descriptions in terms of these and other measures (e.g., Ladefoged and Maddieson 1996), although of course there remain many languages with unusual sounds whose production, acoustics, and/or perception are not well understood. Furthermore, manipulation of such measures to provide a range of artificial speech stimuli is an important tool of SPEECH PERCEPTION, in determining which acoustic properties matter most to listeners. Experimental phonetics also contributes to the development of speech technology (see SPEECH SYNTHESIS and SPEECH RECOGNITION IN MACHINES).

Theoretical phonetics is concerned not only with theories of speech production, acoustics, and perception but also with theories to explain why languages have the sounds, grammatical structures, and historical sound changes that they do, and theories to describe the interrelationship of the more abstract patterns of phonology and the physical forms of speech sounds. Phoneticians look for recurring patterns of variation in sounds (see PHONOLOGICAL RULES AND PROCESSES), and then try to understand why they should occur. Phonetic constraints on phonology may be proposed as part of either a reductionist program (see REDUCTIONISM), in which phonology is reduced to phonetics, or an interface program, in which phonetics and phonology are usually recognized as separate components of a grammar.

—Patricia A. Keating

References

Catford, J. C. (1977). Fundamental Problems in Phonetics. Bloomington: Indiana University Press.
Fant, G. (1960). Acoustic Theory of Speech Production. The Hague: Mouton.
Hardcastle, W. J., and J. Laver. (1997). The Handbook of Phonetic Sciences. Oxford: Blackwell.
Ladefoged, P. (1993). A Course in Phonetics. 3rd ed. Fort Worth, TX and Orlando, FL: Harcourt Brace Jovanovich.
Ladefoged, P., and I. Maddieson. (1996). The Sounds of the World’s Languages. Oxford: Blackwell.
Laver, J. (1994). Principles of Phonetics. Cambridge: Cambridge University Press.
Lehiste, I. (1970). Suprasegmentals. Cambridge, MA: MIT Press.
Stevens, K. N. (Forthcoming). Acoustic Phonetics. Cambridge, MA: MIT Press.

Further Readings

Asher, R. E., and E. J. A. Henderson, Eds. (1981). Towards a History of Phonetics. Edinburgh: Edinburgh University Press.
Connell, B., and A. Arvaniti, Eds. (1995). Phonology and Phonetic Evidence: Papers in Laboratory Phonology 4. Cambridge: Cambridge University Press.
Denes, P. B., and E. N. Pinson. (1993). The Speech Chain: The Physics and Biology of Spoken Language. 2nd ed. New York: W. H. Freeman and Company.
Docherty, G. J., and D. R. Ladd, Eds. (1992). Gesture, Segment, Prosody: Papers in Laboratory Phonology 2. Cambridge: Cambridge University Press.
Keating, P. A., Ed. (1994). Phonological Structure and Phonetic Form: Papers in Laboratory Phonology 3. Cambridge: Cambridge University Press.
Kingston, J., and M. E. Beckman, Eds. (1990). Papers in Laboratory Phonology 1: Between the Grammar and Physics of Speech. Cambridge: Cambridge University Press.
Ladefoged, P. (1996). Elements of Acoustic Phonetics. 2nd ed. Chicago: University of Chicago Press.

Phonological Rules and Processes

Phonological processes were first systematically studied in the nineteenth century under the rubric of sound laws relating the various Indo-European languages. In the twentieth century, attention shifted to a synchronic perspective, prompted by observations such as Edward SAPIR’s that as part of their grammatical competence mature speakers unconsciously and effortlessly assign (sometimes radically) different pronunciations to a lexical item drawn from memory and inserted in different grammatical or prosodic contexts. For example, in the pronunciation of the word átom American English speakers “flap” the intervocalic consonant to [ɾ] and reduce the unstressed vowel to schwa [ə] so that it merges with Adam: [ˈæɾəm]. The underlying phonemes emerge when the stress is shifted under affixation: atóm-ic [əˈtʰam-ɪk]. Processes also figure in the neutralizations found in child language such as the loss of the tongue-tip articulation of r so that room merges with womb.

Phonological processes fall into two broad categories: sound change and prosodic grouping. We briefly illustrate each type. In in-articulate versus im-possible the prefixal nasal assimilates the labial feature of the [p] thereby changing from [n] to [m]. Dissimilation alters neighboring sounds that share the same feature so that they become more distinct from one another (see DISTINCTIVE FEATURES). For example, the vocalic nucleus and offglide comprising the [au] diphthong of how share a retracted tongue position in most English dialects. In broad Australian English the nucleus is fronted to the [æ] vowel of cat: h[æu]. Assimilation and dissimilation are subject to a strict locality condition requiring that they apply in the context of the closest sound with the appropriate feature. The phonological features that define a sound are also subject to deletion and insertion: the former typically operates in prosodically weak contexts (reduction of unstressed vowels to schwa in át[ə]m but [ə]tóm-ic) and the latter in strong contexts (aspiration of [t] before stress in a[tʰ]ómic).

Processes of prosodic grouping include the organization of phonemes into syllables. In English a consonant cluster such as [rt] easily combines with the preceding vowel into a single syllable: monosyllabic mart. But in order to syllabify the inverse cluster [tr], a helping schwa is required: disyllabic me.t[ə]r (cf. metr-ic). Languages such as Japanese have simpler syllabic structures that bar syllable-internal consonant clusters and place rigid restrictions on syllable-final consonants. Accordingly, the [rt] cluster in a loanword such as French courte [kurt] ‘short’ receives two extra syllables when it is adapted into Japanese: kuruto (Shinohara 1997). At the next level of prosodic organization, syllables are grouped into strong-weak (trochaic) or weak-strong (iambic) rhythmic units known as metrical feet. Native Australian languages impose trochaic rhythm so that words have a canonical SsSsSs . . . syllabic structure in comparison to the iambic grouping sSsSsS . . . found in many Native American languages. English has trochaic grouping as shown by the strong-weak template imposed on nickname formation: Elízabeth shortens to Ss Lísa; sS Elí is impossible.

Some linguists (e.g., Stampe 1979) distinguish between “processes” that reflect phonetically motivated limitations on which sounds can appear where in pronunciation and more arbitrary and conventional “rules” that are typically restricted to particular morphological or lexical contexts (such as the voicing of [f] in the plural of leaf: leaves but reefs (*reeves) and verbal he leafs (*leaves) through the paper). A plausible but unsubstantiated hypothesis is that rules relate different lexical items stored in memory while processes operate online.

Phonological processes are often phonetically motivated, seeming either to enhance the perceptibility of a sound, especially in “strong” contexts or more formal speaking styles (aspiration of prestressed [t] in a[tʰ]ómic), or to minimize articulatory gestures, especially in “weak” contexts or fast tempos (flapping of the stop and reduction of the unstressed vowel of átom [ˈæɾəm]). Besides a typology based on their formal properties, phonological processes are also usefully viewed as different solutions to a common phonetic difficulty. For example, the transition from the nasal to the fricative in the consonant cluster of dense is relatively complex because it requires synchronization of two independent gestures: raising the velum to shut off nasal airflow and shifting the tongue tip from a closure to a constriction. Common responses include insertion of a transitional stop den[t]se (to rhyme with dents) or deletion of the tongue-tip closure d[ɛ̃]s. An example from prosody is provided by the widespread tendency to avoid syllables beginning with a vowel. When morphological or syntactic rules juxtapose vowels, a variety of processes come into play to avoid a syllable break between the vowels. These include deletion of one of the vowels (Slavic), contraction of the vowels into a diphthong (Polynesian) or long vowel (Sanskrit), or insertion of a consonantal onset (British English intrusive [r] as in the idea [r] is).

Processes are characteristically myopic in the sense that in solving one phonetic problem they often create another. Popular London (Cockney) deletion of initial [h] creates a vowel cluster with the indefinite article (“a hedge” [ə ɛdʒ]), a situation that is otherwise avoided by substitution of the an allomorph (cf. “an edge” [ən ɛdʒ]; Wells 1982). To take another example (data from Bethin 1992), in Polish [n] assimilates the place of articulation of a following velar: ba[ŋ]k ‘bank’. In the Southwestern dialect, the process is extended to clusters arising from the deletion of a weak vowel, whereas in the Northeastern dialect, such derived nk clusters remain unassimilated: ganek ‘porch’, ga[ŋ]ka SW versus ga[n]ka NE genitive singular. Many phonologists (e.g., Halle 1962) conclude from examples like this that processes apply in a linear sequence: in the Southwestern dialect, vowel deletion precedes nasal assimilation, whereas in the Northeastern dialect, nasal assimilation precedes vowel deletion (and so sees /ganek+a/ at its point of application). An alternative interpretation (Donegan and Stampe 1979) sees all processes as applying simultaneously to the input with each given the option to iterate (Southwestern) or not (Northeastern).

Although myopic, phonological processes are typically not self-defeating in the sense of recreating the same problem they are called upon to solve. An example is provided by the liquid [l,r] dissimilation inherited from Latin (Steriade 1995), in which the suffixal [l] of nav-al, fat-al, mor-al is turned into [r] when the stem contains an [l]: stell-ar, lun-ar, column-ar, nucle-ar. The process systematically blocks when an [r] intervenes between the suffixal and stem [l]’s: flor-al, plur-al, later-al. If the point of the change is to avoid successive identical liquids, an output such as *flor-ar is no better than the input flor-al and hence the process is suspended.

Providing empirical substantiation to the notion “phonetic motivation” as well as determining the principles that underlie the interaction of rules and constraints remain outstanding research objectives.

units are and how they are put together to create intelligible and natural-sounding spoken utterances.

Let us consider what goes into making up the sound system of a language. One ingredient, obviously enough, is its choice of speech sounds. All languages deploy a small set of consonants and vowels, called phonemes, as the basic sequential units from which the minimal units of word structure are constructed.
The phonemes of a lan- guage typically average around 30, although many have See also ARTICULATION; LANGUAGE PRODUCTION; PHO- considerably more or less. The Rotokas language of Papua NETICS; PHONOLOGY; PROSODY AND INTONATION; PROSODY New Guinea has just 11 phonemes, for example, whereas AND INTONATION, PROCESSING ISSUES; STRESS, LINGUISTIC the !X~ language of Namibia has 141. English has about u —Michael Kenstowicz 43, depending on how we count and what variety of English we are describing. Although this number may References seem relatively small, it is sufficient to distinguish the 50,000 or so items that make up the normal adult LEXICON. Bethin, C. (1992). Polish Syllables. Columbus, OH: Slavica Pub- This is due to the distinctive role of order: thus, for exam- lishers. ple, the word step is linked to a sequence of phonemes that Donegan, P., and D. Stampe. (1979). The study of natural phonol- we can represent as /stεp/, whereas pest is composed of the ogy. In D. Dinnsen, Ed., Current Approaches to Phonological Theory. Bloomington, IN: Indiana University Press, pp. 126– same phonemes in a different order, /pεst/. 173. Phonemes are not freely combinable as a maximally effi- Halle, M. (1962). Phonology in generative grammar. Word 18: 54– cient system would require but are sequenced according to 72. strict patterns that are largely specific to each language. One Makkai, V. B. (1972). Phonological Theory: Evolution and Cur- important organizing principle is syllabification. In most rent Practice. New York: Holt. languages, all words can be exhaustively analyzable into Sapir, E. (1925). Sound patterns in language. Language 1: 37–51. syllables. Furthermore, many languages require all their syl- Reprinted in Makkai 1972, pp. 13–21. lables to have vowels. The reason why a fictional patro- Sapir, E. (1933). The psychological reality of phonemes. Reprinted nymic like Btfsplk is hard for most English speakers to in Makkai 1972, pp. 22–31. 
pronounce is that it violates these principles—it has no vow- Shinohara, S. (1997). Analyse phonologique de l'adaptation japonaise de mots étrangers. Paris: Université de la Sorbonne els, and so cannot be syllabified. In contrast, in one variety Nouvelle, Paris 3. of the Berber language spoken in Morocco, syllables need Stampe, D. (1979). A Dissertation on Natural Phonology. New not have vowels, and utterances like tsqssft stt (“you shrank York: Garland. it”) are quite unexceptional. Here is a typical, if extreme, Steriade, D. (1995). Underspecification and markedness. In J. example of how sound patterns can differ among languages. Goldsmith, Ed., Handbook of Phonological Theory. Oxford: Speech sounds themselves are made up of smaller com- Blackwell, pp. 114–174. ponents called DISTINCTIVE FEATURES, which recur in one Wells, J. (1982). Accents of English. Cambridge: Cambridge Uni- sound after another. For example, a feature of tongue-front versity Press. ARTICULATION (or coronality, to use the technical term) characterizes the initial phoneme in words like tie, do, see, Further Readings zoo, though, lie, new, shoe, chow, and jay, all of which are Archangeli, D. and D. Pulleyblank. (1993). Grounded Phonology. made by raising the tip or front of the tongue. This feature Cambridge, MA: MIT Press. minimally distinguishes the initial phoneme of tie from that Greenberg, J. (1978). Universals of Human Language. Stanford, of pie, which has the feature of labiality (lip articulation). CA: Stanford University Press. Features play an important role in defining the permissible Kenstowicz, M. (1994). Phonology in Generative Grammar. sound sequences of a language. In English, for instance, Oxford: Blackwell. 
only coronal sounds like those just mentioned may occur after the diphthong spelled ou or ow: We consequently find Phonology words like out, loud, house, owl, gown, and ouch, all words ending in coronal sounds, but no words ending in sound Phonology addresses the question of how the words, sequences like owb, owf, owp, owk, or owg. All speech phrases, and sentences of a language are transmitted from sounds and their regular patterns can be described in terms speaker to hearer through the medium of speech. It is easy of a small set of such features. to observe that languages differ considerably from one A further essential component of a sound system is its another in their choice of speech sounds and in the rhythmic choice of “suprasegmental” or prosodic features such as and melodic patterns that bind them together into units of LINGUISTIC STRESS, by which certain syllables are high- structure and sense. Less evident to casual observation, but lighted with extra force or prominence; TONE, by which equally important, is the fact that languages differ greatly in vowels or syllables bear contrastive pitches; and intona- the way their basic sounds can be combined to form sound tion, the overall “tune” aligned with phrases and sen- patterns. The phonological system of a given language is the tences. Stress and tone may be used to distinguish different part of its grammar that determines what its basic phonic words. In some varieties of Cantonese, for example, only 640 Phonology tone distinguishes si “poem” (with high pitch), si “cause” phrases, etc.) may reflect a higher-order disposition to group (with rising pitch), and si “silk” (with falling pitch). 
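The restriction on what may follow the diphthong spelled ou or ow can be pictured as a lookup over place features. The Python sketch below is only an illustration under stated assumptions: the table `PLACE`, the function name `allowed_after_ow`, and the hand transcriptions of the final consonants are hypothetical conveniences introduced here, and each sound is reduced to a single place feature (coronal and labial as in the text, with velar added for [k] and [g]).

```python
# Hypothetical, simplified feature table: one place feature per consonant.
# The coronal set follows the initial sounds of tie, do, see, zoo, though,
# lie, new, shoe, chow, and jay; labial and velar entries cover the
# excluded endings owb, owf, owp, owk, and owg.
PLACE = {
    "t": "coronal", "d": "coronal", "s": "coronal", "z": "coronal",
    "ð": "coronal", "l": "coronal", "n": "coronal", "ʃ": "coronal",
    "tʃ": "coronal", "dʒ": "coronal",
    "p": "labial", "b": "labial", "f": "labial",
    "k": "velar", "g": "velar",
}


def allowed_after_ow(consonant: str) -> bool:
    """Return True if the consonant is coronal in the table above."""
    return PLACE.get(consonant) == "coronal"


# Final consonants of the words cited in the text, transcribed by hand.
ATTESTED = {"out": "t", "loud": "d", "house": "s",
            "owl": "l", "gown": "n", "ouch": "tʃ"}
UNATTESTED = {"owb": "b", "owf": "f", "owp": "p", "owk": "k", "owg": "g"}

if __name__ == "__main__":
    for word, final in {**ATTESTED, **UNATTESTED}.items():
        verdict = "permitted" if allowed_after_ow(final) else "excluded"
        print(f"{word}: final [{final}] is {PLACE[final]}, so {verdict}")
```

Stating the restriction once over the feature coronal, rather than listing every permitted consonant, is what lets a small feature set cover many individual sound sequences.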
Prosodic features may also play an important role in distinguishing different sentence types, as in conversational French where only intonation distinguishes the statement tu viens “you come” (with falling intonation) from the corresponding question tu viens? (with rising intonation). In many languages, stress is used to highlight the part of the sentence that answers a question or provides new information (cf. FOCUS). Thus in English, an appropriate reply to the question “Where did Calvin go?” is He went to the STORE, with main stress on the part of the sentence providing the answer, whereas an appropriate reply to the question “Did you see Calvin with Laura?” might be No, I saw FRED with her, where the new information is emphasized. Though this use of stress seems natural enough to the English speaker, it is by no means universal, and Korean and Yoruba, to take two examples, make the same distinctions with differences in word order.

Although phonological systems make speech communication possible, there is often no straightforward correspondence between underlying phoneme sequences and their phonetic realization. This is due to the cumulative effects of sound changes on a language, many of them ongoing, that show up not only in systematic gaps such as the restriction on vowel + consonant sequences in English noted above but also in regular alternations between different forms of the same word or morpheme. For example, many English speakers commonly pronounce fields the same way as feels, while keeping field distinct from feel. This is not a matter of sloppy pronunciation but of a regular principle of English phonology that disallows the sound [d] between [l] and [z]. Many speakers of American English pronounce sense in the same way as cents, following another principle requiring the sound [t] to appear between [n] and [s]. These principles are fully productive in the sense that they apply to any word that contains the underlying phonological sequence in question. Hosts of PHONOLOGICAL RULES AND PROCESSES such as these, some easily detected by the untrained ear and others much more subtle, make up the phonological component of English grammar, and taken together may create a significant “mismatch” between mentally represented phoneme sequences and their actual pronunciation. As a result, the speech signal often provides an imperfect or misleading cue to the lexical identity of spoken words. One of the major goals of speech analysis, one that has driven much research over the past few decades, is to work out the complex patterns of interacting rules and constraints that define the full set of mappings between the underlying phonemic forms of a language and the way these forms are realized in actual speech.

Why should phonological systems include principles that are so obviously dysfunctional from the point of view of the hearer (not to mention the language learner)? The answer appears to lie in the constraints imposed “from above” by the brain and “from below” by the size, shape, and muscular structure of the speech-producing apparatus (the lungs, the larynx, the lips, and the tongue). The fact that languages so commonly group their phonemes into syllables and their syllables into higher-level prosodic groupings (metrical feet, phrases, etc.) may reflect a higher-order disposition to group serially ordered units into hierarchically organized structures, reflected in many other complex activities such as memorization, versification (see METER AND POETRY), and jazz improvisation. On the other hand, human biology imposes quite different demands, often requiring that complex phonemes and phoneme sequences be simplified to forms that are more readily articulated or that can be more easily distinguished by the ear.

Research on phonology dates back to the ancient Sanskrit, Greek, and Roman grammarians, but it received its modern foundations in the work of Henry Sweet, Jan Baudouin de Courtenay, Ferdinand de SAUSSURE, and others in the late nineteenth century. Principles of phonemic analysis were subsequently worked out in detail by linguists such as BLOOMFIELD, SAPIR, Harris, Pike, and Hockett in the United States and Trubetzkoy, Hjelmslev, and Martinet in Europe. Feature theory was elaborated principally by Roman JAKOBSON and his associates in the United States, and the study of suprasegmental and prosodic features by Kenneth Pike as well as by J. R. Firth and his associates in London. Since mid-century, linguists have increasingly attempted to develop explicit formal models of phonological structure, including patterns of phonologically conditioned morpheme alternation. In their watershed work The Sound Pattern of English (1968), Noam Chomsky and Morris Halle proposed to characterize the phonological competence of English speakers in terms of an ordered set of rewrite rules, applying in strict order to transform underlying representations into surface realizations. More recent trends taking such an approach as their point of departure have included the development of so-called nonlinear (autosegmental, metrical, prosodic) models for the representation of tone, stress, syllables, feature structure, and prosodic organization, and the study of the interfaces between phonology and other areas of language, including SYNTAX, MORPHOLOGY, and PHONETICS. At the present time, newer phonological models emphasizing the role of constraints over rewrite rules have become especially prominent, and include principles-and-parameters models, constraint-and-repair phonology, declarative phonology, connectionist-inspired approaches, and most recently OPTIMALITY THEORY.

Viewed from a cognitive perspective, the task of phonology is to find the mental representations that underlie the production and perception of speech and the principles that relate these representations to the physical events of speech. This task is addressed hand-in-hand with research in related areas such as LANGUAGE ACQUISITION and language pathology, acoustic and articulatory phonetics, PSYCHOLINGUISTICS, neurology, and computational modeling. The next decades are likely to witness increased cross-disciplinary collaboration in these areas.

As one of the basic areas of grammar, phonology lies at the heart of all linguistic description. Practical applications of phonology include the development of orthographies for unwritten languages, literacy projects, foreign language teaching, speech therapy, and man-machine communication (see SPEECH SYNTHESIS and SPEECH RECOGNITION IN MACHINES).

See also PHONOLOGY, ACQUISITION OF; PHONOLOGY, NEURAL BASIS OF; PROSODY AND INTONATION; PROSODY AND INTONATION, PROCESSING ISSUES

G. N. Clements

References

Anderson, S. R. (1985). Phonology in the Twentieth Century. Chicago: University of Chicago Press.
Chomsky, N., and M. Halle. (1968). The Sound Pattern of English. New York: Harper and Row.
Durand, J. (1990). Generative and Non-linear Phonology. London: Longman.
Goldsmith, J. A., Ed. (1995). The Handbook of Phonological Theory. Oxford: Blackwell.
Hockett, C. F. (1974/1955). A Manual of Phonology. Chicago: University of Chicago Press.
Jakobson, R. (1971). Selected Writings, vol. 1: Phonological Studies. The Hague: Mouton.
Kenstowicz, M. (1994). Phonology in Generative Grammar. Oxford: Blackwell.
Spencer, A. (1995). Phonology: Theory and Description. Oxford: Blackwell.
Trubetzkoy, N. S. (1969/1939). Principles of Phonology. Berkeley and Los Angeles: University of California.

Further Readings

Archangeli, D., and D. Pulleyblank. (1994). Grounded Phonology. Cambridge, MA: MIT Press.
Bloomfield, L. (1933). Language. New York: Holt.
Clark, J., and C. Yallop. (1995). Introduction to Phonetics and Phonology. 2nd ed. Oxford: Blackwell.
Clements, G. N., and S. J. Keyser. (1983). CV Phonology: A Generative Theory of the Syllable. Cambridge, MA: MIT Press.
Dell, F. (1980). Generative Phonology and French Phonology. Cambridge: Cambridge University Press.
Dressler, W. (1985). Morphophonology. Ann Arbor, MI: Karoma.
Ferguson, C., L. Menn, and C. Stoel-Gammon, Eds. (1992). Phonological Development. Timonium, MD: York Press.
Fischer-Jörgensen, E. (1975). Trends in Phonological Theory: A Historical Introduction. Copenhagen: Akademisk Forlag.
Goldsmith, J. A. (1990). Autosegmental and Metrical Phonology. Oxford: Blackwell.
Greenberg, J. H. (1978). Universals of Human Language, vol. 2: Phonology. Stanford, CA: Stanford University Press.
Hyman, L. (1975). Phonology: Theory and Analysis. New York: Holt, Rinehart, and Winston.
Inkelas, S., and D. Zec. (1990). The Phonology-Syntax Connection. Chicago: University of Chicago Press.
Jakobson, R. (1941). Child Language, Aphasia, and Phonological Universals. The Hague: Mouton.
Kenstowicz, M., and C. W. Kisseberth. (1979). Generative Phonology: Description and Theory. New York: Academic Press.
Kiparsky, P. (1995). Phonological basis of sound change. In J. Goldsmith, Ed., The Handbook of Phonological Theory. Oxford: Blackwell, pp. 640–670.
Labov, W. (1994). Principles of Linguistic Change: Internal Factors. Oxford: Blackwell.
Maddieson, I. (1984). Patterns of Sounds. Cambridge: Cambridge University Press.
Makkai, V. B., Ed. (1972). Phonological Theory: Evolution and Current Practice. New York: Holt, Rinehart, and Winston.
Martinet, A. (1955). Economie des Changements Phonétiques: Traité de Phonologie Diachronique. Bern: Francke.
Ohala, J. (1983). The origin of sound patterns in vocal tract constraints. In P. F. MacNeilage, Ed., The Production Of Speech. New York: Springer, pp. 189–216.
Palmer, F. R., Ed. (1970). Prosodic Analysis. London: Oxford University Press.
Phonology, Published three times a year by Cambridge University Press: Cambridge.
Sapir, E. (1925). Sound patterns in language. Language 1: 37–51.
Vihman, M. (1995). Phonological Development: The Origins of Language in the Child. Oxford: Blackwell.

Phonology, Acquisition of

The acquisition of PHONOLOGY, or rather its development, may be divided into two fields: SPEECH PERCEPTION and speech production. There are two reasons why the development of perception is prior to the development of production. One is that although the human ear is almost completely formed when the fetus is 7 months, the oral cavity of a human at birth is very different from the adult’s oral cavity. The second reason is that in order to produce the sounds of a given language, a child must be exposed to the relevant linguistic experience. Babbling infants, in fact, produce all sorts of linguistic sounds, even ones they have never heard in their linguistic environment. Although the first reason why perception is prior to production holds exclusively for the acquisition of the mother tongue (L1), the second holds both for the acquisition of L1 and of whichever language comes after L1 (L2).

A child comes into life well equipped to hear subtle differences in sounds (Eisenberg 1976). Though it is conceivable that some speech perception development starts before birth, because newborns discriminate the mother language from a foreign one (Mehler et al. 1988), it is generally assumed that the development of language perception starts at birth.

The most widely accepted theory of speech perception is the innatist theory, first proposed by JAKOBSON and much influenced by the work of Chomsky and other generative linguists. A mechanism, called the LANGUAGE ACQUISITION device (LAD), is assumed to be responsible for the ability humans have to analyze linguistic inputs and to construct grammars that generate them (see GENERATIVE GRAMMAR). Given that some properties of language are common to all languages of the world, it may be assumed that these are the consequences of the innate human endowment. The development of a specific language is achieved through the setting of the values of a set of parameters on the basis of the linguistic data one is exposed to.

According to the innatist hypothesis, a newborn can discriminate between pairs of segments (consonant and vowels) attested in at least one language of the world, even if they are not distinctive in the language they are exposed to (Jakobson 1941). Testing what infants hear has become possible only in the last twenty years. The methodology to test young infants’ perception is the nonnutritive (or high amplitude) sucking method (Eimas et al. 1971). When infants are 6 months or older, the preferential headturn procedure is commonly used to test sound discrimination (Moore, Wilson, and Thomson 1977). It has been shown that, indeed, for the newborn, the ability to discriminate does not appear to be related to the language he or she is exposed to (Streeter 1976). At a later stage of development, around 10 months, infants start losing the ability to discriminate sounds that are not distinctive in the language(s) they are exposed to (Werker and Tees 1984; Kuhl et al. 1992). This is in line with the learning by selection theory of neurological development (Changeux, Heidmann, and Patte 1984). It has also been shown that already from 1 month, infants represent linguistic sounds categorically, that is, different acoustic variants of a sound are identified with one category (Eimas, Miller, and Jusczyk 1987; Kuhl 1993).

Perceiving phonological distinctions is not only relevant to the acquisition of a phonological system but is also essential to the development of both LEXICON and SYNTAX. Newborns are in a position similar to that of adults when they hear a language unrelated to any other language of which they have some experience. One of the problems in constructing a lexicon is segmenting a continuous stream of sounds. In order to build a lexicon, a child must come to understand where each word ends and the next one begins. Because newborns are sensitive to edges of phonological or prosodic constituents (Christophe et al. 1994) and to LINGUISTIC STRESS (Sansavini et al. 1995), it is conceivable that they use these prosodic cues to segment the continuous input (cf. PROSODY AND INTONATION).

According to the theory of language initialization known as prosodic bootstrapping, the prosody of a language also provides a cue to its syntactic structure (Gerken, Jusczyk, and Mandel 1994). Given its sensitivity to prosody, an infant should thus be capable of setting certain syntactic parameters long before the period in which he or she shows some knowledge of the lexicon (Mazuka 1996; Nespor, Guasti, and Christophe 1996). The early setting of parameters responsible for word order accounts for the fact that the monolingual child hardly makes any mistakes in the relative order of words when he or she starts combining them into small sentences.

Speech production starts with babbling, the first approximation to language. The vocal apparatus approaches an adult state at about 6 months of age, so it is only from this period that it is safe to talk about babbling. Though the segmental categories in babbling do not resemble those of the adult language (de Boysson-Bardies 1993), even at this first stage, speech production is influenced by speech perception: babbling does not develop normally without an auditory input (Oller and Eilers 1988). In auditorily unimpaired infants, suprasegmentals (i.e., rhythm and intonation) are acquired before segments are, as early as at 6 months (Whalen, Levitt, and Wang 1991). Around the first year of age, both the vowel quality and the syllabic structure of auditorily unimpaired babbling infants is much influenced by that of the adult language (de Boysson-Bardies et al. 1989; Vihman 1992). The first syllable type produced throughout the languages of the world is one formed by a consonant (C) followed by a vowel (V). This is also the only syllable type universally present. The segmental content of the first syllables is such that C and V are as far apart as possible in sonority; that is, C is pronounced with high air pressure in the oral cavity compared to the external air pressure, as in [p], whereas in the pronunciation of V, the internal pressure is similar to the external one, as in [a]. Subsequently different CV combinations develop and different parameters are set to give the full range of adult syllables such as whether a prevocalic consonant is obligatory or not or whether a postvocalic consonant is allowed or not. PHONOLOGICAL RULES AND PROCESSES also develop with time. For example, the centralization to schwa of unstressed vowels in English is not part of early productions.

The acquisition of the phonology (and of grammar in general) of L1 appears to be impaired after the fifth year of life. This claim is based on the experience of humans who have not been in contact with a speaking community and have been found at age 5 or more. The acquisition of the phonology of L2 appears to be impaired after puberty, as witnessed by the foreign-accent phenomenon, responsible for the fact that we can distinguish a native speaker from a nonnative one. Interestingly, the acquisition of syntax (i.e., the computational system) is not so impaired.

See also PHONOLOGY, NEURAL BASIS OF; SYNTAX, ACQUISITION OF

Marina Nespor

References

Boysson-Bardies, B. de. (1993). Ontogeny of language-specific syllabic productions. In B. de Boysson-Bardies, S. de Schonen, P. Jusczyk, P. McNeilage, and J. Morton, Eds., Developmental Neurocognition: Speech and Face Processing in the First Year of Life. Dordrecht: Kluwer, pp. 353–363.
Boysson-Bardies, B. de, P. Halle, L. Sagart, and C. Durand. (1989). A cross-linguistic investigation of vowel formants in babbling. Journal of Child Language 11: 1–17.
Changeux, J. P., T. Heidmann, and P. Patte. (1984). Learning by selection. In P. Marler and H. S. Terrace, Eds., The Biology of Learning. Berlin: Springer, pp. 115–133.
Christophe, A., E. Dupoux, J. Bertoncini, and J. Mehler. (1994). Do infants perceive word boundaries? An empirical approach to the bootstrapping problem for lexical acquisition. Journal of the Acoustical Society of America 95: 1570–1580.
Eimas, P. D., E. R. Siqueland, P. W. Jusczyk, and J. Vigorito. (1971). Speech perception in infants. Science 171: 303–306.
Eimas, P. D., J. L. Miller, and P. W. Jusczyk. (1987). On infant speech perception and the acquisition of language. In S. Harnad, Ed., Categorical Perception: The Ground Work of Cognition. Cambridge: Cambridge University Press, pp. 161–195.
Eisenberg, R. B. (1976). Auditory Competence in Early Life. Unpublished MS. Baltimore, MD:
Gerken, L.-A., P. W. Jusczyk, and D. R. Mandel. (1994). When prosody fails to cue syntactic structure: Nine-months olds’ sensitivity to phonological versus syntactic phrases. Cognition 51: 237–265.
Jakobson, R. (1941). Kindersprache, Aphasie, und Allgemeine Lautgesetze. (Child Language, Aphasia, and Phonological Universals, 1968.) Uppsala. The Hague: Mouton.
Kuhl, P. K. (1993). Innate predispositions and the effects of experience in speech perception: The native language magnet theory. In B. de Boysson-Bardies, S. de Schonen, P. Jusczyk, P. McNeilage, and J. Morton, Eds., Developmental Neurocognition: Speech and Face Processing in the First Year of Life. Dordrecht: Kluwer, pp. 259–274.
Kuhl, P. K., K. A. Williams, F. Lacerda, K. N. Stevens, and B. Lindblom. (1992). Linguistic experiences alter phonetic perception in infants by 6 months of age. Science 255: 606–608.
Mazuka, R. (1996). How can a grammatical parameter be set before the first word? In J. L. Morgan and K. Demuth, Eds., Signal to Syntax. Hillsdale, NJ: Erlbaum, pp. 313–330.
Mehler, J., P. W. Jusczyk, G. Lambertz, N. Halsted, J. Bertoncini, and C. Amiel-Tison. (1988). A precursor of language acquisition in young infants. Cognition 29: 143–178.
Moore, J. M., W. R. Wilson, and G. Thomson. (1977). Visual reinforcement of head-turn responses in infants under 12 months of age. Journal of Speech and Hearing Disorders 42: 328–334.
Nespor, M., M.-T. Guasti, and A. Christophe. (1996). Selecting word order: The Rhythmic Activation Principle. In U. Kleinhenz, Ed., Interfaces in Phonology. Berlin: Akademie Verlag, pp. 1–26.
Oller, D. K., and R. E. Eilers. (1988). The role of audition in infant babbling. Child Development 59: 441–449.
Sansavini, A., J. Bertoncini, and G. Giovanelli. (1997). Newborns discriminate the rhythm of multisyllabic stressed words. Developmental Psychology 33: 3–11.
Streeter, L. A. (1976). Language perception of 2-month-old infants shows effects of both innate mechanisms and experience. Nature 259: 39–41.
Vihman, M. M. (1992). Early syllables and the construction of phonology. In C. A. Ferguson, L. Menn, and C. Stoel-Gammon, Eds., Phonological Development: Models, Research, Implications. Timonium, MD: York Press, pp. 393–422.
Werker, J. F., and R. C. Tees. (1984). Cross-linguistic speech perception: Evidence for perceptual reorganization in the first year of life. Infant Behavior and Development 7: 49–63.
Whalen, D. H., A. Levitt, and Q. Wang. (1991). Intonational differences between the reduplicative babbling of French and English-learning infants. Journal of Child Language 18: 501–506.

Further Readings

Dehaene-Lambertz, G. (1995). Capacités Linguistiques Précoces et leurs Bases Cérébrales. Ph.D. diss., Université de Paris VI.
Fikkert, P. (1994). On the Acquisition of Prosodic Structure. The Hague: Holland Academic Graphics.
Ingram, D. (1989). First Language Acquisition. Cambridge: Cambridge University Press.
Jusczyk, P. W. (1997). The Discovery of Spoken Language. Cambridge, MA: MIT Press.
Mehler, J., and A. Christophe. (1995). Maturation and learning of language in the first year of life. In M. S. Gazzaniga, Ed., The Cognitive Neurosciences. Cambridge, MA: MIT Press, pp. 943–954.
Mehler, J., and E. Dupoux. (1990). Naître Humain. (What Infants Know, 1994.) Paris: Editions Odile Jacob. Oxford: Blackwell.
Morgan, J. L., and K. Demuth, Eds. (1995). Signal to Syntax. Mahwah, NJ: Erlbaum.
Repp, R. (1983). Categorical perception: Issues, methods, findings. In N. J. Lass, Ed., Speech and Language: Advances in Basic Research and Practice, vol. 10. New York: Academic Press, pp. 243–335.
Smith, N. V. (1973). The Acquisition of Phonology: A Case Study. Cambridge: Cambridge University Press.
Strozer, J. R. (1994). Language Acquisition After Puberty. Washington, DC: Georgetown University Press.
Werker, J. F. (1995). Exploring developmental changes in cross-language speech perception. In L. R. Gleitman and M. Liberman, Eds., An Invitation to Cognitive Science. Vol. 1, Language. Cambridge, MA: MIT Press, pp. 87–106.

Phonology, Neural Basis of

PHONOLOGY refers to the sound structure of language. As such, the study of phonology includes investigation of the representations and organizational principles underlying the sound systems of language, as well as the exploration of the mechanisms and processes used by the listener in speech perception or by the speaker in speech production. The study of the neural basis of phonology is guided by an attempt to understand the neural mechanisms contributing to the perception and production of speech. This domain of inquiry has largely focused on investigations of adult aphasics who have language deficits subsequent to brain damage, exploring their language impairments and accompanying lesion localization. These “experiments in nature” provide the traditional approach to the study of the neural bases of phonology. More recently, neuroimaging techniques, such as POSITRON EMISSION TOMOGRAPHY (PET) and functional MAGNETIC RESONANCE IMAGING (fMRI), have provided a new window into the neural mechanisms contributing to phonology, allowing for investigation of neural activity in normal subjects as well as brain-damaged patients.

It has been long known that the left hemisphere is dominant for language for most speakers, and that the peri-sylvian regions of the left hemisphere are most directly involved in LANGUAGE PRODUCTION and SPEECH PERCEPTION. The classical view has characterized speech/language deficits in APHASIA in terms of broad anatomical (left anterior and left posterior) and functional (expressive and receptive) dichotomies (Geschwind 1965). In this view, expressive language deficits occur as a result of damage to the motor (anterior) areas, and receptive language deficits occur as a result of damage to the auditory association (posterior) areas. Nonetheless, the processing components involved in both speech production and perception are complex and appear to involve more extensive neural structures than originally proposed.

In order to produce a word or group of words, a speaker must select the word(s) from the set of words in long-term memory, encode its phonological form in a short-term buffer in order to plan the phonetic shape, which will vary as a function of the context (articulatory phonological planning), and convert this phonetic string into a set of motor commands or motor programs to the vocal tract (articulatory implementation). Results from studies with aphasic patients show that all patients, regardless of clinical syndrome and accompanying lesion localization, display deficits in the processes of selection and planning (Blumstein 1994). That is, they may produce the wrong sound segment, a selection error, such as “keams” for “teams”, or they may produce the wrong sound segment because of the influence of a neighboring sound, a planning error, such as “rof beef” for “roast beef.” The patterns of errors that occur show that the sounds of speech are organized in terms of smaller units called phonetic features (cf. DISTINCTIVE FEATURES), and that patients tend to make errors involving a change in the value of a phonetic feature. For example, the production of “keams” for “teams” reflects a change in the place of articulation of the initial stop consonant. Of importance, phonetic features are not “lost,” but the patterns of errors reflect statistical tendencies. Sometimes the patients produce a word correctly, sometimes not; and sometimes they may make one type of error on a word, and other times a different type of error on the same word. These phonological deficits arise in nearly all aphasic patients including Broca’s aphasics who may have brain damage in Broca’s area and other anterior brain structures such as the precentral gyrus and the BASAL GANGLIA, and Wernicke’s aphasics who may have

processing impairments (Milberg, Blumstein, and Dworetzky 1988).

PET and fMRI studies provide converging evidence consistent with the results from studies with aphasic patients. These studies have shown activation in a number of posterior and anterior structures when passively listening to words or in making phonetic judgments. These structures include the first and second temporal gyri, the supramarginal gyrus, the inferior frontal gyrus, as well as premotor areas.
Nonetheless, direct comparisons between the behav- brain damage in the third temporal gyrus and other posterior ioral/lesion studies and the neuroimaging studies are diffi- structures such as the second temporal gyrus and the supra- cult because the experimental tasks used have not been marginal gyrus. comparable. For example, patients may be required to listen Although speech-production deficits that relate to selec- to pairs of words and make same/different judgments, tion and planning may not have a distinct neural locus, whereas normal subjects may be required to listen to pairs speech-production impairments that relate to articulatory of words and determine whether the final consonant of the implementation processes do. Such deficits seem to stem stimuli is the same. Even so, it seems clear that the pro- from impaired timing and coordination of articulatory move- cesses involved in the perception of speech are complex and ments (Ryalls 1987). The correct sound may be selected, but invoke a neural system that encompasses both anterior and the articulatory system cannot implement it normally. For posterior brain structures. example, for “teams” the patient may produce an overly aspirated initial /t/. Lesion data from aphasic patients and See also LANGUAGE, NEURAL BASIS OF evoked potential and PET data from normal subjects impli- —Sheila E. Blumstein cate the left inferior frontal gyrus (including Broca’s area), the precentral gyrus, the basal ganglia, the precentral gyrus References of the insula, and the supplementary motor areas, in the articulatory implementation of speech (Baum et al. 1990; Blumstein, S. E. (1994). The neurobiology of the sound structure Dronkers 1996; Petersen et al. 1989). Interestingly, although of language. In M. Gazzaniga, Ed., Handbook of the Cognitive the speech musculature is bilaterally innervated, speech-pro- Neurosciences. Cambridge: MIT Press. duction deficits emerge only as a consequence of left-brain Baum, S. R., S. E. 
Blumstein, M. A. Naeser, and C. L. Palumbo. damage and not right-brain damage to these structures. (1990). Temporal dimensions of consonant and vowel produc- tion: An acoustic and CT scan analysis of aphasic speech. Brain Speech perception processes are also complex. They and Language 39: 33–56. require a transformation of the auditory input from the Dronkers, N. F. (1996). A new brain region for coordinating peripheral auditory system to a spectral representation speech articulation. Nature 384: 159–161. based on more generalized auditory patterns or properties, Geschwind, N. (1965). Disconnexion syndromes in animals and followed by the conversion of this spectral representation to man. Brain 88: 237–294, 585–644. a more abstract feature (phonological) representation, and Milberg, W., S. E. Blumstein, and B. Dworetzky. (1988). Phono- ultimately the mapping of this sound structure onto its lexi- logical processing and lexical access in aphasia. Brain and Lan- cal representation. Presumably, the word is selected from a guage 34: 279–293. set of potential word candidates that are phonologically sim- Petersen, S. E., P. T. Fox, M. I. Posner, M. Mintun, and M. E. ilar. Speech perception studies support the view that the Raichle. (1989). Positron emission tomographic studies of the processing of single words. Journal of Cognitive Neuroscience neural basis of speech perception is dominant in the left 1: 153–170. hemisphere. However, they challenge the classical view that Ryalls, J., Ed. (1987). Phonetic Approaches to Speech Production these deficits underlie the auditory comprehension impair- in Aphasia and Related Disorders. Boston: College-Hill Press. ments of Wernicke’s aphasics and that speech perception impairments are restricted to patients with temporal lobe Further Readings pathology. Nearly all aphasic patients, regardless of their lesion localization, display some deficits in perceiving the Binder, J. R., J. A. Frost, T. A. Hammeke, R. W. Cox, S. M. 
Rao, sounds of speech, as demonstrated by discrimination experi- and T. Prieto. (1997). Human brain language areas identified by ments with words, “pear” versus “bear,” or nonwords, “pa” functional magnetic resonance imaging. Journal of Neuro- versus “ba.” These patients do not seem to have impair- science 171: 353–362. Binder, J. R., S. M. Rao, T. A. Hammeke, Y. Z. Yetkin, A. Jes- ments in transforming the auditory input into a phonological manowicz, P. A. Bandettini, E. C. Wong, L. D. Estkowski, form nor do they show impairments in phonological struc- M. D. Goldstein, V. M. Haughton, and J. S. Hyde. (1994). ture. Differences among aphasic patients relate to the quan- Functional magnetic resonance imaging of human auditory tity of errors, not the patterns of errors. The basis for the cortex. Annals of Neurology 35: 662–672. different quantity of errors might reflect a greater involve- Binder, J. R., J. A. Frost, T. A. Hammeke, S. M. Rao, and R. W. ment of posterior structures in such speech processing tasks. Cox. (1996). Function of the left planum temporale in auditory Nonetheless, differential patterns of performance do emerge and linguistic processing. Brain 119: 1239–1247. in studies exploring the mapping of sound structure onto Damasio, H. (1991). Neuroanatomical correlates of the aphasias. lexical representations, implicating lexical processing defi- In M. T. Sarno, Ed., Acquired Aphasia. 2nd ed. New York: Aca- cits for Broca’s and Wernicke’s aphasics rather than speech demic Press. Physicalism 645 either). In discussions of the status of cognitive/psychologi- Demonet, J. F., J. A. Fiez, E. Paulesu, S. E. Petersen, and R. J. Zatorre. (1996). PET studies of phonological processing: A cal properties, physical properties are usually also taken to critical reply to Poeppel. Brain and Language 55: 352–379. include such higher-level properties as biological properties Gandour, J., and Dardarananda, R. (1982). Voice onset time in and computational properties. 
This broad sense of physical aphasia: Thai, I. Perception. Brain and Language 1: 24–33. property seems appropriate to the discussion of the question Gandour, J., and R. Dardarananda. (1984). Voice-onset time in how psychological properties are related to physical proper- aphasia: Thai, II: Production. Brain and Language 18: 389– ties—that is, the MIND-BODY PROBLEM. In its broad sense, 410. therefore, “physical” essentially amounts to “nonpsycholog- McAdam, D. W., and H. A. Whitaker. (1971). Language produc- ical.” This leaves our previous question unanswered: what tion: Electroencephalographic localization in the normal human is a physical property? Mass, charge, energy, and the like brain. Science 172: 499–502. are of course important properties in current physics, but the Petersen, S. E., P. T. Fox, M. I. Posner, M. Mintun, and M. E. Raichle. (1988). Positron emission tomographic studies of the physics of the future may invoke properties quite different cortical anatomy of single-word processing. Nature 331: 585– from those in today’s physics. How would we recognize 589. them as physical properties rather than properties of another Poeppel, D. (1996). A critical review of PET studies of phonologi- sort? That is, how would we know that future physics is cal processing. Brain and Language 55: 317–351. physics? Posner, M. I. and M. E. Raichle. (1994). Images of Mind. New As noted, physicalists differ on the status of higher-level York: W. H. Freeman and Co. properties in relation to lower-level, basic physical proper- Price, C. J., R. J. S. Wise, E. A. Warburton, C. J. Moore, D. ties. Reductive physicalism claims that higher-level proper- Howard, K. Patterson, R. S. J. Frackowiak, and K. J. Friston. ties, including psychological properties, are reducible to, (1996). Hearing and saying: The functional neuro-anatomy of and hence turn out to be, physical properties. Opposed to auditory word processing. Brain 119: 919–931. Rosenbek, J., M. McNeil, and A. Aronson, Eds. 
(1984). Apraxia of reductive physicalism is nonreductive physicalism, also Speech. San Diego, CA: College-Hill Press. called property dualism, which takes at least some higher- Zatorre, R. J., A. C. Evans, E. Meyer, and A. Gjedde. (1992). Lat- level properties, in particular cognitive/psychological prop- eralization of phonetic and pitch discrimination in speech pro- erties, to form an irreducible autonomous domain. This cessing. Science 256: 846–849. would mean that psychology is a special science whose Zatorre, R. J., E. Meyer, A. Gjedde, and A. C. Evans. (1996). PET object is to investigate the causal/nomological connections studies of phonetic processes in speech perception: Review, involving these irreducible psychological properties and replication, and re-analysis. Cerebral Cortex 6: 21–30. generate distinctively psychological explanations in terms of them. In this view, these laws and explanations cannot be Physicalism formulated in purely physical terms—not even in an ideally complete physical theory—and a purely physical descrip- Physicalism is the doctrine that everything that exists in the tion of the world, however physically complete it may be, spacetime world is a physical thing, and that every property would leave out something important about the world. Non- of a physical thing is either a physical property or a prop- reductive physicalism, therefore, leads to the doctrine of the erty that is related in some intimate way to its physical AUTONOMY OF PSYCHOLOGY and, more generally, the nature. Stated this way, the doctrine is an ontological claim, autonomy of all special sciences in relation to basic physics but it has important epistemological and methodological (Davidson 1970; Fodor 1974). corollaries. The mind-brain identity theory (Feigl 1958; Smart 1959; Physicalists in general will accept the following thesis of Armstrong 1968) is a form of reductive physicalism. 
This “ontological physicalism” (Hellman and Thompson 1975): approach proposes to identify psychological properties with Every object in spacetime is wholly material—that is, it is their neural correlates; for example, pain is to be identified either a basic particle of matter (proton, electron, quark, or with its neural substrate (”C-fiber stimulation,” according to whatever) or an aggregate structure composed exclusively armchair philosophical neurophysiology). These mental- of such particles. Ontological physicalism, therefore, denies neural identities are claimed to be just like the familiar iden- the existence of things like Cartesian souls, supernatural tities discovered by science, for example, “Water = H2O,” divinities, “entelechies,” “vital forces,” and the like. Physi- “Light = electromagnetic radiation,” and “Genes = DNA calists, however, differ widely when it comes to the question molecules.” Just as the “true nature” of water is being com- of properties of physical objects—whether complex physi- posed of H2O molecules, advances in neurophysiology will cal systems can have properties that are in some sense non- reveal to us the true nature of each type of mental state by physical. But what is a physical property? identifying it with a specific kind of brain state. It is difficult to give a clear-cut answer to this question. EMERGENTISM, a doctrine popular in the first half of the In a narrow sense, physical properties are those properties, twentieth century, is a form of nonreductive physicalism relations, quantities, and magnitudes that figure in physics, (Morgan 1923; Sperry 1969; McLaughlin 1992). Its central such as mass, energy, shape, volume, entropy, temperature, tenet is the claim that certain higher-level properties, in par- spatiotemporal position and distance, and the like. 
Most will ticular consciousness and intentionality, are emergent in the also include chemical properties like valence, inflammabil- sense that, although they appear only when a propitious set ity, and acidity, although these are not among the basic of physical conditions are present, they are genuinely novel physical properties—properties that figure in basic physical properties that are neither explainable nor predictable in laws (in this sense entropy and temperature are not basic terms of their underlying physical conditions. Moreover, 646 Physicalism these emergent properties bring into the world their own comfortably with the claim that the special sciences are distinctive causal powers, thereby enriching the causal autonomous vis-à-vis basic physics. For if the laws and causal structure of the world. FUNCTIONALISM is also often thought relations obtaining at the basic physical level determine all to be a form of nonreductive physicalism. According to this higher-level causal relations and laws, it should be possible in position, psychological properties are not physical or neural principle, or at least so it seems, to formulate explanations of properties, but rather functional kinds, where a functional higher-level laws and phenomena within the physical domain. kind is a property defined in terms of causal inputs and out- If the world works the way it does because the physical world puts. To give a familiar example, pain is said to be a func- works the way it does, why is it not possible to explain every- tional kind in that being in pain is to be in some physical/ thing in terms of how the physical world works? biological state that is typically caused by certain types of Some will challenge this reasoning. They will argue that physical inputs (e.g., tissue damage) and that causes certain for X to determine Y is one thing, but that for X to explain or behavioral outputs (e.g., groaning, wincing, escape behav- make intelligible why Y occurs is quite another. 
Pain ior). It is then noted that a psychological kind when given a emerges whenever C-fibers are firing, and this may well be functional interpretation of this kind has multiple physical a lawlike correlation. But the correlation is “brute”: it is not realizers (Putnam 1967; Block and Fodor 1972; Fodor possible to explain why pain, rather than tickle or itch, 1974); that is, the neural mechanism that realizes or imple- emerges when C-fibers fire, or why pain emerges from C- ments pain in humans is probably vastly different from the fiber excitation but not from other kinds of neural activity. pain mechanisms in reptiles, mollusks, and perhaps certain Nor do we seem able to explain why any conscious states complex electromechanical systems. This is “the multiple should emerge from neural processes. For the emergentists, realization argument” against reductionism: because pain is then, although all higher-level facts are determined by multiply realized in diverse physical/biological mecha- lower-level physical facts, the latter are powerless to explain nisms, it cannot be identified with any single physical or the former. The world may be a fundamentally physical biological property. This has led to the view that cognitive/ world, but it may well include physically inexplicable facts. psychological properties are at a higher level of abstraction Whether and how the functionalist can resist the reduc- and formality than the physical/biological properties that tionist pressure is less clear. Suppose, as functionalism has it, implement them (Kim 1992). that being in mental state M is to be in some physical state or However, nonreductive physicalists, insofar as they are other meeting a certain causal specification D. 
It would seem physicalists, will acknowledge that psychological proper- then that we could easily explain why something is in M by ties, although physically irreducible, are in some sense pointing out that it is in P and that P meets causal specifica- dependent on, or determined by, physical properties— tion D—namely that P is a realizer of M. And given the func- unless, that is, one is prepared to take their physical irreduc- tional characterization of M, it seems to follow that the causal ibility as proof of their unreality and adopt eliminativism/ powers of a given instance of M are just the causal powers of irrealism (or ELIMINATIVE MATERIALISM) about the mental its realizer P on this occasion. Thus, if it is a special-science law that M-events cause M*-events, that must be so because (Churchland 1981). That is, physicalists who accept the each of M’s physical realizers causes a physical realizer of reality of the mental will accept the mind-body SUPERVE- M*. In this way, special-science laws would seem reductively NIENCE thesis (Hellman and Thompson 1975; Horgan 1982; explainable in terms of laws governing the realizers of the Kim 1984): the psychological character of an organism or special-science properties involved. system is entirely fixed by its total physical nature. From “Materialism” is often used interchangeably with “physi- this it follows that any two systems with a relevantly similar calism.” However, there are some subtle differences between physical structure will exhibit an identical or similar psy- these terms, the most salient of which is that physicalism chological character. Even emergentists will grant that when indicates acknowledgment that something like current phys- identical physical conditions are replicated, the same mental ics is the ultimate explanatory theory of all the facts, whereas phenomenon will emerge, or fail to emerge. 
Supervenience materialism is not necessarily allied with the success of is also a commitment of functionalism: systems in identical physics as a basic explanatory theory of the world. physical conditions presumably have the same causal pow- ers and so will instantiate the same functional properties. It See also ANOMALOUS MONISM; CONSCIOUSNESS; is a basic commitment of all forms of physicalism that the EXPLANATORY GAP; INDIVIDUALISM; INTENTIONALITY world is the way it is because the physical facts of the world —Jaegwon Kim are the way they are. That is, physical facts fix all the facts. Among the facts of this world are causal facts, including those involving mental and other higher properties. The References supervenience thesis implies then that these higher-level Armstrong, D. M. (1968). A Materialist Theory of Mind. New causal facts are fixed by lower-level physical facts, presum- York: Humanities Press. ably facts about physical causal relations. The same goes for Block, N., and J. A. Fodor. (1972). What psychological states are higher-level laws: under supervenience, these laws are fixed not. Philosophical Review 81: 159–181. once basic physical facts, in particular basic laws of physics, Churchland, P. M. (1981). Eliminative materialism and the propo- are fixed. According to the supervenience thesis, therefore, sitional attitudes. Journal of Philosophy 78: 68–90. physical laws and causal relations are fundamental; they, and Davidson, D. (1970). Mental events. In L. Foster and J. W. Swan- they alone, are ultimately responsible for the causal/nomic son, Eds., Experience and Theory. Amherst, MA: University of structure of the world. But this conclusion does not comport Massachusetts Press, pp. 79–101. Piaget, Jean 647 Feigl, H. (1958). The “mental” and the “physical.” Minnesota departed from the limited aims of psychology to discover the most general principles of cognition. Studies in the Philosophy of Science 2: 370–497. Fodor, J. A. (1974). 
Special sciences, or the disunity of science as a Piaget’s basic idea is that knowledge continues biologi- working hypothesis. Synthese 28: 97–115. cal ADAPTATION by different means. This means that intelli- Hellman, G., and F. Thompson. (1975). Physicalism: Ontology, gence is considered as a sort of organ and, as such, has both determination, and reduction. Journal of Philosophy 72: 551– a functional side and a structural one. But, whereas other 564. organs have fixed structures and fixed functions, cognitive Horgan, T. (1982). Supervenience and microphysics. Pacific Philo- organs present a functional continuity within structural dis- sophical Quarterly 63: 29–43. continuities. The functional continuity is the emergence and Kim, J. (1984). Concepts of supervenience. Philosophy and Phe- growth of knowledge during evolution. Structural disconti- nomenological Research 45: 153–176. nuities are the different forms knowledge takes during the Kim, J. (1992). Multiple realization and the metaphysics of reduc- course of the growth of a species, a culture, or an individual. tion. Philosophy and Phenomenological Research 52: 1–26. Lewis, D. (1972). Psychophysical and theoretical identifications. These discontinuities are marked by a stage like construc- Australasian Journal of Philosophy 50: 249–258. tion of successive invariants ensuring a certain stability to McLaughlin, B. (1992). The rise and fall of British emergentism. the world in which the organism lives (homeostasis). In A. Beckermann, H. Flohr, and J. Kim, Eds., Emergence or Such a position is called constructivism in epistemology, Reduction. Berlin: De Gruyter, pp. 49–93. because it is a sort of midway between two opposites: real- Morgan, C. L. (1923). Emergent Evolution. London: William and ism and nominalism. REALISM pretends that things exist Norgate. independently of their instances in the actual world by Putnam, H. (1967). Psychological predicates. In W. H. Capitan and necessity. 
Such a view secures the objectivity and universal- D. D. Merrill, Eds., Art, Mind, and Religion. Pittsburgh, PA: ity of knowledge. Nominalism considers that what we call University of Pittsburgh Press, pp. 37–48. things are mere conveniences that vary according to one’s Smart, J. J. C. (1959). Sensations and brain processes. Philosophi- cal Review 68: 141–156. needs and conventions. This relativistic approach accounts Sperry, R. W. (1969). A modified concept of consciousness. Psy- for the variability of things according to cultural changes. chological Review 76: 532–536. As one can see, constructivism being both fixed in its func- tional dimension and ever changing in its structural one Further Readings solves the opposition gracefully without reducing one per- spective to the other or excluding one in favor of the other. Broad, C. D. (1925). The Mind and Its Place in Nature. London: The rest of Piaget’s program characterizes the sequences Routledge and Kegan Paul. and mechanisms by which rational knowledge develops. Kim, J. (1989). The myth of nonreductive materialism. Proceed- Sequences of development are marked by a constant ings and Addresses of the American Philosophical Association abstraction of conservation from the mere permanence of 63: 31–47. objects to the laws of conservation in physics and chemistry. Levine, J. (1983). Materialism and qualia: The explanatory gap. In order for the world to acquire the minimum stability Pacific Philosophical Quarterly 64: 354–361. McLaughlin, B. (1989). Type epiphenomenalism, type dualism, requested to retrieve an object once it has disappeared from and the causal priority of the physical. Philosophical Perspec- perception, space must be conceived as a container within tives 3: 109–136. which all the moves of an observer form a mathematical Moser, P. K., and J. D. Trout, Eds. (1995). Contemporary Materi- group of displacements. Then time, matter, weight, and vol- alism. London: Routledge. 
ume need to be conserved first in action, symbols, and con- Poland, J. (1994). Physicalism. Oxford: Clarendon Press. cepts, as well as logical classes, relations, and numbers, on Rosenthal, D. M., Ed. (1991). The Nature of Mind. New York: the logico-mathematical side. Oxford University Press. Conservation accounts for the preservation of knowledge at each level of development but not for the acquisition of Physiology new knowledge. This is made possible by novelty or the attainment of better knowledge and by necessity or the inter- connection of all available knowledge into logically neces- See AUDITORY PHYSIOLOGY; ELECTROPHYSIOLOGY, ELEC- sary systems. TRIC AND MAGNETIC EVOKED FIELDS; VISUAL ANATOMY Novelty plays an important role in Piaget’s theorizing. AND PHYSIOLOGY; VISUAL CORTEX, CELL TYPES AND CON- First, the emergence of novelty in knowledge is considered NECTIONS IN by him as evidence in favor of his constructivistic view and against the two extreme positions in the nature-nurture Piaget, Jean dilemma. NATIVISM and environmentalism both exclude novelty because it is mere unfolding in nativism and a mat- Jean Piaget’s (1896–1930) research program about human ter of learning in environmentalism. Second, the sudden knowledge counts as one of the major contributions to psy- emergence of novelty proves the stage like nature of the chology and epistemology because it has translated philo- growth of knowledge. But, third and above all, novelty sophical questions into empirical ones, setting a standard changes the face of knowledge both in the child and in sci- against which any new paradigm about the nature and ence. Once a child has discovered that, when one gets the growth of knowledge is still measured today. 
Hence its pertinence for cognitive sciences because, like them, Piaget

concept of number, all the numerical operations will yield a number and nothing else but a number ad infinitum, this novel knowledge changes the child’s outlook of the world in the very same way that the discovery of object permanence makes the baby search for objects that have disappeared and abandon the “out of sight out of mind” attitude so typical of newborns. In science, the double movement of geometrization of physics and physicalization of space accomplished by Albert Einstein when he applied Georg Riemann’s geometry to gravity modified completely the way physicists looked at the world. Thus progress in cognition both generates and is generated by novelty.

But novelty is not enough. Knowledge needs to be true knowledge (novelty) and knowledge of the truth (necessity). This could not be explained only in terms of an interaction between nature and nurture because how could mere contingencies generate necessity? Piaget offers a more general factor: equilibration, subsuming nature and nurture under one explanatory system transcending them in levels of generality, necessity, and abstraction. To understand the abstract nature of equilibration, let us suppose that living organisms are governed by the second law of thermodynamics. If this is so, then the resulting increase in entropy of the system cannot be considered as either innate or acquired but as depending on a law of probability. In the very same way, equilibration is the law of development, an abstract necessary principle independent of any contingencies and resulting in an endless optimization of living systems (homeorhesis) in a stagelike sequence considered as the ideal course of evolution or chreode.

A number of criticisms have been raised against Piaget’s psychological points: age of attainment, neuropsychological mechanisms of concept acquisition, etc. These criticisms have unfortunately confused Piaget’s epistemological points that are essential to his theory with psychological ones that are contingent and thus open to change for him too, because they were just algorithms.

See also ANIMISM; COGNITIVE DEVELOPMENT; INFANT COGNITION; NAIVE MATHEMATICS; NAIVE PHYSICS; NATIVISM; THEORY OF MIND

–Jacques Vonèche

References

Inhelder, B., and J. Piaget. (1953). The Growth of Logical Thinking from Childhood to Adolescence: An essay on the construction of formal operational structures. Translated by A. Parsons and S. Milgram (1972). London: Routledge and Kegan Paul.
Piaget, J. (1926). The Child’s Conception of the World. Translated by J. A. Tomlinson (1929). London: Kegan Paul, Trench, Trubner.
Piaget, J. (1932). The Moral Judgment of the Child. Translated (1932). London: Kegan Paul, Trench, Trubner.
Piaget, J. (1936). The Origin of Intelligence in Children. Translated (1952). New York: International Universities Press.
Piaget, J. (1937). The Construction of Reality in the Child. Translated (1954). New York: Basic Books.
Piaget, J., and B. Inhelder. (1948). The Child’s Conception of Space. Translated (1956). London: Routledge and Kegan Paul.
Piaget, J., and B. Inhelder. (1966). The Psychology of the Child. Translated (1969). New York: Basic Books.
Piaget, J. (1945). Play, Dreams and Imitation in Childhood. Translated (1962). New York: W. W. Norton.
Piaget, J. (1947). The Psychology of Intelligence. Translated (1950). New York: Harcourt Brace.
Piaget, J. (1953). Logic and Psychology. Manchester: Manchester University Press.
Piaget, J. (1967). Biology and Knowledge: An Essay on the Relations Between Organic Regulations and Cognitive Processes. Translated (1971). Chicago: University of Chicago Press.
Piaget, J. (1968). Structuralism. Translated (1971). New York: Harper and Row.
Piaget, J. (1970). The Principles of Genetic Epistemology. Translated (1972). New York: Basic Books.
Piaget, J. (1975). The Development of Thought: Equilibration of Cognitive Structures. Translated (1977). New York: Viking Press.
Piaget, J., and J.-C. Bringuier. (1977). Conversations with J. Piaget. Translated (1980). Chicago: University of Chicago Press.
Piaget, J., and R. Garcia. (1987). Towards a Logic of Meanings. Translated (1991). Hillsdale, NJ: Erlbaum.
Piaget, J., and B. Inhelder. (1941). The Child’s Construction of Quantities: Conservation and Atomism. Translated by Arnold J. Pomerans (1974). New York: Basic Books.
Piaget, J., and A. Szeminska. (1941). The Child’s Conception of Number. Translated (1965). London: Routledge and Kegan Paul.

Further Readings

Gruber, H., and J. Vonèche. (1966). The Essential Piaget: An Interpretative Reference and Guide. Northvale, NJ: J. Aronson.
Chapman, M. (1988). Constructive Evolution: Origins and Development of Piaget’s Thought. Cambridge: Cambridge University Press.

Pictorial Art and Vision

Pictorial art attempts to capture the three-dimensional structure of a scene: some chosen view of particular objects, people, or a landscape. The artist’s goal is to convey a message about the world around us, but we can also find in art a message about the workings of the brain. Many look to art for examples of pictorial depth cues (perspective, occlusion, TEXTURE gradients, and so on), as these are the only cues available for depth in pictures. DEPTH PERCEPTION based on binocular disparity, vergence, and accommodation is inappropriate for the depths depicted, and head movements no longer provide new views of the scene. However, pictorial cues are abundant in real scenes (that is why they work in pictures), and there is no obvious benefit in studying their effectiveness in art as opposed to their effectiveness in natural scenes.

And yet pictorial art can tell us a great deal about vision and the brain if we pay attention to the ways in which paintings differ from the scenes they depict. First of all, we learn that artists get away with a great deal: impossible colors, inconsistent shading and shadows, inaccurate perspective, the use of lines to stand for sharp discontinuities in depth or brightness. These representational “errors” do not prevent human observers from perceiving robust three-dimensional forms. Art that captures the three-dimensional structure of the world without merely recreating or copying it offers a revealing glimpse of the shortcuts and economies of the inner codes of vision. The nonveridicality of representation in art is so commonplace that we seldom question the reason why it works.

Figure 1. (a) An early example of outline drawing from France. (b) As you view this image from different angles, the changes in the distance from face to hand and in the shape of the head are subtle. A 3-D computer model of this scene would require large-scale relative motions and 3-D shape changes to maintain the 2-D view seen with changing viewpoints. (c) Impossible lighting, highlights, or shadows (note the overlapping cast shadows at the bottom) are difficult to spot in paintings, implying that human observers use a simplistic local model of light and shade.

A line drawing of a building or an elephant can convey its 3-D structure very convincingly, but remember that there are no lines in the real world corresponding to the lines used in the drawings. The surface occlusions, folds, or creases that are represented by lines in drawings are revealed by changes in, say, brightness or texture in the real world, and these changes have one value extending on one side and a different value on the other. This is not a line. It is not obvious why lines should work at all. The effectiveness of line drawings is not based simply on learned convention, passed on through our culture. This point has been controversial (Kennedy 1975; see Deregowski 1989, and its following comments), but most recent evidence suggests that line drawings are universally interpreted in the same way: infants (Yonas and Arterberry 1994), stone-age tribesmen (Kennedy and Ross 1975), and even monkeys (Itakura 1994) appear to be capable of interpreting line drawings as we do. Nor is it the case that the lines in line drawings just trace the brightness discontinuities in the image, because this type of representation is rendered meaningless by the inclusion of cast shadow and pigment contours. By a quirk of design or an economy of encoding, lines may be directly activating the internal code for object structure, but only object contours can be present in the drawing for this shortcut to work. The shortcut, discovered and exploited by artists, hints at the simplicity of the internal code that underlies the vision of 3-D structures. This code is both simpler than the 2½-D sketch of David MARR and sparser than the compact, reversible codes (Olshausen and Field 1996) that may reflect the workings of early areas of VISUAL CORTEX. Both artists and brains have found out which are the key contours necessary to represent the essential structure of an object. By studying the nature of lines used in line drawings, scientists too may eventually join this group.

Another aspect as commonplace and as informative as the effectiveness of lines is that pictures are flat and yet they provide consistent, apparently 3-D interpretations from a wide range of viewpoints. This is not only convenient for the artist, but also prime evidence that our impressions of a 3-D world are not supported by true, 3-D internal representations. If we had real 3-D vision, the scene depicted in a flat picture would have to distort grotesquely in 3-D space as we moved about the picture. To the contrary, however, objects in pictures seem reassuringly the same as we change our vantage point (with some interesting exceptions; see Gregory 1994). We don’t experience the distortions probably because the visual system does not generate a true 3-D representation of the object. It has some qualities of three dimensions but it is far from Euclidean. It may follow some other geometry, affine or nonmetric in nature (Todd and Reichel 1989; Busey, Brady, and Cutting 1990). The effectiveness of flat images is of course a boon to artists who do not have to worry about special vantage points and to film makers who can have theaters with more than one seat in them. It is also of great importance for understanding the internal representations of objects and space.

Finally, consider the enormous range of discrepancies between light and shade in the world and their renditions in art. When light and shade were introduced into art about 2,200 years ago, it was through the use of local techniques such as lightening a surface fold to make it come forward (a Greek technique described by Pliny the Elder; see Gombrich 1976 for a beautiful reinterpretation of this ancient presentation of painting techniques). These local techniques of shading, shadows, and highlights were applied with little thought to making them all consistent with a given light source, and yet they all work very well. Even 500 years ago, when the geometry of perspective was well understood, the geometry of light was still ignored. The resulting errors in light and shadow would be caught immediately by any analysis based on physical optics, but pass unnoticed by human observers. Modern artists with a full understanding of the physics of light and shade available to them often still choose inconsistencies in lighting either because it never matters much, or perhaps because it looks better.

Evidently, we as observers do not reconstruct a light source in order to recover the depth from shading and shadow; we do not act as optical geometers in the way that computer graphics programs can. We do not notice inconsistencies across different portions of a painting but recover depth cues locally. The message here is that in the real world, the information is rich and redundant, so we do not have to analyze the image much beyond a local region to resolve any ambiguities. When faced with the sparser cues of pictorial art, we do not adopt a larger region of analysis: the local cues are meaningful, albeit inconsistent with cues in other areas of the painting. To the advantage of the artist, the inconsistencies go unnoticed. And again, like many aspects of art, this discrepancy between the art and the scene it depicts informs us about the brain within us as much as about the world around us.

See also GESTALT PERCEPTION; ILLUSIONS; LIGHTNESS PERCEPTION; SHAPE PERCEPTION; STRUCTURE FROM VISUAL INFORMATION SOURCES; SURFACE PERCEPTION

–Patrick Cavanagh

References

Busey, T. A., N. P. Brady, and J. E. Cutting. (1990). Compensation is unnecessary for the perception of faces in slanted pictures. Perception and Psychophysics 48: 1–11.
Deregowski, J. B. (1989). Real space and represented space: Cross-cultural perspectives. Behavioral and Brain Sciences 12: 51–119.
Gombrich, E. H. (1976). The Heritage of Apelles. Oxford: Phaidon Press.
Gregory, R. (1994). Experiments for a desert island. Perception 23: 1389–1394.
Itakura, S. (1994). Recognition of line-drawing representations by a chimpanzee (Pan troglodytes). Journal of General Psychology 121: 189–197.
Kennedy, J. M. (1975). Drawings were discovered, not invented. New Scientist 67: 523–527.
Kennedy, J. M., and A. S. Ross. (1975). Outline picture perception by the Songe of Papua. Perception 4: 391–406.
Olshausen, B. A., and D. J. Field. (1996). Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature 381: 607–609.
Todd, J. T., and F. D. Reichel. (1989). Ordinal structure in the visual perception and cognition of smoothly curved surfaces. Psychological Review 96: 643–657.
Yonas, A., and M. E. Arterberry. (1994). Infants perceive spatial structure specified by line junctions. Perception 23: 1427–1435.

Further Readings

Gombrich, E. H. (1960). Art and Illusion. Princeton: Princeton University Press.
Gregory, R., J. Harris, P. Heard, and D. Rose. (1995). The Artful Eye. Oxford: Oxford University Press.
Kennedy, J. M. (1974). The Psychology of Picture Perception. San Francisco: Jossey-Bass Inc.
Maffei, L., and A. Fiorentini. (1995). Arte e Cervello. Bologna: Zanichelli Editore.
Willats, J. (1997). Art and Representation. Princeton: Princeton University Press.

Pitts, Walter

Walter Pitts was born in 1923, vanished from the scene in the late 1950s, and died at the end of the 1960s, having destroyed, as much as he could, any traces of his past existence. He is a peculiarly difficult subject for a biography because, although he remains a vividly haunting memory to those who knew him, he seems only a group delusion to others. At least that was the opinion of the neurologist Norman GESCHWIND.

Pitts appeared as a penniless 14-year-old at the University of Chicago in 1937, attended various classes, though unregistered, and was accepted by Rashevsky’s coterie as a very talented but mysterious junior. All that was known of him was that he came from Detroit, and that would be all that was known thereafter.

An autodidact, he read Latin, Greek, Sanskrit, and German (though did not speak them) and apparently was advanced well beyond his years in LOGIC. The last can be illustrated by a confirmable anecdote. In 1938 he appeared at the office of Rudolf Carnap, whose most recent book on logic had appeared the previous year. Without introducing himself, Pitts laid out his copy opened to a section annotated marginally, and proceeded to make critical comments on the material. Carnap, after initial shock, defended his work and engaged with Pitts in an hour or so of talk. Pitts then left with his copy. For several weeks, Carnap hunted through the university for “that newsboy who understood logic,” finally located him, and found a job for him, for Pitts had no funds and lived only on what he could earn from ghosting papers for other students.

In 1938, Pitts, Jerry Lettvin, and Hy Minsky (the future economist) formed a friendship that would endure over the years. When Lettvin went to medical school in 1939 at the University of Chicago, they would still meet often. In 1941, Warren MCCULLOCH came to the University of Illinois from Yale and Gerhardt von Bonin introduced Pitts and Lettvin to him. Thereafter Pitts joined the laboratory unofficially.

Pitts was homeless, Lettvin wanted to escape his family, and so McCulloch, together with his remarkable wife Rook, in spite of having four children already, brought the pair into their household. In late 1942, after weeks of reviewing the material in neurophysiology, Pitts told McCulloch of Leibniz’s dictum that any task which can be described completely and unambiguously by a finite set of terms can be performed by a logical machine. Six years earlier TURING had published his marvelous essay on the universal computing engine. The analogy of neurons (as pulsatile rather than two-state devices) to the elements of a logical machine was inescapable. By 1943 McCulloch and Pitts published their famous paper, “A Logical Calculus of the Ideas Immanent in Nervous Activity.” In 1947 they added the work “How We Know Universals.” It was an attempt to interpret the structure of cortex as providing the sort of net that could abstract form independent of scale.

In 1943 Pitts, visiting Lettvin (who was interning in Boston), met Norbert WIENER and was invited to come to MIT as a research assistant. By the beginning of 1944, Pitts had been taken by the Kellex Corporation (a branch of the atomic bomb project). In the late 1940s he returned to MIT and began a project extending the work of Caianiello (on two-dimensionally connected nets) to three-dimensionally connected arrays, an extremely difficult problem.

In 1951, Jerry Wiesner, at the behest of Wiener, invited McCulloch, Patrick Wall, and Lettvin to join the Research Laboratory of Electronics (RLE) as research associates. Despite the loss of status and income, the three accepted with the full enthusiasm of their wives. They and Pitts formed a new laboratory at RLE.

But in late 1952, Wiesner received a letter from Mexico City where Wiener and his wife were visiting Arthur Rosenblueth. Viciously phrased, it severed all relations with McCulloch’s group, which included Pitts. Only after a decade did Rosenblueth reveal what had set off this explosion. It had nothing to do with any substantive cause but was the result of a deliberate and cynical manipulation designed to sever Wiener’s connection with McCulloch and his group. The details are not edifying; Wiener was victimized as much as the group.

The effect on Pitts was devastating; he was the most vulnerable. Wiener had become the father he had never had. From that point on, Pitts went into a steep decline. He abandoned interest in the work, and though willing enough to help, lost all initiative. Nothing could be done to arrest his decline. Pitts would have nothing to do with any psychiatrist, even those whom he met at the Macy symposia and other such roundtables. He destroyed all of his past work that he could find and became a ghost long before he died.

On a personal level, Pitts was a wonderful friend, and an inexhaustible fount of knowledge about everything, the arts as much as the sciences. One asked him a serious question only if there was enough time to hear the full answer, which was sometimes several hours long, but never didactic, rather, extremely witty and tailored to the understanding of the inquirer.

All that vanished before the end of the 1950s. He died alone in a boarding house in Cambridge after doing his best for close to a decade to avoid being found by his friends. Nothing of his work was left. But beyond question his influence shaped much of the thought of the laboratory and the approach to physiology from a philosophical view.

See also AUTOMATA; CHURCH-TURING THESIS; NEURAL NETWORKS; VON NEUMANN

–Jerome Lettvin

References

McCulloch, W., and W. Pitts. (1943). A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics 5: 115–133. Reprinted in W. S. McCulloch (1965/1988), Embodiments of Mind. Cambridge, MA: MIT Press.
Pitts, W., and W. McCulloch. (1947). On how we know universals: The perception of auditory and visual forms. Bulletin of Mathematical Biophysics 9: 127–147. Reprinted in W. S. McCulloch (1965/1988), Embodiments of Mind. Cambridge, MA: MIT Press.

Further Readings

Anderson, J. A. (1996). From discrete to continuous and back again. In R. Moreno-Diaz and J. Mira-Mira, Eds., Brain Processes, Theories and Models: An International Conference in Honor of W. S. McCulloch 25 Years After His Death. Cambridge, MA: MIT Press.
Cull, P. (1996). Neural nets: classical results and current problems. In R. Moreno-Diaz and J. Mira-Mira, Eds., Brain Processes, Theories and Models: An International Conference in Honor of W. S. McCulloch 25 Years After His Death. Cambridge, MA: MIT Press.
Howland, R., J. Y. Lettvin, W. S. McCulloch, W. Pitts, and P. D. Wall. (1955). Reflex inhibition by dorsal root interaction. Journal of Neurophysiology 18: 1–17. Reprinted in W. S. McCulloch (1965/1988), Embodiments of Mind. Cambridge, MA: MIT Press.
Lettvin, J. (1989). Introduction. In R. McCulloch, Ed., Collected Works of Warren S. McCulloch, vol. 1. Salinas, CA: Intersystems Publications.
Lettvin, J., H. Maturana, W. McCulloch, and W. Pitts. (1959). What the frog’s eye tells the frog’s brain. Proceedings of the IRE 47: 1940–1951. Reprinted in W. S. McCulloch (1965/1988), Embodiments of Mind. Cambridge, MA: MIT Press.
Wall, P. D., W. S. McCulloch, J. Y. Lettvin, and W. H. Pitts. (1955). Effects of strychnine with special reference to spinal afferent fibres. Epilepsia Series 3, 4: 29–40. Reprinted in W. S. McCulloch (1965/1988), Embodiments of Mind. Cambridge, MA: MIT Press.

Planning

Planning is the process of generating (possibly partial) representations of future behavior prior to the use of such plans to constrain or control that behavior. The outcome is usually a set of actions, with temporal and other constraints on them, for execution by some agent or agents. As a core aspect of human intelligence, planning has been studied since the earliest days of AI and cognitive science. Planning research has led to many useful tools for real-world applications, and has yielded significant insights into the organization of behavior and the nature of reasoning about actions.

Early work on general problem solvers was combined with search methods being studied in operations research (e.g., branch-and-bound methods), and with research on representation and reasoning from predicate logic in various theorem proving methods (e.g., Green 1969), so that by the end of the 1960s some long-lasting methods were emerging (see HEURISTIC SEARCH and SITUATION CALCULUS).

In 1969, the Stanford Research Institute Problem Solver, or STRIPS (Fikes, Hart, and Nilsson 1971), represented application domain states in first-order logic, introduced a way to represent the actions of the domain in terms of changes to the world state, used means-ends analysis to identify goals and subgoals that needed to be solved as stepping stones to a solution, searched through a space of possible solutions, and employed a simple but effective representation of the actions possible in the domain, as STRIPS operators. Many of these techniques form the basis for later work in planning.

Many planning techniques are formulated as search problems. Early planning approaches, including STRIPS, used a search technique in which the nodes of the search represented application domain states directly and the search arcs were the domain actions that could transform those states. This is termed “application state space” or “situation space.” For example, the node state might represent the position of a robot waiter and the items on various tables:

    At(Robot,Counter) and On(Cup-a,Table-1) and On(Plate-a,Table-1),

and the action arcs might represent the movement of a robot waiter, or a pickup action of the robot:

    Operator Pickup(x)
    Preconditions: On(x,y) and At(Robot,y)
    Delete list: On(x,y)
    Add list: Held(x)

STRIPS operators represent an action as having preconditions that have to be satisfied in the state in which the action is performed, and a delete list and add list of effects that represent changes made to the state following the performance of the action.

Later approaches have concentrated on searching through a different space, that of partially defined plans. A search space node is a partial plan and an arc is a partial plan modification operator (PMO). For example, a PMO might ensure the satisfaction of a condition on some activity in the partial plan. Each node of the search space defines an entire set of possible plan elaborations that fit within the constraints in the partial plan. This method therefore can support “constraint posting” or “least commitment planning” in which decisions are postponed rather than a selection being made arbitrarily.
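The STRIPS operator representation and the situation-space search described above can be illustrated with a minimal sketch. It makes several assumptions that are not part of the entry: Python as the language, ground atoms stored as tuples in a frozenset, the variable y of Pickup bound explicitly by the caller, a hypothetical Move operator for the robot's movement, and a plain breadth-first search standing in for whatever search strategy a real planner would use.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    """A ground STRIPS operator: preconditions, delete list, add list."""
    name: str
    preconditions: frozenset
    delete_list: frozenset
    add_list: frozenset


def pickup(x, y):
    """Ground instance of Pickup(x) from the entry, with y bound to a place."""
    return Operator(
        name=f"Pickup({x})",
        preconditions=frozenset({("On", x, y), ("At", "Robot", y)}),
        delete_list=frozenset({("On", x, y)}),
        add_list=frozenset({("Held", x)}),
    )


def move(src, dst):
    """Hypothetical movement operator (an assumption, not given in the entry)."""
    return Operator(
        name=f"Move({src},{dst})",
        preconditions=frozenset({("At", "Robot", src)}),
        delete_list=frozenset({("At", "Robot", src)}),
        add_list=frozenset({("At", "Robot", dst)}),
    )


def applicable(state, op):
    """An operator is applicable when all its preconditions hold in the state."""
    return op.preconditions <= state


def apply_operator(state, op):
    """Successor state: remove the delete list, then add the add list."""
    if not applicable(state, op):
        raise ValueError(f"{op.name} is not applicable in this state")
    return (state - op.delete_list) | op.add_list


def plan(initial, goal, operators):
    """Breadth-first search in situation space: nodes are states, arcs are actions."""
    frontier = deque([(initial, [])])
    visited = {initial}
    while frontier:
        state, steps = frontier.popleft()
        if goal <= state:
            return steps
        for op in operators:
            if applicable(state, op):
                successor = apply_operator(state, op)
                if successor not in visited:
                    visited.add(successor)
                    frontier.append((successor, steps + [op.name]))
    return None  # no sequence of the given operators reaches the goal


# The example state from the entry.
initial = frozenset({
    ("At", "Robot", "Counter"),
    ("On", "Cup-a", "Table-1"),
    ("On", "Plate-a", "Table-1"),
})
operators = [
    move("Counter", "Table-1"),
    move("Table-1", "Counter"),
    pickup("Cup-a", "Table-1"),
    pickup("Plate-a", "Table-1"),
]

print(plan(initial, frozenset({("Held", "Cup-a")}), operators))
# ['Move(Counter,Table-1)', 'Pickup(Cup-a)']
```

In this sketch Pickup cannot fire in the initial state because At(Robot,Table-1) does not hold, so the search first has to find the movement arc. As in the entry's operator, nothing prevents the robot from holding several items at once; a fuller domain model would add a precondition for an empty hand.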
Early work in cognitive science sought to create general domain-independent problem solvers that exhibited some of the characteristics observed in human problem solving. The most influential early example was the General Problem Solver (GPS) proposed in 1959 (Newell and Simon 1963). It introduced techniques still in regular use today: means-ends analysis or goal directed problem solving, and finding “differences” between goal and current states.

The integration of powerful constraint management techniques alongside planning methods is possible within the partial-plan search framework (e.g., as in MOLGEN (Stefik 1981) for plan object constraints, Deviser (Vere 1981) and FORBIN (Dean, Firby, and Miller 1990) for temporal constraints, and SIPE (Wilkins 1988) for resource constraints). This means that planning and scheduling techniques can be intermixed (see TEMPORAL REASONING).

Partial-plan search spaces lend themselves very well to “refinement planning” approaches (Kambhampati, Knoblock, and Yang 1995), where an outline plan is refined to address outstanding flaws or issues. However, they also lend themselves to refining existing partial descriptions of a solution to a problem, to instantiating previously created generic plans, or to adapting plans drawn from case libraries (see CASE-BASED REASONING AND ANALOGY).

In the mid-1970s, NOAH (for Net of Action Hierarchies) (Sacerdoti 1977), and then “Nonlin” (the nonlinear planner built by Austin Tate) (Tate 1977), began to allow plans to be represented as partial orders on the actions contained within them, rather than insisting on the activities within plans being fully ordered. (Unfortunately, the terminology of the time led to partially ordered plans also being called “nonlinear” plans, which caused confusion with the “linear” and “nonlinear” planning approaches to the order in which goals and subgoals were solved in planners.) Some problems that had caused difficulty for earlier planners such as STRIPS were more easily addressed when a partially ordered plan representation was used, but it became more difficult to ensure that conditions on actions in the plan were satisfied from the effects of earlier actions. Potential interactions with parallel actions had to be resolved in a valid plan. Means to “protect” the condition establishment had to be added to planners (as introduced in HACKER; Sussman 1975). The “Nonlin” planner’s question answering procedure (Tate 1977) included means to decide whether a specified condition was already satisfied at a given point in a partially ordered network of actions and, if necessary, could propose orderings to be added to the plan to satisfy such a condition. This provides information that can support the protection of the plan’s causal structure (also called “goal structure” or “teleology”). This work was later used as the basis for the formalization of the “modal truth criterion” (Chapman 1987) used at the heart of later planners for condition establishment and protection.

Partially ordered planning (POP) algorithms are the basis for a number of modern planners such as SIPE (Wilkins 1988), O-Plan (Currie and Tate 1991), and UCPOP (Penberthy and Weld 1992).

The hierarchical organization of action descriptions is an important technique that may reduce complexity and significantly increase the scale of plans that can be generated in some domains. Most practical planners employ “hierarchical planning” methods. A library of action descriptions is maintained, some of which have a decomposition into a number of subactions at a more detailed level, and some of which are considered “primitive.” For example, in the robot waiter domain, a high-level “Cleartable” action may be decomposed into primitive subactions to move to a table, pick up an item on the table, move to the counter, and put down the item held in the robot hand. A higher-level action in the plan can then be replaced with some suitable decomposition into more detailed actions. This is sometimes referred to as “hierarchical task network” (HTN) planning.

HTN planning lends itself to the refinement planning model. An initial plan can incorporate the task specification, assumptions about the situation in which the plan is to be executed, and perhaps a partial solution. This can then be refined through the hierarchy into greater levels of detail while also addressing outstanding issues and flaws in the plan.

Researchers and technologists in the planning field have added many extra features to the basic STRIPS action representation over the years. These have included:

1. Abstraction of the levels of conditions and effects (as in ABSTRIPS (for ABstract STRIPS); Sacerdoti 1974) where a skeleton plan is first developed that addresses important preconditions before refining that to deal with other detailed preconditions; for example, in the robot waiter domain, we may first develop a plan that ignores details of the robot’s movements between tables.
2. The addition of resource, time, and spatial constraints to reflect the scheduling requirements on actions.
3. The use of universally quantified preconditions or effects; for example, a “Clearall” action to move all items on a given table to the counter.

The expressiveness of a planner’s action representation is a major contributor to the effectiveness of a planning system, but can also lead to very large search spaces if used in uncontrolled ways (see COMPUTATIONAL COMPLEXITY).

Techniques from knowledge engineering and knowledge acquisition are beginning to be used to improve the modeling and capture of information about planning domains. In common with experience in KNOWLEDGE-BASED SYSTEMS, the use of richer models of the application domain has been found to be beneficial, such as in search space pruning and guidance.

Planning as a field has branched out in recent years to include a wide range of research topics related to reasoning about activities. One important area of investigation involves planning for activities that take place in environments where the outcome of actions is uncertain. For example, the robot waiter “Pickup” action may fail if an object is too heavy. Some of the techniques used in “classical” AI planning assume “perfect” information about the outcomes of actions. However, there is also a great deal of work on coping with uncertainty during the planning process or during the execution of plans.

Conditional or “contingency” branches may be included in a plan to allow for the most likely scenarios. “Reactive planning” techniques can be used to select activities at execution time on the basis of the situation that a system finds itself in. Uncertainty has also been addressed by general-purpose algorithms for solving planning problems cast as Markov decision processes (Dean et al. 1995; see also MULTIAGENT SYSTEMS).

The volume Readings in Planning (Allen, Hendler, and Tate 1990) collects together many seminal papers that have documented the main advances in the field of planning. It presents a historical perspective to the work undertaken in this field. Several overviews in the Readings volume from different perspectives serve as an introduction to research on planning.

See also KNOWLEDGE REPRESENTATION; ROBOTICS; PROBLEM SOLVING; INTELLIGENT AGENT ARCHITECTURE

–Austin Tate

References

Allen, J. F., J. Hendler, and A. Tate. (1990). Readings in Planning. San Francisco: Kaufmann.
Chapman, D. (1987). Planning for conjunctive goals. Artificial Intelligence 32: 333–377.
Currie, K. W., and A. Tate. (1991). O-Plan: The Open Planning Architecture. Artificial Intelligence 52 (1): 49–86.
Dean, T., J. Firby, and D. Miller. (1990). Hierarchical planning involving deadlines, travel time and resources. Computational Intelligence 4 (4): 381–398.
Dean, T., L. Kaelbling, J. Kirman, and A. Nicholson. (1995). Planning under time constraints in stochastic domains. Artificial Intelligence 76 (1–2): 35–74.
Fikes, R. E., P. E. Hart, and N. J. Nilsson. (1971). STRIPS: A new approach to the application of theorem proving to problem solving. Artificial Intelligence 3 (4): 251–288.
Green, C. (1969). Application of theorem proving to problem solving. Proceedings of the First International Joint Conference on Artificial Intelligence (IJCAI-69). Washington, DC: IJCAII, pp. 219–239.
Kambhampati, S., C. Knoblock, and Q. Yang. (1995). Planning as refinement search: A unified framework for evaluating design tradeoffs in partial order planning. Artificial Intelligence 76: 167–238.
Newell, A., and H. A. Simon. (1963). GPS, a program that simulates human thought. In E. A. Feigenbaum and J. Feldman, Eds., Computers and Thought. New York: McGraw Hill, pp. 279–293.
Penberthy, J. S., and D. S. Weld. (1992). UCPOP: A sound, complete, partial order planner for ADL. Proceedings of Knowledge Representation 1992 (KR-92), pp. 103–114.
Sacerdoti, E. D. (1974). Planning in a hierarchy of abstraction spaces. Artificial Intelligence 5 (2): 115–135.
Sacerdoti, E. D. (1977). A Structure for Plans and Behavior. Amsterdam: Elsevier.
Stefik, M. J. (1981). Planning with constraints. Artificial Intelligence 16: 111–140.
Sussman, G. J. (1975). A Computer Model of Skill Acquisition. Elsevier/North-Holland.
Tate, A. (1977). Generating project networks. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI-77). San Francisco: Kaufmann.
Vere, S. (1981). Planning in time: Windows and durations for activities and goals. In IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 5. Los Alamitos, CA: IEEE Press.
Wilkins, D. (1988). Practical Planning. San Francisco: Morgan Kaufmann.

Further Readings

Drabble, B. (1996). Proceedings of the Third International Conference on Artificial Intelligence Planning Systems. Menlo Park, CA: AAAI Press.
Georgeff, M. P., and A. L. Lansky. (1986). Reasoning about actions and plans. In Proceedings of the 1986 Workshop, Timberline, Oregon. San Francisco: Morgan Kaufmann.
Kambhampati, S., and D. S. Nau. (1996). On the nature and role of modal truth criteria in planning. Artificial Intelligence 82: 129–155.
McAllester, D., and D. Rosenblitt. (1991). Systematic non-linear planning. In Proceedings of the Ninth National Conference on Artificial Intelligence (AAAI-91), vol. 2. Anaheim, CA: AAAI Press, pp. 634–639.
Russell, S. J., and P. Norvig. (1995). Artificial Intelligence: A Modern Approach. Englewood Cliffs, NJ: Prentice-Hall.
Simmons, R., and M. Veloso. (1998). Proceedings of the Fourth International Conference on Artificial Intelligence Planning Systems. Menlo Park, CA: AAAI Press.
Tate, A., Ed. (1996). Advanced Planning Technology: The Technological Achievements of the ARPA/Rome Laboratory Planning Initiative. Menlo Park, CA: AAAI Press.
Weld, D. (1994). An introduction to least-commitment planning. AI Magazine 15 (4): 27–61.
Zweben, M., and M. S. Fox, Eds. (1995). Intelligent Scheduling. San Francisco: Kaufmann.

Plasticity

See AUDITORY PLASTICITY; NEURAL DEVELOPMENT; NEURAL PLASTICITY

Play

See SOCIAL PLAY BEHAVIOR

Poetry

See METER AND POETRY

Polysynthetic Languages

Polysynthetic languages are languages that allow the formation of extremely long and complex words that are built up spontaneously out of many smaller parts. One such word can typically be the functional equivalent of an entire sentence in a language like English. For example, a speaker of the Mohawk language might make up the word wahonwatia’tawitsherahetkenhten’, and this would immediately be understood by other Mohawk speakers as meaning “She made the thing that one puts on one’s body ugly for him.” The term polysynthesis was coined in the late 1800s, when linguists began to develop typologies of natural languages based on knowledge of languages from outside Europe and the Middle East. For these early typologists, a synthetic language was one like Latin and Greek, which use affixes to express the structural and meaning relationships among the words in a sentence. A polysynthetic language, then, is one that carries this method of expression to an extreme (Boas 1911; Sapir 1921; the first important discussion of the concept is Humboldt 1836, although he doesn’t use this term). Polysynthesis is particularly associated with the languages of North America, Inuit and Aztec being two paradigm-defining cases. Nevertheless, it refers to a structural type of language, not a linguistic area: there are polysynthetic languages spoken in Australia, New Guinea, Siberia, and India, whereas many native American languages are not polysynthetic. The polysynthetic languages probably do not constitute a discrete type; rather there seems to be a continuum of languages determined by how much they rely on complex words to express various linguistic relationships.

The study of polysynthetic languages has been important for several reasons. First, they present an excellent way of exploring the relationships between the different branches of linguistics. In particular, ideas about the connections between SYNTAX and MORPHOLOGY are well studied by looking at these languages, because they seem to use a different division of labor from languages like English, with more burden on morphology and less on syntax. Thus, the study of such languages has led to new proposals about the relationship between these components (e.g., Sadock 1980, 1985; Baker 1988). These languages also raise interesting questions about the LEXICON and its relationship to both syntax and morphology, because it is clear that speakers of a polysynthetic language cannot possibly learn more than a tiny fraction of the expressions that count as words in their language.

However, the first and most important reason for studying polysynthetic languages is that they constitute one of the most extreme and “exotic” classes of language in a linguistic TYPOLOGY.

different ways in English syntax, English morphology, and Mohawk syntax, because of differences in whether HEAD MOVEMENT takes place (Baker 1988).

Another property of polysynthetic languages is that their verbs contain elements that indicate the person, number, and gender of both the subject and object. As a result, the verb, subject, and object can appear in any imaginable word order, in addition to the subject-verb-object order that is required in English. The subject and the object can also be left out entirely. (Languages with these properties are called nonconfigurational; Hale 1983.) All this contributes to the impression that these languages have no syntactic structure to speak of, in contrast to English. The view changes, however, once one realizes that the elements in the verb are really the equivalent of English pronouns (Jelinek 1984; Van Valin 1985; Bresnan and Mchombo 1987; Mithun 1987). Thus, the Mohawk sentence Sak wahahninu’ atyatawi is not
As such, they provide one of the strongest best compared to English “Sak bought a dress,” but rather to testing grounds for the validity of proposed LINGUISTIC UNI- a colloquial dislocated structure “Sak, he finally bought it, VERSALS. In this way, they become indirectly relevant to the dress.” Such structures allow some freedom of word questions about the INNATENESS OF LANGUAGE, because order (“That dress, he finally bought it, Sak”) and the omis- that idea implies that there must be many substantive fea- sion of noun phrases (“He bought it”) even in English. tures of natural language that are attested across the whole Baker (1996) argues that there are in fact many such human species (see also RATIONALISM VS. EMPIRICISM). abstract similarities between polysynthetic languages and Finally, polysynthetic languages raise questions about the more familiar ones. If this is correct, then polysynthetic lan- EVOLUTION OF LANGUAGE: their existence forces one to ask guages could actually give some of the most striking evi- how it is that human linguistic capacities are articulated dence in favor of linguistic universals, the innateness of enough to account for the ease of language acquisition and language, and a rationalist view. yet flexible enough to generate languages that are superfi- —Mark Baker cially so different. What linguists think they have learned about such mat- References ters from studying polysynthetic languages has varied over time. These languages contributed greatly to the impression Baker, M. (1988). Incorporation: A Theory of Grammatical Func- of Boas and Sapir that “language is a human activity that tion Changing. Chicago: University of Chicago Press. varies without assignable limit” (Sapir 1921), leading away Baker, M. (1996). The Polysynthesis Parameter. New York: Oxford from linguistic universals, innateness, and rationalism. University Press. However, more recent research has uncovered facts that Boas, F. (1911). Introduction. In F. 
Boas, Ed., Handbook of Ameri- point toward the opposite conclusion. For example, one can Indian Languages pp. 1–83. common aspect of polysynthesis is noun incorporation, Bresnan, J., and S. Mchombo. (1987). Topic, pronoun, and agree- whereby the noun referring to the thing affected by an ment in Chichewa. Language 63: 741–782. Hale, K. (1983). Warlpiri and the grammar of nonconfigurational action is expressed inside the verb, rather than as a separate languages. Natural Language and Linguistic Theory 1: 5–49. direct object (see THEMATIC ROLES and GRAMMATICAL Humboldt, W. von (1836). Über die Verschiedenheit des menschli- RELATIONS). Thus, in Mohawk one can say wa’kana’tara- chen Sprachbaues und ihren Einfluss auf die geistige Entwick- kwetare’ ‘It cut the bread,’ a single word that contains both lung des Menschengeschlechts. Berlin: Königliche Akademie na’tar ‘bread’ and kwetar ‘cut’. This is unlike English, der Wissenschaften. where one cannot naturally say It bread-cut. However, there Jelinek, E. (1984). Empty categories, case, and configurationality. is a point of similarity as well. English does allow affected Natural Language and Linguistic Theory 2: 39–76. objects and verbs to be compounded in other environments: Mithun, M. (1987). Is basic word order universal? In R. Tomlin, one can refer to a long, serrated knife as a bread-cutter, for Ed., Coherence and Grounding in Discourse. Amsterdam: Ben- example. Significantly, neither language allows a noun that jamins, pp. 281–328. Sadock, J. (1980). Noun incorporation in Greenlandic. Language refers to the cause of the event to be inside the verb. Thus, in 56: 300–319. Mohawk one cannot say wawasharakwetare’ (containing Sadock, J. (1985). Autolexical syntax: A proposal for the treatment ashar ‘knife’) for ‘The knife cut it’; neither in English could of noun incorporation and similar phenomena. Natural Lan- one call a sliced-up loaf a knife-cuttee. Moreover, there is guage and Linguistic Theory 3: 379–440. 
evidence that the verb and the affected object form a rela- Sapir, E. (1921). Language. New York: Harcourt Brace Jovanov- tively tight unit in the syntax of English (the verb phrase) to ich. the exclusion of the causer of the event (see X-BAR THE- Van Valin, R. (1985). Case marking and the structure of the ORY). Collating facts like these, one finds a true universal: Lakhota clause. In J. Nichols and A. Woodbury, Eds., Grammar affected objects form tighter constructions with verbs than Inside and Outside the Clause. Cambridge: Cambridge Univer- causers do. This universal property then manifests itself in sity Press, pp. 363–413. 656 Population-Level Cognitive Phenomena increased regionally during mental activity and concluded, Further Readings correctly we now know, that brain circulation changes selec- Bach, E. (1993). On the semantics of polysynthesis. Berkeley Lin- tively with neuronal activity. guistics Society 19: 361–368. At the close of World War II, Seymour Kety and his col- Baker, M. (1997). Complex predicates and agreement in polysyn- leagues opened the modern era of studies of brain circula- thetic languages. In A. Alsina, J. Bresnan, and P. Sells, Eds., tion and metabolism, introducing the first quantitative Complex Predicates. Stanford, CA: CSLI Publications, pp. methods for measuring whole-brain blood flow and metab- 247–288. olism in humans. The introduction by Kety’s group of an in Foley, W. (1991). The Yimas language of New Guinea. Stanford, vivo tissue autoradiographic measurement of regional CA: Stanford University Press. blood flow applicable only in laboratory animals (Kety Mithun, M. (1984). The evolution of noun incorporation. Lan- guage 60: 847–893. 1960; Landau et al. 1955) provided the first glimpse of Reinholtz, C., and K. Russell. (1994). Quantified NPs in pronomi- quantitative changes in blood flow in the brain related nal argument langauges: Evidence from Swampy Cree. North directly to brain function. 
This work clearly foretold what Eastern Linguistics Society 25: 389–403. was to come in the modern era of functional brain imaging Sadock, J. (1991). Autolexical Syntax. Chicago: University of Chi- with PET and MRI. cago Press. Soon after Kety and his colleagues introduced their quan- titative methods for measuring whole-brain blood flow and Population-Level Cognitive Phenomena metabolism in humans, David Ingvar, Neils Lassen, and their Scandinavian colleagues introduced methods applicable to humans that permitted regional blood flow measurements to See INTRODUCTION: CULTURE, COGNITION, AND EVOLUTION be made using scintillation detectors arrayed like a helmet over the head (Lassen et al. 1963). They demonstrated Positron Emission Tomography directly in normal human subjects that blood flow changed regionally during changes in brain functional activity. Emission tomography is a visualization technique in nuclear In 1973 Godfrey Hounsfield (Hounsfield 1973) intro- medicine that yields an image of the distribution of a previ- duced x-ray computed tomography (CT), a technique based ously administered radionuclide in any desired transverse upon principles presented in 1963 by Alan Cormack (Cor- section of the body. Positron emission tomography (PET) mack 1963, 1973). Overnight the way in which we looked at utilizes the unique properties of the annihilation radiation the human brain changed. Immediately, researchers envi- generated when positrons are absorbed in matter. It is char- sioned another type of tomography, positron emission acterized by the fact that an image reconstructed from the tomography, or PET (Hoffman et al. 1976; Ter-Pogossian et radioactive counting data is an accurate and quantitative al. 1975). representation of the spatial distribution of a radionuclide in With the introduction of PET (Hoffman et al. 1976; Ter- the chosen section. This approach is analogous to quantita- Pogossian et al. 
1975) a new era of functional brain mapping tive antoradiography performed in laboratory animals but began. The autoradiographic techniques for the measurement has the added advantage of allowing in vivo studies and, of blood flow (Kety 1960; Landau et al. 1955) and glucose hence, studies to be performed safely in human subjects. metabolism (Sokoloff et al. 1977) in laboratory animals PET, now along with MAGNETIC RESONANCE IMAGING could now be performed safely in humans (Raichle et al. (MRI), is at the forefront of cognitive neuroscience research 1983; Reivich et al. 1979). in normal humans. The signal used by PET and MRI in this Soon it was realized that highly accurate measurements research is based on the fact that changes in the cellular of brain functional anatomy in humans could be performed activity of the brain of normal, awake humans and unanes- with PET (Posner and Raichle 1994). While such functional thetized laboratory animals are invariably accompanied by brain imaging could be accomplished with either measure- changes in local blood flow (for reviews, see Raichle 1987, ments of blood flow or metabolism (Raichle 1987), blood 1998). While PET measures blood flow directly, functional flow became the favored technique with PET because it MRI or fMRI as it is now called, relies on the local changes could be measured quickly (in less than one minute) using an easily produced radiopharmaceutical (H215O) with a in magnetic field properties occurring in the brain that result from changes in the blood flow that exceed changes in oxy- short half-life (123 sec) which allowed many repeat mea- gen consumption (Raichle 1998). This is known as the surements in the same subject (Raichle 1998). blood oxygen level–dependent or BOLD signal. 
The study of human cognition with PET was aided This robust, empirical relationship between blood flow greatly by the involvement of cognitive psychologists in the and brain function has fascinated scientists for well over a 1980s whose experimental designs for dissecting human hundred years. One has only to consult William JAMES’s behaviors using information processing theory fit extremely monumental two-volume text Principles of Psychology well with the emerging functional brain imaging strategies (James 1890) on page 97 of the first volume to find refer- (Posner and Raichle 1994). As a result of collaboration ence to changes in brain blood flow during mental activities. among neuroscientists, imaging scientists, and cognitive psy- He references primarily the work of the Italian physiologist chologists, a distinct behavioral strategy for the functional Angelo Mosso (1881) who recorded the pulsation of the mapping of neuronal activity emerged. This strategy was human cortex in patients with skull defects following neuro- based on a concept introduced by the Dutch physiologist surgical procedures. Mosso showed that these pulsations Franciscus C. Donders in 1868 (reprinted in Donders 1969). Positron Emission Tomography 657 Figure 1. Four different hierarchically organized conditions are complexity was increased from simply opening the eyes (row 1) represented in these mean blood flow difference images obtained through passive viewing of nouns on the television monitor (row 2); with PET. All of the changes shown in these images represent reading aloud the nouns as they appear on the screen (row 3); and increases over the control state for each task. A group of normal saying aloud an appropriate verb for each noun as it appeared on the subjects performed these tasks involving common English nouns screen (row 4). These horizontal images are oriented with the front (Petersen et al. 1988; Petersen et al. 
1989) to demonstrate the of the brain on top and the left side to the reader’s left. The marking spatially distributed nature of the processing by task elements going “Z = 40” indicates milimeters above and below a horizontal plane on in the normal human brain during a simple language task. Task through the brain marked “Z = 40”. Donders proposed a general method of measuring thought cific cognitive decisions) can be added or removed without processes based on a simple logic. He subtracted the time affecting ongoing processes (e.g., motor processes). Pro- needed to respond to a light (say, by pressing a key) from the cessing areas of the brain whose activity is differentially time needed to respond to a particular color of light. He altered at various stages of a hierarchically organized cogni- found that discriminating color required about 50 msec. In tive paradigm can be readily seen with imaging (figure 2). this way, Donders isolated and measured a mental process Clearly, extant data now provide many examples of areas of for the first time by subtracting a control state (i.e., respond- the brain active at one stage in a hierarchically designed par- ing to a light) from a task state (i.e., discriminating the color adigm which become inactive as task complexity is of the light). This strategy (figure 1) was first introduced to increased (for a recent review, see Raichle 1998). While functional brain imaging with PET in the study of single- changes of this sort are hidden from the view of the cogni- word processing (Petersen et al. 1988, 1989, 1990) but tive scientist they become obvious when brain imaging is quickly became the dominant approach to the study of all employed. aspects of human cognition with functional brain imaging. 
A final caveat with regard to imaging certain cognitive One criticism of this subtractive approach has been that paradigms is that the brain systems involved do not necessar- the time necessary to press a key after a decision to do so ily remain constant through many repetitions of the task has been made, for instance, is affected by the nature of the (e.g., see Raichle et al. 1994; Raichle 1998). While simple decision process itself. By implication, the nature of the habituation might be suspected when a task is tedious, this is processes underlying key press, in this example, may have not the issue referred to here. Rather, when a task is novel been altered. Although this issue (known in cognitive sci- and, more importantly, conflicts with a more habitual ence jargon as the assumption of pure insertion) has been response to the presented stimulus, major changes can occur the subject of continuing discussion in cognitive psychol- in the systems allocated to the task. Such changes have both ogy, it finds its resolution in functional brain imaging, practical and theoretical implications when it comes to the where changes in any process are directly signaled by design and interpretation of cognitive activation experiments. changes in observable brain states. Events occurring in the Functional brain imaging provides a unique perspective brain are not hidden from the investigator as in the purely on the relationship between brain function and behavior in cognitive experiments. Careful analysis of the changes in humans that is unavailable in the purely cognitive experi- the functional images reveals whether processes (e.g., spe- ments and, in many instances, unattainable in experiments 658 Positron Emission Tomography Figure 2. 
Hierarchically organized subtraction involving the same available in Figures 1 and 2 provides a fairly complete picture of the task conditions as shown in Figure 1 with the difference being that interactions between tasks and brain systems in hierarchically these images represent areas of decreased activity in the condition organized cognitive tasks when studied with functional brain as compared with the control condition. Combining the information imaging. restricted to laboratory animals. fMRI has greatly expanded Hoffman, E. J., M. E. Phelps, N. A. Mullani, C. S. Higgins, and M. M. Ter-Pogossian. (1976). Design and performance characteris- the work initiated with PET owing to its better spatial and tics of a whole-body positron tranxial tomograph. Journal of temporal resolution. Using fMRI it is now possible, for Nuclear Medicine 17: 493–502. example, to image the brain changes associated with single Hounsfield, G. N. (1973). Computerized transverse axial scanning cognitive events in individual subjects (Buckner et al. (tomography): Part I. Description of system. British Journal of 1996). Radiology 46: 1016–1022. One of the great challenges remaining in the use of func- James, W. (1890). Principles of Psychology. New York: Henry tional imaging with either PET or MRI is to understand Holt, pp. 97–99. more fully the relationship between brain blood flow and Kety, S. (1960). Measurement of local blood flow by the exchange brain function (Raichle 1998). on an inert diffusible substance. Methods in Medical Research See also CEREBRAL CORTEX; CORTICAL LOCALIZATION, 8: 228–236. Landau, W. M., W. H. Freygang Jr., L. P. Roland, L. Sokoloff, and HISTORY OF; ELECTROPHYSIOLOGY, ELECTRIC AND MAG- S. Kety. (1955). The local circulation of the living brain: Values NETIC EVOKED FIELDS; PSYCHOPHYSICS; UNITY OF SCIENCE; in the unanesthetized and anesthetized cat. Transactions of the SINGLE-NEURON RECORDING American Neurological Association 80: 125–129. —Marcus Raichle Lassen, N. 
A., K. Hoedt-Rasmussen, S. C. Sorensen, E. Skinhoj, B. Cronquist, E. Bodforss, and D. H. Ingvar. (1963). Regional cerebral blood flow in man determined by Krypton-85. Neurol- References ogy 13: 719–727. Buckner, R. L., P. A. Bandettini, K. M. O’Craven, R. L. Savoy, S. Mosso, A. (1881). Über den Kreislauf des Blutes im menschlichen E. Petersen, M. E. Raichle, and B. R. Rosen. (1996). Detection Gehirn. Leipzig: Verlag von Veit. of cortical activation during averaged single trials of a cognitive Petersen, S. E., R. T. Fox, M. I. Posner, M. Mintum, and M. E. task using functional magnetic resonance imaging. Proceedings Raichle. (1988). Positron emission tomographic studies of the of the National Academy of Sciences 93: 14878–14883. cortical anatomy of single-word processing. Nature 331: 585– Cormack, A. M. (1963). Representation of a function by its line 589. integrals, with some radiological physics. Journal of Applied Petersen, S. E., P. T. Fox, M. I. Posner, M. A. Mintun, and M. E. Physics 34: 2722–2727. Raichle. (1989). Positron emission tomographic studies of the Cormack, A. M. (1973). Reconstruction of densities from their processing of single words. Journal of Cognitive Neuroscience projections, with applications in radiological applications. 1: 153–170. Phys. Med. Biol. 18: 195–207. Petersen, S. E., P. T. Fox, A. Z. Snyder, and M. E. Raichle. (1990). Donders, F. C. (1869/1969). On the speed of mental processes. Activation of extrastriate and frontal cortical areas by visual Acta Psychologia 30: 412–431. words and word-like stimuli. Science 249: 1041–1044. Possible Worlds Semantics 659 Possible worlds semantics is used in compositional theo- Posner, M. I., and M. E. Raichle. (1994). Images of Mind. New York: W. H. Freeman. ries of meaning, where the meaning of a complex sentence is Raichle, M. E. (1987). Circulatory and metabolic correlates of to be obtained from the meaning of its parts (see COMPOSI- brain function in normal humans. In F. 
Plum, Ed., Handbook of TIONALITY). It developed from the languages of MODAL Physiology: The Nervous System V. Higher Functions of the LOGIC where the meaning of “p is true by necessity” (written Brain. Bethesda, MD: American Physiological Society, pp. Lp or p) is obtained from the meaning of p by specifying 643–674. the worlds in which Lp is true given the worlds in which p is. Raichle, M. E. (1998). Behind the scenes of function brain imag- To be specific, Lp is true in w provided p is true in every w' ing: A historical and physiological perspective. Proceedings of possible relative to w. Dual to necessity is possibility. “It is the National Academy of Sciences 95: 765–772. possible that p” (written Mp or ◊p) is true at a world w if p Raichle, M. E., W. R. W. Martin, P. Herscovitch, M. A. Mintun, itself is true in at least one w' possible relative to w. A more and J. Markham. (1983). Brain blood flow measured with intra- venous H215O. 2. Implementation and validation. Journal of elaborate example is found in the semantics of counterfac- tual sentences. Where p → q means that if p were the case, Nuclear Medicine 24: 790–798. then q would be too, then (on one account) p → q is true in Raichle, M. E., J. A. Fiez, T. O. Videen, A. K. MacLeod, J. V. Pardo, P. T. Fox, and S. E. Petersen. (1994). Practice-related a world w iff there is a world w' in which p and q are both changes in human brain functional anatomy during nonmotor true that is more similar to w than any world in which p is learning. Cerebral Cortex 4: 8–26. true but q is not. In studying these as logics it is customary Reivich, M., D. Kuhl, A. Wolf, J. Greenberg, M. Phelps, T. Ido, V. (depending on which logic is being studied) to set up first a Casella, E. Hoffman, A. Alavi, and L. Sokoloff. (1979). 
The [18F] flourodeoxyglucose method for the measurement of local structure in which relations are given to specify that one world is or is not possible relative to another, or that a world cerebral glucose utilization in man. Circulation Research 44: w1 is further from a world w2 than a world w3 is. But for 127–137. Sokoloff, L., M. Reivich, C. Kennedy, M. H. Des Rosiers, C. S. studying natural language we cannot assume that any partic- Patlak, K. D. Pettigrew, O. Sakurada, and M. Shinohara. ular words like “possibly” are in any way special. (1977). The [14C]deoxyglucose method for the measurement of Typically, an implementation of possible worlds seman- local glucose utilization: Theory, procedure and normal values tics for a language will require the language to be specified in the conscious and anesthetized albino rat. Journal of Neuro- by a system of rules that give the LOGICAL FORM of every chemistry 28: 897–916. sentence. Then values are assigned to the simple symbols of Ter-Pogossian, M. M., M. E. Phelps, E. J. Hoffman, and N. A. a sentence in logical form in such a way that a set of indices Mullani. (1975). A positron-emission tomograph for nuclear (worlds, times, speaker, and whatever else is involved in the imaging (PET). Radiology 114: 89–98. meaning of the sentence) emerges as the meaning of the final sentence. 
Thus in the sentence “Possibly Felix lives,” the Possible Worlds Semantics name “Felix” will have a person Felix as value, the verb “lives” will have as its value an operation that associates with The use of possible worlds as a part of a semantic theory of an individual (in this case Felix) the set of worlds (and times) natural language is based on the truth-conditional theory of at which that individual lives; the adverb “possibly” will meaning, that is, that the meaning of a sentence in a language have as its value an operation that associates with a set of is constituted by the conditions under which that sentence is worlds (in this case the set of worlds in which it is the case true. On this view, to know the meaning of a sentence is to that Felix lives) another set of worlds, in fact all the worlds know what the world would have to be like if that sentence from which the worlds in the first set are possible. The final were true. If the way the world is construed as the actual sentence will then be true in a world w if there is a world w' world, then other ways the world could be may be thought of possible relative to w, such that w' is in the set assigned to as alternative possible but nonactual worlds. Knowing how “Felix lives,” that is, in the set of worlds in which Felix lives. the world would be if a particular sentence were true does not To deal with tensed languages worlds can be thought of require knowledge of whether it is true, because in a given as worlds at times. More neutrally these are called “semanti- world w, a person need not know in w that w is the actual cal indices.” Possible worlds semantics requires supplemen- world. Thus, if I know the meaning of “Wellington is the cap- tation by generalizing such indices in various ways. 
Thus, to ital of New Zealand,” I do not have to know whether in fact it interpret “I” in a sentence like “I’d like an apple,” one needs is the capital, but I do have to know what it would be like for an index to supply a speaker (or someone regarded as the it to be the capital. In possible worlds terms I have to know of speaker). To interpret a sentence like “Everyone is present” any given world w, whether w is a world in which the sen- one requires an index to supply a domain of people, because tence is true or whether w is a world in which it is false, but I the sentence is presumably not intended to claim that every- do not have to know whether w is the actual world. To know one in the world is present, but only everyone in some con- which world is actual would be to be omniscient. textually provided universe. Our language has to be able to talk about things that may Possible worlds semantics abstracts from many features not exist. In a sentence that has become rather famous in the of linguistic behavior that have sometimes been thought semantical literature, “Someone seeks a unicorn,” there important, though the extent to which this should be done need be no particular unicorn that is being sought, and so in can be controversial. Thus, for some possible worlds theo- some way the idea of a unicorn, a creature that does not rists the ascription of truth conditions to a sentence is actually exist, has to be involved in the content of that sen- intended to be completely neutral on the question of what tence—a sentence, moreover, that all of us understand. an utterance of that sentence is being used to do. It might 660 Poverty of the Stimulus Arguments be being used to report a fact or issue an order or ask a Poverty of the Stimulus Arguments question. Other theorists may be more hesitant to speak of nondeclaratives as having truth conditions. 
But perhaps more importantly for cognitive science, the ascription of truth conditions to a sentence is neutral on the question of just how those truth conditions are represented in the mind of a speaker. It is concerned with the question of how to categorize what constitutes a representation’s having a certain content, not with the nature of the representation itself.

Possible worlds semantics as such can be neutral on the metaphysical status of possible worlds. At one extreme is the view that other possible worlds are just as real as the actual world. At another extreme is the view that possible worlds are no more than linguistic descriptions of how the world might be. For certain limited purposes, as for example in describing the language of a computer where the possibilities that can be represented are fixed and limited, it may be plausible to consider worlds to be descriptions. But it is plausible to claim that a general theory of MEANING should not presuppose any particular way of representing worlds.

See also INTENTIONALITY; LOGIC; PRAGMATICS; PROPOSITIONAL ATTITUDES; SEMANTICS; TENSE AND ASPECT

-- Max Cresswell

Further Readings

Cresswell, M. J. (1973). Logics and Languages. London: Methuen.
Cresswell, M. J. (1978). Semantic competence. In F. Guenthner and M. Guenthner-Reutter, Eds., Meaning and Translation. London: Duckworth, pp. 9–43. Reprinted in M. J. Cresswell, Semantical Essays (1988). Dordrecht: Kluwer, pp. 12–33.
Cresswell, M. J. (1985). Structured Meanings. Cambridge, MA: MIT Press/Bradford Books.
Cresswell, M. J. (1994). Language in the World. Cambridge: Cambridge University Press.
Lewis, D. K. (1972). General semantics. In D. Davidson and G. Harman, Eds., Semantics of Natural Language. Dordrecht: Reidel, pp. 169–218.
Lewis, D. K. (1973). Counterfactuals. Oxford: Blackwell.
Lewis, D. K. (1975). Languages and language. In K. Gunderson, Ed., Language, Mind and Knowledge. Minneapolis: University of Minnesota Press, pp. 3–35.
Lewis, D. K. (1986). On the Plurality of Worlds. Oxford: Blackwell.
Loux, M. J., Ed. (1979). The Possible and the Actual. Ithaca: Cornell University Press.
Putnam, H. (1975). The meaning of “meaning.” In K. Gunderson, Ed., Language, Mind and Knowledge. Minneapolis: University of Minnesota Press, pp. 131–193. Reprinted in H. Putnam, Mind, Language and Reality (1975). Cambridge: Cambridge University Press, pp. 215–271.
Schiffer, S. (1987). Remnants of Meaning. Cambridge, MA: MIT Press.
Stalnaker, R. C. (1968). A theory of conditionals. In N. Rescher, Ed., Studies in Logical Theory. Oxford: Blackwell, pp. 98–112.
Stalnaker, R. C. (1978). Assertion. In P. Cole, Ed., Syntax and Semantics, vol. 9: Pragmatics. New York: Academic Press, pp. 315–332.
Stalnaker, R. C. (1984). Inquiry. Cambridge, MA: MIT Press.
Stalnaker, R. C. (1989). On what’s in the head. In J. E. Tomberlin, Ed., Philosophical Perspectives 3: Philosophy of Mind and Action Theory. Atascadero, CA: Ridgeview Publishing Co., pp. 287–316.

The “poverty of the stimulus argument” is a form of the problem of the under-determination of theory by data, applied to the problem of language learning. Two other well-known problems of under-determination include Willard Van Orman Quine’s (1960) Gavagai example (a visitor to a foreign country sees a rabbit pass just as his informant utters the word “gavagai”; given only this evidence, “gavagai” might mean anything from “rabbit,” “furry,” or “nice day, isn’t it?” to “undetached part of rabbit”) and Nelson Goodman’s Grue paradox (why is it that we take our experience in which all emeralds that we have thus far observed have been green to suggest that emeralds are green rather than the (equally confirmed so far) possibility that emeralds are grue, namely “green when examined before the year 2000, and blue when examined thereafter”?).

Learning a language involves going beyond the data: a child hears only a finite number of sentences, yet learns to speak and comprehend sentences drawn from a grammar that can represent an infinite number of sentences. The trouble that the child faces is thus a problem of under-determination: any finite set of example sentences is compatible with an infinite number of grammars. The child’s task is to pick among those grammars.

The term “poverty of the stimulus” itself is relatively recent, perhaps first used by Noam Chomsky (1980: 34); but the argument as applied to language learning goes at least as far back as Chomsky’s (1959) review of B. F. Skinner’s Verbal Behavior. The exact formulation of the argument varies (Chomsky 1980; Crain 1991; Garfield 1994; Wexler 1991), but a typical version states that (1) children rapidly and, to first approximation, uniformly acquire language; (2) children are only exposed to a finite amount of data; yet (3) children appear to converge on a grammar capable of interpreting unfamiliar sentences; the conclusion is often argued to be that some aspect of grammar is innate.

Although the poverty of the stimulus argument is sometimes described in conjunction with the claim that children do not receive correction for their grammatical errors (for a recent review of the role of parental correction in the acquisition of grammar, see Marcus 1993), it is important to reject the notion that nativist explanations of language acquisition depend on the lack of parental correction. Even if parents did provide reliable correction to their children, innate constraints on the generalizations that children make would be necessary, because many plausible errors simply never occur. For instance, children never go through a period where they erroneously form yes-no questions by moving the first “is” to the front of the sentence. Although one can turn “The man is hungry” into “Is the man hungry?”, children never, by a false analogy, turn “The man who is hungry is ordering dinner” into “Is the man who hungry is ordering dinner?” (e.g., Chomsky 1965; Crain and Nakayama 1987).

More generally, at every stage of LANGUAGE ACQUISITION (inferring the meaning of a new word or morpheme, creating a morphological or syntactic rule, or determining the subcategorization frame of a new verb), the child can make an infinity of logically possible generalizations, regardless of whether negative evidence exists. But children do not simply cycle through all logical possibilities and check to see what their parents say about each one; their choice of hypotheses instead must, in part, be dictated by innately given learning mechanisms. The open question is not whether there are innately given constraints, but rather whether those constraints are specific to language.

An excellent example of the poverty of the stimulus argument comes from Peter Gordon’s (1985) work on the relation between plural formation and compounding. Paul Kiparsky (1983) had noted that while irregular plurals can appear in compounds (mice-infested), regular plurals sound awkward inside of compounds (rats-infested), perhaps because the design of the grammar is such that the process of compounding can only use stored (irregular) plurals as input, whereas the process of compounding serves as input to the process of regular plural formation. If irregular plurals inside compounds were common, it would be easy to see how a general purpose learning device might learn the contrast between regulars and irregulars, but in fact, as Gordon noted, plurals inside compounds are rare. Given this, one might expect that children would not be able to systematically distinguish between regulars and irregulars appearing in compounds. But Gordon found that although children allow irregular plurals inside of compounds, they systematically exclude regulars from compounds; children say things like mice-eater, but not rats-eater. As Gordon put it, “it would seem that of all the hypotheses available, there would be little to persuade an open-minded learner to choose this, rather than some other path.” Instead, Gordon suggests, the child’s mind is structured such that it is predisposed to learn one kind of grammar rather than another.

Recently, some scholars have tried to use CONNECTIONIST APPROACHES TO LANGUAGE learning to challenge the poverty of the stimulus argument, but connectionist models cannot obviate the need for innate constraints; instead they would simply provide a different theory of what those constraints are. Different connectionist models differ from one another in their architecture, representational schemes, learning algorithms, and so forth; each model thus differs from every other model in its innate design (Marcus 1998a, 1998b). Advocates of radical connectionism often overlook the importance of these innate design features, but such models cannot refute the poverty of the stimulus argument; instead they can only show that (at most) the innate constraints are different in character than those suggested by other researchers. Moreover, such researchers have yet to provide any concrete example of a putatively unlearnable aspect of language that has been later shown to be learnable; hence their critique of the poverty of the stimulus argument is for now without much force.

See also INDUCTION; INNATENESS OF LANGUAGE; NATIVISM; RADICAL TRANSLATION; WORD MEANING, ACQUISITION OF

-- Gary Marcus

References

Chomsky, N. A. (1959). Review of Verbal Behavior. Language 35: 26–58.
Chomsky, N. A. (1965). Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
Chomsky, N. A. (1980). Rules and Representations. New York: Columbia University Press.
Crain, S. (1991). Language acquisition in the absence of experience. Behavioral and Brain Sciences 14.
Crain, S., and M. Nakayama. (1987). Structure dependence in grammar formation. Language 63: 522–543.
Garfield, J. L. (1994). Innateness. In S. Guttenplan, Ed., A Companion to the Philosophy of Mind. Oxford: Blackwell.
Gordon, P. (1985). Level-ordering in lexical development. Cognition 21: 73–93.
Kiparsky, P. (1983). Word-formation and the lexicon. In F. Ingemann, Ed., Proceedings of the 1982 Mid-American Linguistics Conference. Lawrence, KS: University of Kansas.
Marcus, G. F. (1993). Negative evidence in language acquisition. Cognition 46: 53–85.
Marcus, G. F. (1998a). Can connectionism save constructivism? Cognition 66: 153–182.
Marcus, G. F. (1998b). Rethinking eliminative connectionism. Cognitive Psychology 37(3).
Quine, W. V. O. (1960). Word and Object. Cambridge, MA: MIT Press.
Wexler, K. (1991). On the arguments from the poverty of the stimulus. In A. Kasher, Ed., The Chomskyan Turn. Oxford: Blackwell.

Pragmatics

Pragmatics is the study of the context-dependent aspects of MEANING that are systematically abstracted away from in the construction of LOGICAL FORM. In the semiotic trichotomy developed by Charles Morris, Rudolf Carnap, and C. S. Peirce in the 1930s, SYNTAX addresses the formal relations of signs to one another, SEMANTICS the relation of signs to what they denote, and pragmatics the relation of signs to their users and interpreters. Although some have argued for a pragmatics module within the general theory of speaker/hearer competence (or even a pragmatic component in the grammar), Sperber and Wilson (1986) argue that, like scientific reasoning (the paradigm case of a nonmodular, horizontal system), pragmatics cannot be a module, given the indeterminacy of the predictions it offers and the global knowledge it invokes (see MODULARITY AND LANGUAGE). In any case, a regimented account of language use facilitates a simpler, more elegant description of language structure. Those areas of context-dependent yet rule-governed aspects of meaning reviewed here include deixis, speech acts, presupposition, reference, and information structure; see also IMPLICATURE.

Pragmatics seeks to “characterize the features of the speech context which help determine which proposition is expressed by a given sentence” (Stalnaker 1972: 383). The meaning of a sentence can be regarded as a function from a context (including time, place, and possible world) into a proposition, where a proposition is a function from a possible world into a truth value. Pragmatic aspects of meaning involve the interaction between an expression’s context of utterance and the interpretation of elements within that expression. The pragmatic subdomain of deixis or indexicality seeks to characterize the properties of shifters, indexicals, or token-reflexives (expressions like “I,” “you,” “here,” “there,” “now,” “then,” “hereby,” tense/aspect markers, etc.) whose meanings are constant but whose referents vary with the speaker, hearer, time and place of utterance, style or register, or purpose of speech act (see Levinson 1983, chap. 2).

If pragmatics is “the study of linguistic acts and the contexts in which they are performed” (Stalnaker 1972: 383), speech-act theory constitutes a central subdomain. It has long been recognized that the propositional content of utterance U can be distinguished from its illocutionary force, the speaker’s intention in uttering U. The identification and classification of speech acts was initiated by Wittgenstein, Austin, and Searle. In an explicit performative utterance (e.g., “I hereby promise to marry you”), the speaker does something, that is, performs an act whose character is determined by her intention, rather than merely saying something. Austin (1962) regards performatives as problematic for truth-conditional theories of meaning, because they appear to be devoid of ordinary truth value; an alternate view is that a performative is automatically self-verifying when felicitous, constituting a contingent a priori truth like “I am here now.” Of particular linguistic significance are indirect speech acts, where the form of a given sentence (e.g., the yes/no question in “Can you pass the salt?”) belies the actual force (here, a request for action) characteristically conveyed by the use of that sentence. (See Levinson 1983, chap. 4, and Searle and Vanderveken 1985 for more on speech-act theory and its formalization.)

Although a semantic or logical presupposition is a necessary condition on the truth or falsity of statements (Frege 1892; Strawson 1950), a pragmatic presupposition is a restriction on the common ground, the set of propositions constituting the current context. Its failure or nonsatisfaction results not in truth-value gaps or nonbivalence but in the inappropriateness of a given utterance in a given context (Stalnaker 1974; Karttunen 1974). In presupposing Φ, I treat Φ as an uncontroversial element in the context of utterance; in asserting Ψ, I propose adding the propositional content of Ψ to the common ground or, equivalently, discarding ~Ψ from the set of live options, winnowing down the context set (possible worlds consistent with the shared beliefs of S[peaker] and H[earer]) by jettisoning worlds in which Ψ does not hold.

In stating “Even Kim left” I assert that Kim left while presupposing that others left and that Kim was unlikely to have left. Such presuppositions can be communicated as new information by a speaker who “tells his auditor something . . . by pretending that his auditor already knows it” (Stalnaker 1974: 202). S’s disposition to treat a proposition as part of the common ground, thereby getting H to adjust his model of the common ground to encompass it, is codified in Lewis’s rule of accommodation for presupposition (1979: 340): “If at time t something is said that requires presupposition P to be acceptable, and if P is not presupposed just before t, then, ceteris paribus and within certain limits, presupposition P comes into existence at t'.” Accommodation, a special case of Gricean exploitation, is generalized by Lewis to descriptions, modalities, vagueness, and performatives.

How are the presuppositions of a larger expression determined compositionally as a function of those of its subexpressions? Karttunen’s (1974) solution to this “projection problem” partitions operators into plugs, holes, and filters, according to their effect on presupposition inheritance, whereas Karttunen and Peters (1979) propose a formalization of inheritance of pragmatic presuppositions qua “conventional implicatures.” Gazdar (1979) offers an alternative mechanism in which the potential presuppositions induced by subexpressions are inherited as a default but are canceled if they clash with propositions already entailed or implicated by the utterance or prior discourse context.

Subsequent work identifies empirical and conceptual problems for these models. Heim (1983) identifies an operator’s projection properties with its context-change potential. Presuppositions are invariant pragmatic inferences: a sentence Σ presupposes φ if every context admitting Σ entails φ. If a context c (a conjunction of propositions) is true and c admits Σ, then Σ is true with respect to c if the context incremented by Σ is true. But if Σ is uttered in a context c not admitting it, the addressee will adjust c to c′, a context close to c but consistent with Σ. Heim’s projection theory thus incorporates Stalnaker-Lewis accommodation, which appeals in turn to the Gricean model of a cooperative conversational strategy dynamically exploited to generate pragmatic inferences (see DYNAMIC SEMANTICS).

Soames (1989) provides a conspectus of formal approaches to presupposition; see also van der Sandt (1992) for an anaphoric account of PRESUPPOSITION, projection, and accommodation formulated within discourse representation theory. On van der Sandt’s theory, the very presupposition that presuppositions are determined compositionally is challenged, leading to a reassessment of the entire projection-problem enterprise.

Although speech acts and presuppositions operate primarily on the propositional level, reference operates on the phrasal level. Reference is the use of a linguistic expression (typically a noun phrase) to induce a hearer to access or create some entity in his mental model of the discourse. A discourse entity represents the referent of a linguistic expression, that is, the actual individual (or event, property, relation, situation, etc.) that the speaker has in mind and is saying something about.

Within philosophy, the traditional view has been that reference is a direct semantic relationship between linguistic expressions and the real world objects they denote (see SENSE AND REFERENCE and REFERENCE, THEORIES OF). Researchers in computer science and linguistics, however, have taken a different approach, viewing this relation as mediated through the (assumed) mutual beliefs of speakers and hearers, and therefore as quintessentially pragmatic. Under this view, the form of a referring expression depends on the assumed information status of the referent, which in turn depends on the assumptions that a speaker makes regarding the hearer’s knowledge store as well as what the hearer is attending to in a given discourse context.

Given that every natural language provides its speakers with various ways of referring to discourse entities, there are two related issues in the pragmatic study of reference: (1) What are the referential options available to a speaker of a given language? (2) What are the factors that guide a speaker on a given occasion to use one of these forms over another? The speaker’s choice among referring expressions (e.g., zero forms, pronominals, indefinites, demonstratives, definite descriptions, proper names) is constrained by the information status of discourse entities. Unidimensional accounts (e.g., Gundel, Hedberg, and Zacharski 1993) provide a single, exhaustively ordered dimension (“assumed familiarity,” “accessibility,” “givenness”) along which the various types of referring expressions are arranged. More recently, Prince (1992) offers a two-dimensional account in which entities are classified as, on the one hand, either discourse-old or discourse-new (based on whether or not they have been evoked in the prior discourse) and, on the other hand, either hearer-old or hearer-new (based on whether they are assumed to be present within the hearer’s knowledge-store).

Related to information status is the notion of definiteness, which has been defined both as a formal marking of NPs and as an information status. Research into the meaning of the English definite article has generally been approached from one of two perspectives (Birner and Ward 1994); its felicitous use has been argued to require that the referent of the NP be either familiar within the discourse or uniquely identifiable to the hearer. In the absence of prior linguistic evocation, the referent must be accommodated (Lewis 1979) into the discourse model by the hearer.

Research into the discourse functions of syntax is based on the observation that every language provides its speakers with various ways to structure the same proposition. That is, a given proposition may be variously realized by a number of different sentence-types, or constructions, each of which is associated with a particular function in discourse. Consider the sentences in (1).

(1) a. John did most of the work on that project.
    b. Most of the work on that project was done by John.
    c. Most of the work on that project John did.
    d. It’s John who did most of the work on that project.

The same proposition expressed by the canonical word-order sentence in (1a) can also be expressed by the (truth-conditionally equivalent) passive sentence in (1b), by the topicalization in (1c), and by the cleft sentence in (1d), among others, each of which reflects the speaker’s view on how it is to be integrated by the hearer into the current discourse. For example, the topicalization (1c) allows the speaker to situate familiar, or discourse-old (Prince 1992), information in preposed position, thus marking the preposed constituent as related, or “linked,” to the prior discourse, whereas use of the cleft in (1d) reflects the speaker’s belief that her hearer has in mind the fact that somebody did most of the work in question. Finally, with the passive in (1b), in which the canonical order of arguments is reversed, the speaker may present information that is relatively familiar within the discourse before information that is relatively unfamiliar within the discourse.

Such constructions serve an information-packaging function in that they allow speakers to structure their discourse in a maximally accessible way, thereby facilitating the incorporation of new information into the hearer’s knowledge-store. Like referring expressions, propositions contain information that can be discourse-new or discourse-old, and hearer-new or hearer-old. Vallduví (1992) proposes a hierarchical articulation of information within his theory of informatics. Sentences are divided into the focus, which represents that portion of information that is hearer-new, and the ground, which specifies how that information is situated within the hearer’s knowledge-store. The ground is further divided into the link, which denotes an address in the hearer’s knowledge-store under which he is instructed to enter the information, and the tail, which provides further directions on how the information must be entered under a given address (see also Rooth 1992). Lambrecht (1994) identifies three categories of information structure: presupposition and assertion (the structure of propositional information into given and new); identifiability and activation (the information status of discourse referents); and topic and focus (the relative predictability of relations among propositions). (See also the Functional Sentence Perspective frameworks of Firbas 1966 and Kuno 1972 and the overview in Birner and Ward 1998.)

See also DISCOURSE; FIGURATIVE LANGUAGE; GRICE, H. PAUL; INDEXICALS AND DEMONSTRATIVES; PROPOSITIONAL ATTITUDES

-- Laurence Horn and Gregory Ward

References

Austin, J. L. (1962). How To Do Things With Words. Oxford: Clarendon Press.
Birner, B. J., and G. Ward. (1994). Uniqueness, familiarity, and the definite article in English. Berkeley Linguistics Society 20: 93–102.
Birner, B. J., and G. Ward. (1998). Information Status and Noncanonical Word Order in English. Amsterdam: Benjamins.
Firbas, J. (1966). Non-thematic subjects in contemporary English. Travaux Linguistiques de Prague 2: 239–56.
Frege, G. (1952/1892). On sense and reference. In P. Geach and M. Black, Eds., Translations from the Philosophical Writings of Gottlob Frege. Oxford: Blackwell, pp. 56–78.
Gazdar, G. (1979). Pragmatics. New York: Academic Press.
Gundel, J., N. Hedberg, and R. Zacharski. (1993). Givenness, implicature, and the form of referring expressions in discourse. Language 69: 274–307.
Heim, I. (1983). On the projection problem for presuppositions. WCCFL 2, pp. 114–25.
Karttunen, L. (1974). Presupposition and linguistic context. Theoretical Linguistics 1: 181–193.
Karttunen, L., and S. Peters. (1979). Conventional implicature. In C.-K. Oh and D. A. Dinneen, Eds., Syntax and Semantics 11: Presupposition. New York: Academic Press, pp. 1–56.
Kuno, S. (1972). Functional sentence perspective: A case study from Japanese and English. Linguistic Inquiry 3: 269–320.
Lambrecht, K. (1994). Information Structure and Sentence Form. Cambridge: Cambridge University Press.
Levinson, S. (1983). Pragmatics. Cambridge: Cambridge University Press.
Lewis, D. (1979). Scorekeeping in a language game. Journal of Philosophical Logic 8: 339–359.
Prince, E. F. (1981). Toward a taxonomy of given/new information. In P. Cole, Ed., Radical Pragmatics. New York: Academic Press, pp. 223–254.
Prince, E. F. (1992). The ZPG letter: Subjects, definiteness, and information-status. In S. Thompson and W. Mann, Eds., Discourse Description: Diverse Analyses of a Fundraising Text. Amsterdam: John Benjamins, pp. 295–325.
Rooth, M. (1992). A theory of focus interpretation. Natural Language Semantics 1: 75–116.
Searle, J., and D. Vanderveken. (1985). Foundations of Illocutionary Logic. Cambridge: Cambridge University Press.
Soames, S. (1989). Presupposition. In D. Gabbay and F. Guenthner, Eds., Handbook of Philosophical Logic, 4. Dordrecht: Reidel, pp. 553–616.
Sperber, D., and D. Wilson. (1986). Relevance. Cambridge, MA: Harvard University Press.
Stalnaker, R. (1972). Pragmatics. In D. Davidson and G. Harman, Eds., Semantics of Natural Language. Dordrecht: Reidel, pp. 380–397.
Stalnaker, R. (1974). Pragmatic presuppositions. In M. Munitz and P. Unger, Eds., Semantics and Philosophy. New York: New York University Press, pp. 197–214.
Strawson, P. F. (1950). On referring. Mind 59: 320–344.
Vallduví, E. (1992). The Informational Component. New York: Garland.
van der Sandt, R. A. (1992). Presupposition projection as anaphora resolution. Journal of Semantics 9: 333–378.

Further Readings

Ariel, M. (1990). Accessing Noun-Phrase Antecedents. London: Routledge.
Atlas, J. D. (1989). Philosophy Without Ambiguity. Oxford: Clarendon Press.
Birner, B. J. (1996). Form and function in English by-phrase passives. Chicago Linguistic Society 32.
Christophersen, P. (1939). The Articles: A Study of their Theory and Use in English. Copenhagen: Munksgaard.
Clark, H., and C. Marshall. (1981). Definite reference and mutual knowledge. In A. Joshi, B. Webber, and I. Sag, Eds., Elements of Discourse Understanding. Cambridge: Cambridge University Press, pp. 10–63.
Cole, P., Ed. (1981). Radical Pragmatics. New York: Academic Press.
Givón, T. (1979). On Understanding Grammar. New York: Academic Press.
Green, G. (1989). Pragmatics and Natural Language Understanding. Hillsdale, NJ: Erlbaum.
Grosz, B., and C. Sidner. (1986). Attention, intentions, and the structure of discourse. Computational Linguistics 12: 175–204.
Halliday, M. A. K. (1967). Notes on transitivity and theme in English, part 2. Journal of Linguistics 3: 199–244.
Hawkins, J. A. (1991). On (in)definite articles: Implicatures and (un)grammaticality prediction. Journal of Linguistics 27: 405–442.
Horn, L. R. (1986). Presupposition, theme and variations. Papers from the Parasession on Pragmatics and Grammatical Theory, Chicago Linguistic Society 22, pp. 168–92.
Horn, L. R. (1988). Pragmatic theory. In F. Newmeyer, Ed., Linguistics: The Cambridge Survey. Vol. 1, Linguistic Theory: Foundations. Cambridge: Cambridge University Press, pp. 113–145.
Kadmon, N. (1990). Uniqueness. Linguistics and Philosophy 13: 273–324.
Kuno, S. (1986). Functional Syntax: Anaphora, Discourse, and Empathy. Chicago: University of Chicago Press.
Morgan, J. (1978). Towards a rational model of discourse comprehension. Proceedings of Tinlap-2: Theoretical Issues in Natural Language Processing. New York: ACM and ACL, pp. 109–114.
Prince, E. F. (1978). A comparison of wh-clefts and it-clefts in discourse. Language 54: 883–906.
Prince, E. F. (1988). Discourse analysis: A part of the study of linguistic competence. In F. Newmeyer, Ed., Linguistics: The Cambridge Survey. Vol. 2, Linguistic Theory: Extensions and Implications. Cambridge: Cambridge University Press, pp. 164–182.
Rochemont, M., and P. Culicover. (1990). English Focus Constructions and the Theory of Grammar. Cambridge: Cambridge University Press.
Searle, J. (1969). Speech Acts. Cambridge: Cambridge University Press.
Searle, J. (1975). Indirect speech acts. In P. Cole and J. Morgan, Eds., Syntax and Semantics 3: Speech Acts. New York: Academic Press, pp. 59–82.
Sidner, C. (1979). Towards a Computational Theory of Definite Anaphora Comprehension in English Discourse. Ph.D. diss., MIT.
Ward, G. (1988). The Semantics and Pragmatics of Preposing. New York: Garland.
Webber, B. L. (1979). A Formal Approach to Discourse Anaphora. New York: Garland.
Webber, B. L. (1991). Structure and ostension in the interpretation of discourse deixis. Language and Cognitive Processes 6: 107–135.

Presupposition

There are two principal aspects of the MEANING conventionally conveyed by a linguistic expression, its presupposed content and its proffered content (Stalnaker 1979; Karttunen and Peters 1979; Heim 1982; Roberts 1996b). The proffered content is what we usually think of as the literal content of the expression: what is asserted by using a declarative sentence, the question raised by an interrogative, the command (or wish, or suggestion, etc.) posed by an imperative. It is the information that is treated as new by the speaker (or writer, etc.). The presupposed content is ostensibly old information: it is information that the speaker (behaves as if she) assumes is already known in the context of the discourse in progress. Hence, a presupposition imposes a requirement on the context of use, the requirement that it already contain the presupposed information. An older view of presuppositions treats them as semantic, that is, as entailments of both a sentence and its negation (Van Fraassen 1971). By contrast, the present notion is called pragmatic presupposition, because a presupposition so analyzed is an entailment not of the utterance itself, but of any context which satisfies the imposed requirement.

Consider (1), where the capital letters indicate emphasis on the subject.

(1) MARCIA has a bicycle, too.

(1) cannot be used without presuming that someone other than Marcia has a bicycle, demonstrating that this proposition is in some way conventionally associated with the utterance. This contrasts with conversational implicatures, which in general only arise when context interacts with the conventional content of an utterance in a certain way. That the presupposed proposition in (1) isn’t part of what’s directly asserted is suggested by the fact that if the addressee directly denies the speaker’s assertion of (1), for example replying “No,” he is not taken thereby to deny that someone other than Marcia has a bicycle but rather to deny that Marcia has a bicycle. In fact, such a reply implicitly acknowledges the truth of the presupposition, as reflected in the dilemma of the man under oath who must reply to the question “Have you stopped stealing bicycles?” The indirectness of presuppositions and their conventional (i.e., noncancellable) nature is also reflected in the behavior of examples like (1) under negation, interrogation, and conditional assumption, as in the following:

(2) It’s not as if MARCIA has a bicycle, too.
(3) Does MARCIA have a bicycle, too?
(4) If MARCIA has a bicycle, too, we can go for a ride by the river.

The retention in such variants of the presupposition that someone other than Marcia has a bicycle demonstrates that presuppositions are logically stronger than mere entailments, which are lost under negation, interrogation, or conditional assumption: here, what is directly entailed by (1), that Marcia has a bicycle, is not entailed by (2), (3), or (4).

There are many kinds of expressions that conventionally trigger presuppositions, including inflectional affixes, lexical items, and syntactic constructions. Among others in English, besides “too,” we find presuppositions conventionally associated with the possessive case, as in “Marcia’s bicycle” (which presupposes that Marcia has a bicycle), with the adverbials “only” and “even,” with factive predicates like “regret,” and with constructions like the pseudo-cleft construction, as illustrated by the following. (The reader may use the negated, interrogative, and conditional forms of these examples to test that the presuppositions noted are implicated.)

(5) Marcia even sold her BICYCLE.
    Presupposed: Her bicycle was one of the least likely things for Marcia to have sold, and there are other things that she sold.
    Asserted: Marcia sold her bicycle.
(6) Marcia regrets that she sold her bicycle.
    Presupposed: Marcia sold her bicycle.
    Asserted: Marcia regrets having done so.
(7) What Marcia sold is her bicycle.
    Presupposed: Marcia sold something.
    Asserted: Marcia sold her bicycle.

If the interlocutors in a conversation do not already agree on the truth of a presupposition associated with an utterance, then we say that the presupposition has failed to be satisfied in the context of utterance. Presupposition failure often leads to infelicity. Following the seminal work of Karttunen (1973) and Stalnaker (1974), an utterance with presupposition p is felicitous in context c iff c entails p. If one utters (1) in a conversation whose interlocutors don’t have in their common ground the information that someone besides Marcia has a bicycle, then it will sound distinctly odd. If the interlocutors haven’t been discussing what Marcia (perhaps among others) sold, then (7) will seem infelicitous. That the presupposition p itself needn’t have been directly asserted is demonstrated by the following type of example:

(8) a. All Dutch people own bicycles.
    b. Marcia is Dutch.
    c. She rides her bicycle to work.

(8c) contains the noun phrase “her bicycle,” with “her” anaphoric to “Marcia,” and hence presupposes that Marcia has a bicycle. This presupposition is satisfied in the context suggested, where (8a) and (8b) together (though neither alone) entail that Marcia owns a bicycle.

But presupposition failure doesn’t always lead to infelicity. Sometimes a cooperative addressee will be willing to accommodate the speaker by assuming the truth of the presupposed proposition after the fact, as if it had been true all along (Lewis 1979; Heim 1983; Thomason 1990). In such a case, we say that the addressee has accommodated the failed presupposition, saving the conversation from infelicity. This is not uncommon with factive verbs, such as “regret” in (6), and in fact gossips often use factives as a way of reporting juicy news while pretending it was already common knowledge.

The linguistic and philosophical literature on presupposition since the early 1970s has largely focused on the so-called projection problem: how to predict the presuppositions that a possibly complex sentence will inherit from the words and phrases that constitute it. Uttered out of the blue, (9) seems to presuppose that Marcia has a bicycle. However, this is only apparent, as we can see when we put (9) in the appropriate sort of context, illustrated by (9').

(9) Marcia believes that she sold her bicycle.
(9') Marcia is quite mad. Last week, she imagined that she had acquired a bicycle and a motorscooter. Now, she believes that she sold her bicycle.

Hence, the effect of the main verb “believe” in (9) is quite different from that of “regret” in the otherwise identical (6): “regret” always passes along the presuppositions of its sentential complement “she sold her bicycle” (that is, the proposition that Marcia had a bicycle) to the matrix sentence; we say that the complement sentence’s presuppositions are projected. But “believe” doesn’t necessarily do so; whether it does depends on the context of utterance.

Karttunen (1973) classifies so-called factive predicates like “regret” as holes to presupposition, because they pass along the presuppositions of their complements to become presuppositions of the clause of which they are the main verb. Other holes include negation (in (2)), the interrogative construction (in (3)), and the antecedent of a conditional (in (4)). Predicates like “say” are said to be presupposition plugs, not passing along any of the presuppositions of a complement; replacing “regrets” with “says” in (6) gets rid of the presupposition that Marcia sold her bicycle. Predicates like “believe” are said to be presupposition filters, since they only pass along their complement’s presuppositions under certain conditions. Filters include a number of syntactic constructions, as well as embedding predicates and other operators. The filtering behavior of the conditional construction is illustrated in (10):

(10) If Marcia sold her bicycle, then by now she regrets selling it.

The consequent of (10) carries the presupposition that Marcia sold her bicycle; but (10) as a whole does not presuppose this. Although one can utter (10) in a context in which it’s known that Marcia sold the bicycle, it would also be felicitous in a context in which we only know that she was contemplating doing so.

The merits of three principal types of theories of presupposition and of presupposition projection are currently being debated: satisfaction theories (Karttunen 1973; Stalnaker 1974, 1979; Heim 1983, 1992), cancellation theories (Gazdar 1979; Soames 1989), and anaphoric theories (Van der Sandt 1989, 1992). See Beaver 1998 for an excellent technical overview of current theory with extensive comparison. For historical overviews of the linguistic and philosophical literature on presupposition (including the important debate on the purported presuppositions of definite Noun Phrases in Frege 1892, Russell 1905, and Strawson 1950), the reader is referred to Levinson (1983) and Soames (1989). Evans (1977), Heim (1982), Kadmon (1990), and Neale (1990), among others, continue the Russell/Strawson debate. And Roberts (1996a) discusses a phenomenon dubbed modal

Lewis, D. (1979). Score-keeping in a language game. In B. Egli and A. von Stechow, Eds., Semantics from a Different Point of View. Berlin: Springer.
Neale, S. (1990). Descriptions. Cambridge, MA: MIT Press.
Roberts, C. (1996a). Anaphora in intensional contexts. In S. Lappin, Ed., Handbook of Semantics. Oxford: Blackwell, pp. 215–246.
Roberts, C. (1996b). Information structure in discourse: Towards an integrated formal theory of pragmatics. In J.-H. Yoon and A. Kathol, Eds., OSU Working Papers in Linguistics 49: Papers in Semantics. Ohio State University, pp. 91–136.
Russell, B. (1905). On denoting. Mind 66: 479–493.
Soames, S. (1989). Presupposition. In D. Gabbay and F. Guenthner, Eds., Handbook of Philosophical Logic. Vol. 4. Dordrecht: Reidel, pp. 553–616.
Stalnaker, R. C. (1974). Pragmatic presuppositions. In M. Munitz and P. Unger, Eds., Semantics and Philosophy. New York: New York University Press, pp. 197–219.
Stalnaker, R. C. (1979). Assertion. In P. Cole, Ed., Syntax and Semantics 9: Pragmatics. New York: Academic Press, pp. 315–332.
Strawson, P. F. (1950). On referring. Mind 59: 320–344.
Thomason, R. H. (1990).
Accommodation, meaning, and implica- ture: Interdisciplinary foundations for pragmatics. In P. R. subordination, which poses prima facie problems for most Cohen, J. Morgan, and M. E. Pollack, Eds., Intentions in Com- theories of presupposition munication. Cambridge, MA: MIT Press, pp. 325–363. See also ANAPHORA; GRICE, H. PAUL; IMPLICATURE; Van der Sandt, R. A. (1989). Presupposition and discourse struc- PRAGMATICS ture. In R. Bartsch, J. van Benthem, and P. van Emde Boas, Eds., Semantics and Contextual Expression. Dordrecht: Foris. —Craige Roberts Van der Sandt, R. A. (1992). Presupposition projection as anaphora resolution. Journal of Semantics 9(4): 333–377. Van Fraassen, B.C. (1971). Formal Semantics and Logic. New References York: Macmillan. Beaver, D. (1997). Presupposition. In J. van Benthem and A. ter Meulen, Eds., Handbook of Logic and Language. Amsterdam: Primate Amygdala Elsevier/Cambridge: MIT Press, pp. 939–1008. Chierchia, G. (1995). Dynamics of Meaning: Anaphora, Presuppo- sition, and the Theory of Grammar. Chicago: University of Chi- See AMYGDALA, PRIMATE cago Press. Evans, G. (1977). Pronouns, quantifiers and relative clauses (1). Primate Cognition Canadian Journal of Philosophy 7: 467–536. Reprinted in M. Platts, Ed., Reference, Truth, and Reality: Essays on the Philos- ophy of Language. London; Routledge and Kegan Paul, pp. Ludwig Wittgenstein remarked that if lions could speak we 255–317. Frege, G. (1892). Über Sinn und Bedeutung. Zeitschrift für Philos- would not understand them. David Premack (1986), follow- ophie und philosophische Kritik 22–50. ing this conceptual thread, commented that if chickens had Gazdar, G. (1979). Pragmatics: Implicature, Presupposition, and syntax they would have nothing much to say. The first com- Logical Form. New York: Academic Press. ment raises a methodological challenge, the second a con- Heim, I. (1982). The Semantics of Definite and Indefinite ceptual one. Studies of primate cognition have faced both. 
Noun Phrases. Ph.D. diss., University of Massachusetts, For some, monkeys and apes appear much smarter than Amherst. nonprimates. If so, why might this be the case? One domi- Heim, I. (1983). On the projection problem for presuppositions. In nant perspective suggests that social life has exerted extraor- M. Barlow, D. Flickinger, and M. Wescoat, Eds., Proceedings dinary pressure on brain structure and function, and has led of the Second Annual West Coast Conference on Formal Lin- to a mind that is capable of tracking dynamically changing guistics. Stanford University, pp. 114–125. Heim, I. (1992). Presupposition projection and the semantics of social relationships and political struggles (Byrne and attitude verbs. Journal of Semantics 9: 183–221. Whiten 1988; Cheney and Seyfarth 1990; Humphrey 1976; Kadmon, N. (1990). Uniqueness. Linguistics and Philosophy 13: Povinelli 1993). In primates—but few other species—indi- 273–324. viduals form coalitions to outcompete others, and following Karttunen, L. (1973). Presuppositions of compound sentences. aggressive attack, subordinates often reconcile their differ- Linguistic Inquiry 4: 169–193. ences with a dominant, engaging in exceptional acts of kind- Karttunen, L., and S. Peters. (1979). Conventional implicature. In ness and trust such as kissing and testicle holding (de Waal C. K. Oh and D. A. Dinneen, Eds., Syntax and Semantics 11: 1996; Harcourt and de Waal 1992). Such behavior, along Presupposition. New York: Academic Press, pp. 1–56. with apparent acts of deception (Hauser 1996), has provided Levinson, S. C. (1983). Pragmatics. Cambridge: Cambridge Uni- the foundation for experimental investigations of underlying versity Press. Primate Cognition 667 cognitive mechanisms. Here, I tackle three problems so as to cause-effect, identity, kinship) that contribute to the intrica- shed light on the architecture of the primate mind: (1) IMITA- cies of their social life. 
And primates are probably not even TION, (2) abstract CONCEPTS, and (3) mental state attribution. unique within the animal kingdom in terms of such concep- tual capacities, as other species have demonstrated compara- ble cognitive prowess (see review in Thompson 1995). Imitation In some monkey species, all chimpanzee populations, and Mental State Attribution one orangutan population—but in no gorilla populations— individuals use tools to gain access to food (Matsuzawa and Are primates intuitive psychologists in that they can reflect Yamakoshi 1996; McGrew 1992; Visalberghi and Fragaszy upon their own beliefs and desires and understand that oth- 1991). The observation that individuals within a population ers may or may not share such mental states? Consider the ultimately acquire the same tool-using technique has been following observation: a low-ranking male chimpanzee who taken as evidence that primates are capable of imitation. is about to mate sees a more dominant male approaching. There are, however, several paradoxical findings and contro- The low-ranking male covers his erect penis as the domi- versies over the interpretation of these observations (Heyes nant walks by. This kind of interaction—and there are thou- and Galef 1996; Tomasello and Call 1997; Byrne and Rus- sands of observations like this in the literature—suggests a son forthcoming). First, in some study populations, young capacity for intentional deception (see MACHIAVELLIAN require 5–10 years before they master tool technology. INTELLIGENCE HYPOTHESIS). If true, the following capaci- Although some of this can be accounted for by maturational ties must be in place: the ability to represent one’s own issues associated with motor control, one would expect beliefs and desires, the ability to understand perspective, faster acquisition if imitation, or a more effective teaching and the ability to attribute intentions to others. The evidence system, was in place (Caro and Hauser 1992). 
Second, most for each of these capacities is weak, at best, but the experi- experiments conducted in the lab have failed to provide evi- mental research program is only in its infancy. Studies using dence that naturally reared monkeys and apes can imitate mirrors suggest that all of the apes, and at least one monkey (Whiten and Ham 1992), although a recent set of studies on (cotton-top tamarins), respond to their reflection as if they chimpanzees and marmosets suggest that some of the previ- see themselves, rather than a conspecific (Gallup 1970; ous failures may be due to methodological problems rather Hauser et al. 1995; Povinelli et al. 1993). Self-recognition than conceptual ones (Heyes and Galef 1996). Third, and can be computed by perceptual mechanisms alone, whereas perhaps most paradoxical of all, apes reared by humans can self-awareness implies some access to one’s own beliefs and imitate human actions (reviewed in Tomasello and Call desires, how they can change, and how they might differ 1997). This suggests that the ape mind has been designed from those of another individual. The mirror test is blind to for imitation, but requires a special environment for its issues of awareness. emancipation—a conceptual puzzle that has yet to be Many animals, primates included, follow the direction of resolved (see SOCIAL COGNITION IN ANIMALS). eye gaze. However, current evidence suggests that neither monkeys nor apes understand that seeing provides a window into knowledge. A suite of experiments now show that mon- Abstract Concept keys and apes do not use eye gaze to infer what other indi- In several primates, the number of food calls produced is pos- viduals know, and thus do not alter their behavior as a itively correlated with the amount of food discovered function of differences in knowledge (Cheney and Seyfarth (reviewed in Hauser 1996). In chimpanzees, a group from 1990; Povinelli and Eddy 1996). 
Given that this capacity one community will kill a lone individual from another com- emerges in the developing child well before the capacity to munity, but will avoid others if there are two or more individ- attribute intentional states to others and that perspective tak- uals. Are such assessments based on an abstract conceptual ing plays such a critical role in mental state attribution, it system, akin to our number system? Recent experiments, seems unlikely that primates have access to a theory of using different experimental procedures, reveal that both apes mind. But we should withhold final judgment until addi- and monkeys have quite exceptional numerical skills tional experiments have been conducted. (reviewed in Gallistel 1990; Hauser and Carey forthcoming). The human primate once held hands with a nonhuman Thus, chimpanzees who have learned arabic numbers under- primate ancestor. But this phylogenetic coupling happened stand the primary principles of a count system (e.g., one-one 5–6 million years ago, ample time for fundamental differ- mapping, item indifference, cardinality) and can count up to ences to have emerged in the human branch of the tree. and label nine items (Boysen 1996; Matsuzawa 1996). Using Nonetheless, many features of the primate mind have been the violation of expectancy procedure designed for human left unchanged, including some capacity for imitation and infants, studies of rhesus monkeys and cotton-top tamarins some capacity to represent abstract concepts. The future lies have revealed that they can spontaneously carry out simple in uncovering the kinds of selective pressures that led to arithmetical calculations, such as addition and subtraction changes in and conservation of the general architecture of (Hauser and Carey forthcoming; Hauser, MacNeilage, and the primate mind. Ware 1996). 
Although nonhuman primates will never join the See also LANGUAGE AND THOUGHT; METAREPRESENTA- intellectual ranks of our mathematical elite, they clearly have TION; PRIMATE LANGUAGE; THEORY OF MIND access to an abstract number concept, in addition to other abstract concepts (e.g., transitivity, color names, sameness, —Marc D. Hauser 668 Primate Cognition References Thompson, R. K. R. (1995). Natural and relational concepts in ani- mals. In H. L. Roitblat and J. A. Meyer, Eds., Comparative Boysen, S. T. (1996). “More is less”: The distribution of rule-gov- Approaches to Cognitive Science. Cambridge, MA: MIT Press, erned resource distribution in chimpanzees. In A. E. Russon, K. pp. 175–224. A. Bard, and S. T. Parker, Eds., Reaching into Thought: The Tomasello, M., and J. Call. (1997). Primate Cognition. Oxford: Minds of the Great Apes. Cambridge: Cambridge University Oxford University Press. Press, pp. 177–189. Visalberghi, E., and D. Fragaszy. (1991). Do monkeys ape? In S. T. Byrne, R. W., and A. Russon. (Forthcoming). Learning by imita- Parker and K. R. Gibson, Eds., “Language” and Intelligence in tion: A hierarchical approach. Behavioral and Brain Sciences. Monkeys and Apes. Cambridge: Cambridge University Press, Byrne, R. W., and A. Whiten. (1988). Machiavellian Intelligence: pp. 247–273. Social Expertise and the Evolution of Intellect in Monkeys, Visalberghi, E., and L. Limongelli. (1996). Acting and understand- Apes and Humans. Oxford: Oxford University Press. ing: Tool use revisited through the minds of capuchin monkeys. Caro, T. M., and M. D. Hauser. (1992). Is there teaching in nonhu- In A. E. Russon, K. A. Bard, and S. T. Parker, Eds., Reaching man animals? Quarterly Review of Biology 67: 151–174. into Thought: The Minds of the Great Apes. Cambridge: Cam- Cheney, D. L., and R. M. Seyfarth. (1990). How Monkeys See the bridge University Press, pp. 57–79. World: Inside the Mind of Another Species. Chicago: University Whiten, A., and R. Ham. (1992). 
On the nature and evolution of of Chicago Press. imitation in the animal kingdom: Reappraisal of a century of de Waal, F. B. M. (1996). Good Natured. Cambridge, MA: Harvard research. In P. J. B. Slater, J. S. Rosenblatt, C. Beer, and M. University Press. Milinski, Eds., Advances in the Study of Behavior. New York: Gallistel, C. R. (1990). The Organization of Learning. Cambridge, Academic Press, pp. 239–283. MA: MIT Press. Gallup, G. G., Jr. (1970). Chimpanzees: Self-recognition. Science Further Readings 167: 86–87. Boesch, C., and H. Boesch. (1992). Transmission aspects of tool Harcourt, A. H., and F. B. M. de Waal. (1992). Coalitions and Alli- use in wild chimpanzees. In T. Ingold and K. R. Gibson, Eds., ances in Humans and Other Animals. Oxford: Oxford Univer- Tools, Language and Intelligence: Evolutionary Implications. sity Press. Hauser, M. D. (1996). The Evolution of Communication. Cam- Oxford: Oxford University Press. bridge, MA: MIT Press. Boysen, S. T., and G. G. Bernston. (1989). Numerical competence Hauser, M. D., and S. Carey. (1998). Building a cognitive creature in a chimpanzee. Journal of Comparative Psychology 103: 23– from a set of primitives: Evolutionary and developmental 31. insights. In D. Cummins and C. Allen, Eds., The Evolution of Bugnyar, T., and L. Huber. (1997). Push or pull: An experimental Mind. Oxford: Oxford University Press, pp. 51–106. study on imitation in marmosets. Animal Behaviour 54: 817– Hauser, M. D., J. Kralik, C. Botto, M. Garrett, and J. Oser. (1995). 831. Self-recognition in primates: Phylogeny and the salience of Byrne, R. (1996). The Thinking Ape. Oxford: Oxford University species-typical traits. Proceedings of the National Academy of Press. Sciences 92: 10811–10814. Byrne, R., and A. Whiten. (1990). Tactical deception in primates: Hauser, M. D., P. MacNeilage, and M. Ware. (1996). Numerical The 1990 database. Primate Report 27: 1–101. representations in primates. Proceedings of the National Acad- Cheney, D. L., and R. 
M. Seyfarth. (1988). Assessment of meaning emy of Sciences 93: 1514–1517. and the detection of unreliable signals by vervet monkeys. Ani- Heyes, C. M., and B. G. Galef. (1996). Social Learning and Imita- mal Behaviour 36: 477–486. tion in Animals. Cambridge: Cambridge University Press. Cheney, D. L., and R. M. Seyfarth. (1990). Attending to behaviour Humphrey, N. K. (1976). The social function of intellect. In P. P. G. versus attending to knowledge: Examining monkeys’ attribu- Bateson and R. A. Hinde, Eds., Growing Points in Ethology. tion of mental states. Animal Behaviour 40: 742–753. Cambridge: Cambridge University Press, pp. 303–321. Dasser, V. (1987). Slides of group members as representations of Matsuzawa, T. (1996). Chimpanzee intelligence in nature and in real animals (Macaca fascicularis). Ethology 76: 65–73. captivity: Isomorphism of symbol use and tool use. In W. C. Galef, B. G., Jr. (1992). The question of animal culture. Human McGrew, L. F. Nishida, and T. Nishida, Eds., Great Ape Societ- Nature 3: 157–178. ies. Cambridge: Cambridge University Press, pp. 196–209. Gallup, G. G., Jr. (1987). Self-awareness. In J. R. Erwin and G. Matsuzawa. T., and G. Yamakoshi. (1996). Comparision of chim- Mitchell, Eds., Comparative Primate Biology, vol. 2B. Behav- panzee material culture between Bossou and Nimba, West ior, Cognition and Motivation. New York: Alan Liss, Inc. Africa. In A. E. Russon, K. A. Bard, and S. T. Parker, Eds., Hauser, M. D. (1997). Tinkering with minds from the past. In M. Reaching into Thought: The Mind of the Great Apes. Cam- Daly, Ed., Characterizing Human Psychological Adaptations. bridge: Cambridge University Press. New York: Wiley, pp. 95–131. McGrew, W. C. (1992). Chimpanzee Material Culture. Cambridge: Hauser, M. D., and J. Kralik. (Forthcoming). Life beyond the mir- Cambridge University Press. ror: A reply to Anderson and Gallup. Animal Behaviour. Povinelli, D. J. (1993). Reconstructing the evolution of mind. Hauser, M. D., and P. 
Marler. (1993). Food-associated calls in American Psychologist 48: 493–509. rhesus macaques (Macaca mulatta). 1. Socioecological factors Povinelli, D. J., and T. J. Eddy. (1996). What young chimpanzees influencing call production. Behavioral Ecology 4: 194–205. know about seeing. Monographs of the Society for Research in Hauser, M. D., P. Teixidor, L. Field, and R. Flaherty. (1993). Food- Child Development 247. elicited calls in chimpanzees: Effects of food quantity and Povinelli, D. J., A. B. Rulf, K. R. Landau, and D. T. Bierschwale. divisibility? Animal Behaviour 45: 817–819. (1993). Self-recognition in chimpanzees (Pan troglodytes): Dis- Hauser, M. D., and R. W. Wrangham. (1987). Manipulation of tribution, ontogeny and patterns of emergence. Journal of Com- food calls in captive chimpanzees: A preliminary report. Folia parative Psychology 107: 347–372. Primatologica 48: 24–35. Premack, D. (1986). Gavagai! or the Future History of the Animal Matsuzawa, T. (1985). Use of numbers by a chimpanzee. Nature Language Controversy. Cambridge, MA: MIT Press. 315: 57–59. Primate Language 669 But how can one assess whether symbol meaningfulness Povinelli, D. J., K. E. Nelson, and S. T. Boysen. (1990). Inferences about guessing and knowing by chimpanzees (Pan troglodytes). and comprehension are present, given that apes can’t speak? Journal of Comparative Psychology 104: 203–210. There are several ways. First, the meaningfulness of sym- Povinelli, D. J., K. A. Parks, and M. A. Novak. (1991). Do rhesus bols with Sherman and Austin chimpanzees (Savage- monkeys (Macaca mulatta) attribute knowledge and ignorance Rumbaugh 1986) was documented by their symbol-based, to others? Journal of Comparative Psychology 105: 318–325. cross-modal matching. Without specific training, they could Premack, D. (1978). On the abstractness of human concepts: Why look at a lexigram and select the appropriate object, by it would be difficult to talk to a pigeon. In S. H. Hulse, H. 
touch, from others in a box into which they could not see. Fowler, and W. K. Konig, Eds., Cognitive Processes in Animal They also could label, by use of word-lexigrams, single Behavior. Hillsdale, NJ: Erlbaum, pp. 423–451. objects that they could feel but not see. Second, and more Tomasello, M., E. S. Savage-Rumbaugh, and A. Kruger. (1993). importantly, they learned word-lexigrams for the categories Imitative learning of actions on objects by children, chimpan- zees and enculturated chimpanzees. Child Development 64: of “food” and “tool” to which they appropriately sorted 17 1688–1706. individual lexigrams, each representing a specific food and Whiten, A. (1993). Evolving a theory of mind: The nature of non- implement. (Each lexigram represented either a food or a verbal mentalism in other primates. In S. Baron-Cohen, H. T. tool, such as a banana, magnet, cheese, lever, etc.) Thus, Flusberg, and D. J. Cohen, Eds., Understanding other minds. their lexigrams represented things not necessarily present— Oxford: Oxford University Press, pp. 367–396. the essence of SEMANTICS or word meaning. Wrangham, R. W., and D. Peterson. (1996). Demonic Males. New Comprehension became of special interest with the dis- York: Houghton Mifflin. covery that Kanzi, a bonobo (a rare species of chimpanzee, Pan paniscus), spontaneously learned the meanings of Primate Language word-lexigrams and later came to understand human speech—both single words and novel sentences of request Curiosity regarding apes’ capacity for language has a long (Savage-Rumbaugh et al. 1993). The discovery was made in history. From DARWIN’s nineteenth century postulations of the course of research with Matata, his adoptive mother. both biological and psychological continuities between ani- Matata’s essential failure in learning lexigrams was likely a mals and humans, to the more recent discovery (Sibley and reflection of her feral birth and development. 
Ahlquist 1987) that chimpanzee (Pan) DNA is more similar Though always present during Matata’s language train- to human than to gorilla (Gorilla) DNA, scientific findings ing, Kanzi was not taught; but later, when separated from have encouraged research into the language potential of her, it became clear that he had learned a great deal! Sponta- apes. A recent report (Gannon et al. 1998) that the chimpan- neously, he began to request and go get specific foods and zee planum temporale is enlarged in the left hemisphere, drinks, to label objects, and to announce what he was about with a humanlike pattern of Wernicke’s brain language-area to do with the appropriate lexigrams. homolog, will provide additional impetus. That area is held From that time forward, Kanzi was reared in an even basic to human language. Does, in fact, elaboration in the richer language-structured milieu. Caregivers commented chimpanzee’s planum temporale provide for language-rele- on events (present, future, and past) and particularly on vant processes or potential? Is its elaboration an instance of things of special interest to him. Where possible, caregivers homoplasy (convergent evolution)? Or is its function not used word-lexigrams as they spoke specific words. Kanzi necessarily related to language? was not required to use a keyboard to receive objects or to Language research with apes was revitalized in the 1960s participate in activities and was given no formal lessons. as Beatrix and Allen Gardner (Gardner, Gardner, and Cant- Kanzi quickly learned by observation how to ask to travel fort 1989) used a variation of American Sign Language to to specific sites in the forest, to play a number of games, to establish two-way communication with their chimpanzee, visit other chimps, to get and even cook specific foods, and Washoe, and as David Premack (Premack and Premack to watch television. 
He also commented on things and 1983) used an artificial language system of plastic tokens events and continued to announce eminent actions. In sharp with his chimpanzee, Sarah. In the 1970s, Sue Savage- contrast with our other apes, Kanzi also began to compre- Rumbaugh’s group (1977) developed a computer-monitored hend human speech—not just single words but also sen- keyboard of distinctive geometric patterns, called lexigrams, tences. to foster studies of language capacity with Lana, a chimpan- Consequently, Kanzi’s (8 yrs.) speech comprehension zee. Herbert Terrace’s (1979) chimpanzee Project Nim, Lynn was compared with that of a human child, Alia (2½ yrs.). In Miles’s (1990) orangutan Project Chantek, and Roger and controlled tests, they were given 415 novel requests—to Deborah Fouts’s (1989) project with Washoe and other chim- take a specific object to a stated location or person (“Take panzees obtained from the Gardners also started during the the gorilla (doll) to the bedroom”), to do something to an ’70s. object (“Hammer the snake!”), to do something with a spe- These projects initially emphasized language production. cific object relative to another object (“Put a rubber band on It was assumed that if an ape appropriately produced a sign your ball”), to go somewhere and retrieve a specific object then it must also understand its meaning. That assumption (“Get the telephone that’s outdoors”), and so on. An ever- proved unwarranted. Apes were proved capable of selecting changing variety of objects was present on each trial, and seemingly appropriate symbols without understanding their the ape and child were asked to fulfil various requests with meanings, even at a level grasped by 2- and 3-year-old chil- them. Each request was novel and had not been modeled by dren as they use words. Studies of comprehension ensued. others. 670 Primate Language Both Kanzi and Alia were about 70 percent correct in Miles, L. (1990). 
The cognitive foundations for reference in a sign- ing orangutan. In “Language” and Intelligence in Monkeys and carrying out the requests on their first presentation. As Apes: Comparative Developmental Perspectives. Cambridge: with the human child, Kanzi’s comprehension skills out- Cambridge University Press, pp. 511–539. paced those of production. Kanzi understood much more Premack, D., and A. J. Premack. (1983). The Mind of an Ape. New than he “could say.” Though his comprehension skills York: Norton Company. compared favorably with those of a 2½-year-old child, his Rumbaugh, D. M. (1977). Language Learning by a Chimpanzee: productive skills were more limited and approximate those The LANA Project. New York: Academic Press. of the average 1 to 1½-year-old child (Greenfield and Sav- Rumbaugh, D. M., E. S. Savage-Rumbaugh, and D. A. Washburn. age-Rumbaugh 1993). These major findings were repli- (1996). Toward a new outlook on primate learning and behav- cated in subsequent research with two other apes (Savage- ior: Complex learning and emergent processes in comparative Rumbaugh and Lewin 1994). psychology. Japanese Psychological Research 38: 113–125. Savage-Rumbaugh, E. S. (1986). Ape Language: From Condi- Thus, speech comprehension can be acquired spontane- tioned Response to Symbol. New York: Columbia University ously (e.g., without formal training) by apes if from birth Press. they are reared much as one would rear human children— Savage-Rumbaugh, E. S., and R. Lewin. (1994). Kanzi: The Ape at with language used by caregivers throughout the day to the Brink of the Human Mind. New York: Wiley. describe, to announce, and to coordinate social activities Savage-Rumbaugh, E. S., J. Murphy, R. A. Sevcik, K. E. Brakke, (i.e., feeding, traveling, and playing). S. Williams, and D. M. Rumbaugh. (1993). Language compre- These findings indicate that language acquisition (1) is hension in ape and child. 
Monographs of the Society for based in the social-communicative experiences of early Research in Child Development no. 233, 58: 3–4. infancy; (2) is based, first, in comprehension, not production Sibley, C. C., and J. E. Ahlquist. (1987). DNA hybridization evi- or speech; and (3) is based in the evolutionary processes that dence of hominoid phylogeny: Results from an expanded data set. Journal of Molecular Evolution 26: 99–121. have selected for primate taxa that have large and complex Terrace, H. S. (1979). Nim. New York: Alfred A. Knopf. brains (Rumbaugh, Savage-Rumbaugh, and Washburn 1996). Thus, the question should not be, “Do apes have lan- Further Readings guage?” Given a brain only about one third the size of ours, it would be unreasonable to expect the ape to have full com- Bates, E., D. Thal, and V. Marchman. (1991). Symbols and syntax: petence for language and its several dimensions. Rather, the A Darwinian approach to language development. In N. A. question should be, “Which aspects of language can they Krasnegor, D. M. Rumbaugh, R. L. Schiefelbush, and M. Stud- acquire, and under what conditions do they do so?” dert-Kennedy, Eds., Biological and Behavioral Determinants of Just as the discovery of even elementary forms of life on Language Development. Hillsdale, NJ: Erlbaum, pp. 29–65. Jerison, H. J. (1985). The evolution of mind. In D. A. Lakley, Ed., another planet will not be trivialized, the documentation of Brain and Mind. London: Methuen, pp. 1–33. elementary language competence in species other than Lieberman, P. (1984). The Biology and Evolution of Language. humans has significant implications for the understanding Cambridge, MA: Harvard University Press. of evolution and brain. Although the capacity for acquiring Matsuzawa, T. (1990). The Perceptual World of a Chimpanzee. even elementary language skills is surely limited among Project no. 63510057, Kyoto University, Kyoto, Japan. 
animal species, at least some ape, marine mammal, and Pepperberg, I. M. (1993). Cognition and communication in an avian species have capacities that include the abilities to African grey parrot (Psittacus erithacus): Studies on a nonhu- name, to request, to comprehend, and both to use and to man, nonprimate, nonmammalian subject. In H. Roitblat, L. M. comprehend symbols as representations of things and events Herman, and P. E. Nachtigall, Eds., Language and Communica- not necessarily present in space and time. These capacities tion: Comparative Perspectives. Hillsdale, NJ: Erlbaum, pp. 221–248. are inherently in the domain of language. Roitblat, H. L., L. M. Herman, and P. E. Nachtigall. (1993). Lan- See also DOMAIN SPECIFICITY; INNATENESS OF LANGUAGE; guage and Communication: Comparative Perspectives. Hills- LANGUAGE ACQUISITION; PRIMATE COGNITION; SYNTAX dale, NJ: Erlbaum. —Duane Rumbaugh and Sue Savage-Rumbaugh Rumbaugh, D. M. (1997). Competence, cortex and primate mod- els—a comparative primate perspective. In N. A. Krasnegor, G. R. Lyon, P. S. Goldman-Rakic, Eds., Development of the Pre- References frontal Cortex: Evolution, Neurobiology and Behavior. Balti- Fouts, R. S., and D. H. Fouts. (1989). Loulis in conversation with more, MD: Paul H. Brookes Publisher, pp. 117–139. cross-fostered chimpanzees. In R. A. Gardner, B. T. Gardner, Rumbaugh, D. M., and E. S. Savage-Rumbaugh. (1994). Language and T. E. Van Cantfort, Eds., Teaching Sign Language in Chim- in comparative perspective. In N. J. Mackintosh, Ed., Animal panzees. Albany: SUNY Press. Learning and Cognition. New York: Academic Press. Gannon, P. J., R. L. Holloway, D. C. Broadfield, and A. R. Braun. Savage-Rumbaugh, E. S., K. E. Brakke, and S. S. Hutchins. (1998). Asymmetry of chimpanzee planum temporale: Human- (1992). Linguistic development: Contrasts between co-reared like pattern of Wernicke’s brain language area homolog. Sci- Pan troglodytes and Pan paniscus. In T. Nishida, W. C. ence 279: 220–222. 
McGrew, P. Marler, M. Pickford, and F. B. M. deWaal, Eds., Gardner, R. A., B. T. Gardner, and T. E. Van Cantfort. (1989). Topics in Primatology. Vol. 1, Human Origins. Tokyo: Univer- Teaching Sign Language to Chimpanzee. New York: SUNY sity of Tokyo Press, pp. 51–66. Press. Schusterman, R. L., R. Gisiner, B. K. Grimm, and E. B. Hanggi. Greenfield, P. M., and E. S. Savage-Rumbaugh. (1993). Compar- (1993). Behavior control by exclusion and attempts at establish- ing communicative competence in child and chimp: The prag- ing semanticity in marine mammals using match-to-sample. In matics. Journal of Child Language 20: 1–26. H. Roiblat, L. M. Herman, and P. E. Nachtigall, Eds., Language Probabilistic Reasoning 671 judgment is coherent. Conversely, incoherent judgment and Communication: Comparative Perspectives. Hillsdale, NJ: Erlbaum, pp. 249–274. entails the holding of contradictory beliefs and leaves the Tuttle, R. H. (1986). Apes of the World: Their Social Behavior, person open to possible “Dutch books.” These consist of a Communication and Ecology. Park Ridge, NJ: Noyes Publica- set of probability judgments that, when translated into bets tions. that the person deems fair, create a set of gambles that the person is bound to lose no matter how things turn out (Osh- erson 1995; Resnik 1987; see also DECISION MAKING). Probabilistic Reasoning Note that coherent judgment satisfies a number of logi- cal, set-theoretic requirements. It does not insure that judg- Probabilistic reasoning is the formation of probability judg- ment is correct or even “well calibrated.” Thus, a person ments and of subjective beliefs about the likelihoods of out- whose judgment is coherent may nonetheless be quite fool- comes and the frequencies of events. The judgments that ish, believing, for example, that there is a great likelihood people make are often about things that are only indirectly that he or she will soon be the king of France. Normative observable and only partly predictable. 
Whether it is the probabilistic judgment needs to be not only coherent but weather, a game of sports, a project at work, or a new mar- also well calibrated. Consider a set of propositions each of riage, our willingness to engage in an endeavor and the which a person judges to be true with a probability of .90. If actions that we take depend on our estimated likelihood of she is right about 90 percent of these, then she is said to be the relevant outcomes. How likely is our team to win? How well calibrated. If she is right about less or more than 90 frequently have projects like this failed before? And what is percent, then she is said to be overconfident or underconfi- likely to ameliorate those chances? dent, respectively. Like other areas of reasoning and decision making, the A great deal of empirical work has documented system- study of probabilistic reasoning lends itself to normative, atic discrepancies between the normative requirements of descriptive, and prescriptive approaches. The normative probabilistic reasoning and the ways in which people reason approach to probabilistic reasoning is constrained by the about chance. In settings where the relevance of simple same mathematical rules that govern the classical, set- probabilistic rules is made transparent, subjects often reveal theoretic conception of probability. In particular, probability appropriate statistical intuitions. Thus, for example, when a judgments are said to be “coherent” if and only if they sat- sealed description is pulled at random out of an urn that is isfy conditions commonly known as Kolmogorov’s axioms: known to contain the descriptions of thirty lawyers and sev- (1) No probabilities are negative. (2) The probability of a enty engineers, people estimate the probability that the tautology is 1. (3) The probability of a disjunction of two description belongs to a lawyer at .30. 
In richer contexts, logically exclusive statements equals the sum of their however, people often rely on less formal considerations respective probabilities. And (4), the probability of a con- emanating from intuitive JUDGMENT HEURISTICS, and these junction of two statements equals the probability of the first, can generate judgments that conflict with normative require- assuming that the second is satisfied, times the probability of ments. For example, when a randomly sampled description the second. Whereas the first three axioms involve uncondi- from the urn sounds like that of a lawyer, subjects’ probabil- tional probabilities, the fourth introduces conditional proba- ity estimates typically rely too heavily on how representative bilities. When applied to hypotheses and data in inferential the description is of a lawyer and too little on the (low) prior contexts, simple arithmatic manipulation of rule (4) leads to probability that it in fact belongs to a lawyer. the result that the (posterior) probability of a hypothesis con- According to the representativeness heuristic, the likeli- ditional on the data is equal to the probability of the data hood that observation A belongs to class B is evaluated by conditional on the hypothesis times the (prior) probability of the degree to which A resembles B. Sample sizes and prior the hypothesis, all divided by the probability of the data. odds, both of which are highly relevant to likelihood, do Although mathematically trivial, this is of central impor- not impinge on how representative an observation appears tance in the context of so-called Bayesian inference, which and thus tend to be relatively neglected. In general, the underlies theories of belief updating and is considered by notion that people focus on the strength of the evidence many to be a normative requirement of probabilistic reason- (e.g., the warmth of a letter of reference) with insufficient ing (see BAYESIAN NETWORKS and INDUCTION). 
regard for its weight (e.g., how well the writer knows the There are at least two distinct philosophical conceptions candidate) can explain various systematic biases in proba- of probability. According to one, probabilities refer to the bilistic judgment (Griffin and Tversky 1992), including the relative frequencies of objective physical events in repeated failure to appreciate regression phenomena, and the fact trials; according to the other, probabilities are epistemic in that people are generally overconfident (when evidence is nature, expressing degrees of belief in specific hypotheses. remarkable but its weight is low), and occasionally under- While these distinctions are beyond the scope of the current confident (when the evidence is unremarkable but its reli- entry, they are related to ongoing debate concerning the sta- ability high). Probability judgments based on the support, tus and interpretation of some experimental findings (see, or strength of evidence, of focal relative to alternative e.g., Cosmides and Tooby 1996; Gigerenzer 1994, 1996; hypotheses form part of a theory of subjective probability, Kahneman and Tversky 1996). What is notable, however, is called support theory. According to support theory, which the fact that these different conceptions are arguably con- has received substantial empirical validation, unpacking a strained by the same mathematical axioms above. Adher- description of an event into disjoint components generally ence to these axioms suffices to insure that probability increases its support and, hence, its perceived likelihood. 672 Probabilistic Reasoning As a result, different descriptions of the same event can Gigerenzer, G. (1994). Why the distinction between single-event probabilities and frequencies is important for psychology (and give rise to different judgments (Rottenstreich and Tversky vice versa). In G. Wright and P. Ayton, Eds., Subjective Proba- 1997; Tversky and Koehler 1994). bility. New York: Wiley. 
Probability judgments often rely on sets of attributes— Gigerenzer, G. (1996). On narrow norms and vague heuristics: A for example, a prospective applicant’s exams scores, rele- rebuttal to Kahneman and Tversky (1996). Psychological vant experience, and letters of recommendation—which Review 103: 592–596. need to be combined into a single rating, say, likelihood of Griffin, D., and A. Tversky. (1992). The weighing of evidence and success at a job. Because people have poor insight into how the determinants of confidence. Cognitive Psychology 24: 411– much weight to assign to each attribute, they are typically 435. quite poor at combining attributes to yield a final judgment. Hammond, K. R. (1955). Probabilistic functioning and the clinical Much research has been devoted to the tension between method. Psychological Review 62: 255–262. Kahneman, D., and A. Tversky. (1996). On the reality of cognitive intuitive (“clinical”) judgment and the greater predictive illusions. Psychological Review 103: 582–591. success obtained by linear models of the human judge Meehl, P. E. (1954). Clinical Versus Statistical Prediction: A Theo- (Meehl 1954; Dawes 1979; Dawes and Corrigan 1974; retical Analysis and a Review of the Evidence. Minneapolis: Hammond 1955). In fact, it has been repeatedly shown that University of Minnesota Press. a linear combination of attributes, based, for example, on a Osherson, D. N. (1995). Probability judgment. In E .E. Smith and judge’s past probability ratings, does better in predicting D. N. Osherson, Eds., An Invitation to Cognitive Science. 2nd future (as well as previous) instances than the judge on ed. Cambridge, MA: MIT Press. whom these ratings are based. This bootstrapping method Osherson, D. N., E. Shafir, and E. E. Smith. (1994). Extracting the takes advantage of the person’s insights captured across coherent core of human probability judgment. Cognition 50: numerous ratings, and improves on any single rating where 299–313. Pearl, J. (1988). 
Probabilistic Reasoning in Intelligent Systems: less than ideal weightings of attributes may intrude. More- Networks of Plausible Inference. San Mateo, CA: Kaufmann. over, because attributes are often highly correlated and sys- Resnik, M. D. (1987). Choices: An Introduction to Decision The- tematically misperceived, a unit assignment of weights, not ory. Minneapolis: University of Minnesota Press. properly devised for the person, can often still outperform Rottenstreich, Y., and A. Tversky. (1997). Unpacking, repacking, the human judge (Dawes 1988). and anchoring: Advances in support theory. Psychological While human intuition can be a useful guide to the likeli- Review 104: 406–415. hoods of events, it often exhibits instances of incoherence, Tversky, A., and D. Kahneman. (1983). Extensional versus intui- in the sense defined above. Methods have been explored that tive reasoning: The conjunction fallacy in probability judgment. extract from a person’s judgments a coherent core that is Psychological Review 90: 293–315. maximally consistent with those judgments, and at the same Tversky, A., and D. J. Koehler. (1994). Support theory: A nonex- tensional representation of subjective probability. Psychologi- time come closer to the observed likelihoods than do the cal Review 101: 547–567. original (incoherent) judgments (Osherson, Shafir, and Yates, J. F. (1990). Judgment and Decision Making. Englewood Smith 1994; Pearl 1988). Probabilistic reasoning occurs in Cliffs, NJ: Prentice-Hall. complex situations, with numerous variables and interac- tions influencing the likelihood of events. In these situa- Further Readings tions, people’s judgments often violate basic normative rules. At the same time, people can exhibit sensitivity to and Arkes, H. R., and K. R. Hammond. (1986). Judgment and Decision appreciation for the normative principles. The coexistence Making: An Interdisciplinary Reader. 
Cambridge: Cambridge of fallible intuitions along with an underlying appreciation University Press. Goldstein, W. M., and R. M. Hogarth. (1997). Research on Judg- for normative judgment yield a subtle picture of probabilis- ment and Decision Making: Currents, Connections and Contro- tic reasoning, and interesting possibilities for a prescriptive versies. Cambridge: Cambridge University Press. approach. In this vein, a large literature on expert systems Hacking, I. (1975). The Emergence of Probability. Cambridge: has provided analyses and applications. Cambridge University Press. See also CAUSAL REASONING; DEDUCTIVE REASONING; Heath, C., and A. Tversky. (1990). Preference and belief: Ambigu- EVOLUTIONARY PSYCHOLOGY; PROBABILITY, FOUNDATIONS ity and competence in choice under uncertainty. Journal of Risk OF; TVERSKY and Uncertainty 4 (1): 5–28. Howson, C., and P. Urbach. (1989). Scientific Reasoning: The —Eldar Shafir Bayesian Approach. La Salle, IL: Open Court Publishers. Kahneman, D., P. Slovic, and A. Tversky, Eds. (1982). Judgment under Uncertainty: Heuristics and Biases. New York: Cam- References bridge University Press. Cosmides, L., and J. Tooby. (1996). Are humans good intuitive Shafer, G., and J. Pearl, Eds. (1990). Readings in Uncertain Rea- statisticians after all? Rethinking some conclusions from the lit- soning. San Mateo, CA: Kaufmann. erature on judgment under uncertainty. Cognition 58: 1–73. Skyrms, B. (1975). Choice and Chance 2nd ed. Belmont, CA: Dawes, R. M. (1979). The robust beauty of improper linear models Dickensen. in decision making. American Psychologist 34: 571–582. von Winterfeld, D., and W. Edwards. (1986). Decision Analysis Dawes, R. M., (1988). Rational Choice in an Uncertain World. and Behavioral Research. Cambridge: Cambridge University New York: Harcourt Brace Jovanovich. Press. Dawes, R. M., and B. Corrigan. (1974). Linear models in decision Wu, G., and R. Gonzalez. (1996). Curvature of the probability making. 
Psychological Bulletin 81: 97–106. weighting function. Management Science 42 (12): 1676–1690. Probability, Foundations of 673 Several important consequences follow from this “condi- Probability, Foundations of tioned degree of belief” interpretation of probability. One consequence is that two different probability assertions, According to one widely held interpretation of probability, such as the one above, and the numerical probability assigned to a proposition given P(impulse-within-.1-seconds | 70 out of 100 previous particular evidence is a measure of belief in that proposition within .1 sec. & dead(cell)) = 0 given the evidence. For example, the statement “The proba- are not contradictory because the probabilities involve dif- bility that the next nerve impulse will occur within .1 sec- ferent conditioning evidence, although the latter assertion is onds of the previous impulse is .7, given that 70 of the last a better predictor if its conditioning evidence is correct. Dif- 100 impulses occured within .1 seconds” is an assertion of ferent entities can condition on different evidence, and so degree of belief about a future event given evidence of pre- can assign different probability values to the same proposi- vious similar events. In symbols this would be written: tion. This means that there is a degree of subjectivity in P(impulse-within-.1-seconds | 70 out of 100 previous within evaluating probabilities, because different subjects generally .1 sec.) = .7, have different experience. However, when different subjects agree on the evidence, they typically give the same probabil- where the “|” symbol (called the “givens”) separates the tar- ities. That is, a degree of objectivity with probabilities can get proposition (on the left) from the evidence used to sup- be achieved through intersubjective agreement. Another port it (on the right), the conditioning evidence. 
It is now consequence of the degree of belief interpretation of proba- widely accepted that unconditioned probability statements bility is that the conditioning propositions do not have to be are either meaningless or a shorthand for cases where the true—a probability assertion gives the numeric probability conditioning information is understood. assuming the conditioning information is true. This means Probability can be viewed as a generalization of classical that probability theory can be used for hypothetical reason- propositional LOGIC that is useful when the truth of particu- ing, such as “If I drop this glass, it will probably break.” lar propositions is uncertain. This generalization has two Another consequence of the degree of belief interpretation important components: one is the association of a numerical of probability is that there is no such thing as the probability degree of belief with the proposition; the other is the explicit of a proposition; the numerical degree is always dependent dependence of that degree of belief on the evidence used to on the conditioning evidence. This means that a statement assess it. Writers such as E. T. Cox (1989) and E. T. Jaynes such as “the probability of a coin landing tails is 1/2,” is (1998) have derived standard probability theory from this meaningless—some experts are able to flip coins on request generalization and the requirement that probability should so that they land heads or tails. The existence of such experts agree with standard propositional logic when the degree of refutes the view that the probability of 1/2 for tails is a physi- belief is 1 (true beyond doubt) or 0 (false beyond doubt). cal property of the coin, just like its mass, as has been This means that any entity that assigns degrees of belief to asserted by some writers. 
The reason many users of probabil- uncertain propositions, given evidence, must use probability ity talk about “the” probability of tails for a coin or any other to do so, or be subject to inconsistencies, such as assigning event is that the event occurence is assumed to be under “typ- different degrees of belief to the same proposition given the ical” or “normal” conditions. In a “normal” coin toss, we same evidence depending on the order that the evidence is expect tails with a probability of 1/2, but this probability is evaluated. The resulting laws of probability are: conditioned on the normality of the toss and the coin, and has 1. 0 ≤ P(A|I) ≤ 1 no absolute status relative to any other conditional probabil- 2. P(A|I) + P(not A|I) = 1 (Probabilistic Law of excluded ity about coins. Probabilities that are implicitly conditioned middle) by “normal” operation are called by some “propensities,” and 3. P(A or B|I) = P(A|I) + P(B|I) - P(A & B|I) assumed to be an intrinsic property of a proposition. For 4. P(A & B|I) = P(A|I)*P(B|A & I) (Product Law) example, an angry patient has a propensity to throw things, All other probability laws, such as Bayes’s theorem, can be even if that patient is physically restrained. However, this constructed from the above laws. statement is just a shorthand for the probability of the patient This derivation of probability provides a neat resolution to behave in a particular way, conditioned on being angry and to the old philosophical problem of justifying INDUCTION. In unconstrained—“normal” behavior for an angry person. the eighteenth century, the philosopher David HUME dis- Although there is universal agreement on the fundamen- proved the assumption that all “truth” could be established tal laws of probability, there is much disagreement on inter- deductively, as in mathematics, by showing that proposi- pretation. 
The two main interpretations are the “degree of tions such as “The sun will rise tomorrow” can never be belief ” (subjective) interpretation and the “long run fre- known with certainty, no matter how may times the sun has quency” (frequentist or objective) interpretation. In the fre- risen in the past. That is, generalizations induced from par- quentist interpretation, it is meaningless to assign a ticular evidence are always subject to possible refutation by probability to a particular proposition, such as “This new further evidence. However, an entity using probability will type of rocket will launch successfully on the first try,” assign such a high probability to the sun rising tomorrow, because there are no previous examples on which to base a given the evidence, that it is rational for it to make decisions relative frequency. The degree of belief interpretation can based on this belief, even in the absence of complete cer- assign a probability in this case by using evidence such as tainty. the previous history of other rocket launches, the complexity 674 Problem Solving of the new rocket, the failure rate of machinery of compara- See also BAYESIAN LEARNING; BAYESIAN NETWORKS; ble complexity, knowing who built it, and so on. When suffi- PROBABILISTIC REASONING; RATIONAL CHOICE THEORY; RA- cient frequency evidence is available, and this is the best TIONAL DECISION MAKING; STATISTICAL LEARNING THEORY evidence, then the frequentist and the subjectivist will give —Peter Cheeseman essentially the same probability. In other words, when the observed frequency of similar events is used as the condi- References tioning information, both interpretations agree, but the degree of belief interpretation gives reasonable answers even Bayes, T. (1763). An essay towards solving a problem in the doc- when there is insufficient frequency information. trine of chances. 
Philosophical Transactions of the Royal Soci- The main form of probabilistic inference in the degree of ety of London 53: pp. 370–418. Berger, J. O. (1985). Statistical Decision Theory and Bayesian belief interpretation is to use Bayes’s theorem to go from a Analysis. 2nd ed. Springer. prior probability (or just “prior”) on a proposition to a pos- Cox, E. T. (1989). Bayesian Statistics: Principles, Models, and terior probability conditioned on the new evidence. For this Applications. S. James Press, Wiley. reason, the degree of belief interpretation is referred to as Jaynes, E. T. (1989). Where do we stand on maximum entropy. In Bayesian inference. It dates back to its publication in 1763 R. Rosenkrantz, Ed., E. T. Jaynes: Papers on Probability, Sta- in a posthumous paper by the Rev. Thomas Bayes. However, tistics and Statistical Physics. Dordrecht: Kluwer. before any specific evidence has been incorporated in Baye- Jaynes, E. T. (1998). “Probability Theory: The Logic of Science.” sian inference, a prior probability distribution must be given Not yet published book available at http://omega.albany.edu: over the propositions of interest. In 1812, Pierre Simon 8008/JaynesBook.html. Laplace proposed using the “principle of indifference” to Laplace, P. S. (1812). Théorie Analytique des Porbabiletés. Paris: Courcier. assign these initial (prior) probabilities. This principle gives equal probability to each possibility. When there are con- straints on this set of possibilities, the “principle of maxi- Problem Solving mum entropy” (Jaynes 1989) must be used as the appropriate generalization of the principle of indifference. Solving a problem is transforming a given situation into a Even here, subjectivity is apparent, as different observers desired situation or goal (Hayes 1989). 
Problem solving may perceive different sets of possibilities, and so assign may occur inside a mind, inside a computer, in some combi- different prior probabilities using the principle of indiffer- nation of both, or in interaction with an external environ- ence. For example, a colorblind observer may see only ment. An example of the first would be generating an small and large flowers in a field, and so assign a prior prob- English sentence; of the last, driving to the airport. If a ability of 1/2 to the small size possibility; but another detailed strategy is already known for reaching the goal, no observer sees that the large flowers have two distinct colors, problem solving is required. A strategy may be generated in and so assigns a prior probability of 1/3 to the small size advance of any action (planning) or while seeking the goal. possibility because this is now one of three possibilities. Studying how humans solve problems belongs to cogni- There is no inconsistency here, as the different observers tive psychology. How computers solve problems belongs to have different information. As specific flower data is col- artificial intelligence. ROBOTICS studies how computers lected, a better estimate of the small flower probability can solve problems requiring interaction with the environment. be obtained by calculating the posterior probability condi- As similar processes are often used, there is frequent tioned on the flower data. If there is a large flower sample, exchange of ideas among these three subdisciplines. the posterior probabilities for both observers converges to To solve a problem, a representation must be generated, the same value. In other words, data will quickly “swamp” or a preexisting representation accessed. 
A representation weak prior probabilities such as those based on the principle includes (1) a description of the given situation, (2) opera- of indifference, which is why different priors are typically tors or actions for changing the situation, and (3) tests to not important in practice. determine whether the goal has been achieved. Applying The main reason for the vehement disagreement between operators creates new situations, and potential applications the frequentist (or classical statistics) interpretation and the of all permissible operators define a branching tree of degree of belief (or Bayesian) interpretation is the perjora- achievable situations, the problem space. Problem solving tive label “subjective” associated with the Bayesian amounts to searching through the problem space for a situa- approach, particularly in the assignment of prior probabili- tion that satisfies the tests for a solution (VanLehn 1989). ties. This dispute is largely academic, as in practice, domain In computer programs and (as cumulating evidence indi- knowledge usually suggests appropriate priors. Because pri- cates) in people, operators usually take the form of condi- ors are inherently subjective does not mean that they are tion-action rules (productions). When the system notices arbitrary, as they are based on the subject’s experience. that the conditions of a production are satisfied, it takes the Recently, writers such as J. O. Berger (1985) have shown corresponding action of accessing information in memory, that the “objective” frequentist interpretation is just as sub- modifying information, or acting on the environment (New- jective as the Bayesian. In other words, the attempt to cir- ell and Simon 1972). 
cumscribe the definition of probability to “objective” long- In most problems of interest, the problem space is very run frequencies not only greatly reduced its applicability but large (in CHESS, it contains perhaps 1020 states; in real life, did not succeed in eliminating the inherent subjectivity in many more). Even the fastest computers cannot search such reasoning under UNCERTAINTY. Problem Solving 675 spaces exhaustively, and humans require several seconds to ceeds largely by recognition of symptoms, reinforced by examine each new state. Hence, search must be highly inference processes. Most of what is usually called “intu- selective, using heuristic rules to select a few promising ition,” and even “inspiration,” can be accounted for by rec- states for consideration. Chess grandmasters seldom search ognition processes. more than 100 states before making a move; the most pow- Testing theories of problem solving, usually expressed as erful chessplaying computer programs search tens of bil- computer programs that simulate the processes, requires lions, still a minuscule fraction of the total. In such tasks, observing both outcomes and the steps along the way. At expert computer programs trade off computer speed for present, the most fine-grained techniques for observation of some human selectivity (De Groot 1965). the processes are verbal (thinking aloud) or video protocols, The heuristics that guide search derive from properties of and eye movement records (Ericsson and Simon 1993). the task (e.g., in chess, “Ignore moves that lose pieces with- Studies of brain damage and MAGNETIC RESONANCE IMAG- out compensation”). 
If a domain has strong mathematical structure (e.g., is describable as a linear programming problem), strategies may exist that always find an optimal solution in acceptable computation time. In less well structured domains (including most real-life situations) the heuristics follow plausible paths that often find satisfactory (not necessarily optimal) solutions with modest computation but without guarantees of success. In puzzles, the problem space may be small, but misleadingly constructed so that plausible heuristics avoid the solution path. For example, essential intermediate moves may increase the distance from the goal, whereas heuristics usually favor moves that decrease the distance.

An important heuristic is means-ends analysis (Newell and Simon 1972). Differences between the current situation and the goal situation are detected, and an operator selected that usually removes one of these differences. After application of the operator eliminates or decreases the difference, the process is repeated. If the selected operator is not applicable to the current situation, a subgoal is established of applying the operator. Means-ends analysis is powerful and general, but is not useful in all problems: for example, it cannot solve the Rubik’s Cube puzzle in any obvious way.

Problems are called well structured if the situations, operators, and goal tests are all sharply defined; ill structured, to the extent that they are vaguely defined. Blending petroleum in a refinery, using linear programming, is a well-structured problem. Playing chess is less well structured, as it requires fallible heuristics to seek and evaluate moves. Designing buildings and writing novels are highly ill-structured tasks. The tests of success are complex and ill defined, and are often elaborated during the solution process (Akin 1986). The alternative operations for synthesizing a design are innumerable, and may be discovered only by inspecting a partially completed product. As optimization is impossible and several satisfactory solutions may be encountered, the order in which alternatives are synthesized strongly affects the final product. Starting with the floor plan produces a different house than starting with the facade.

The study of problem solving has led to considerable understanding of the nature of EXPERTISE (human and automated). The expert depends on two main processes: HEURISTIC SEARCH of problem spaces, and recognition of cues in the situation that access relevant knowledge and suggest heuristics for the next step. In domains that have been studied, experts store tens of thousands or hundreds of thousands of “chunks” of information in memory, accessible when relevant cues are recognized (Chi, Glaser, and Farr 1988). Medical diagnosis (by physicians or computers) pro-

ING (MRI) of the brain are beginning to provide some sequential information, but most current understanding of problem solving is at the level of symbolic processes, not neural events.

Research has progressed from well-structured problems calling for little specific domain knowledge to ill-structured problems requiring extensive knowledge. As basic understanding of problem solving in one domain is achieved, research moves to domains of more complex structure. There has also been considerable movement “downward” from theories, mainly serial, of symbolic structures that model processes over intervals of a fraction of a second or longer, to theories, mainly parallel, that use connectionist networks to represent events at the level of neural circuits. Finally, there has been movement toward linking problem solving with other cognitive activities through unified models of cognition (Newell 1990).

Recent explorations extend to such ill-structured domains as scientific discovery (e.g., Langley et al. 1987) and architectural design (Akin 1986). In modeling scientific discovery and similarly complex phenomena, multiple problem spaces are required (spaces of alternative hypotheses, alternative empirical findings, etc.). We are learning how experimental findings induce changes in representation, and how such changes suggest new experiments. Allen Newell’s (1990) Soar system operates in such multiple problem spaces. Kim, Lerch, and Simon (1995) have explored the generation of problem representations, and there is increasing attention to nonpropositional inference (e.g., using mental IMAGERY in search; Glasgow, Narayanan, and Chandrasekaran 1995). Several models now can represent information in both propositional (verbal or mathematical) and diagrammatic forms, and can use words and mental images conjointly to reason about problems.

Another important area of recent research activity explains intuition and insight in terms of the recognition processes of the large memory structures found in knowledge-rich domains (Kaplan and Simon 1990; Simon 1995). Yet another area, robotics, shows, by studying problem solving in real-world environments (Shen 1994), how the problem-solver’s incomplete and inaccurate models of changing situations are revised and updated by sensory feedback from the environment.

Finally, interest in knowledge-rich domains has called attention to the essential ties between problem solving and LEARNING (Kieras and Bovair 1986). Both connectionist learning systems and serial symbolic systems have shown considerable success in accounting for concept learning (CATEGORIZATION), a key to the recognition capabilities of knowledge-rich systems.
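As an illustration of the means-ends analysis loop described in this entry, the following is a minimal Python sketch; it is not the General Problem Solver itself. States and goals are modeled as sets of facts, and the `Operator` class, the `means_ends` function, and the commuting example (`withdraw_cash`, `repair_car`, `drive_to_work`) are hypothetical names chosen only for illustration. The sketch detects the differences between the current state and the goal, selects an operator that removes one of them, and, when that operator is not yet applicable, sets up the subgoal of satisfying its preconditions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    """A hypothetical operator: applicable when `preconditions` hold in the state."""
    name: str
    preconditions: frozenset
    adds: frozenset
    deletes: frozenset = frozenset()


def means_ends(state, goal, operators, depth=8, max_steps=20):
    """Return (final_state, plan) if every fact in `goal` can be achieved, else None."""
    if depth < 0:                      # guard against endless subgoaling
        return None
    plan = []
    for _ in range(max_steps):
        differences = goal - state     # detect differences between state and goal
        if not differences:
            return state, plan
        target = min(differences)      # work on one difference at a time
        for op in operators:
            if target not in op.adds:  # operator does not remove this difference
                continue
            # Subgoal: reach a state in which the operator is applicable.
            sub = means_ends(state, op.preconditions, operators, depth - 1, max_steps)
            if sub is None:
                continue
            state, subplan = sub
            state = (state - op.deletes) | op.adds   # apply the operator
            plan.extend(subplan + [op.name])
            break
        else:
            return None                # no operator reduces the chosen difference
    return None


# Assumed toy domain: getting to work when the car needs a paid repair.
OPERATORS = [
    Operator("drive_to_work", frozenset({"car_works", "at_home"}),
             frozenset({"at_work"}), frozenset({"at_home"})),
    Operator("repair_car", frozenset({"have_money"}),
             frozenset({"car_works"}), frozenset({"have_money"})),
    Operator("withdraw_cash", frozenset({"have_card"}), frozenset({"have_money"})),
]

final_state, plan = means_ends(frozenset({"at_home", "have_card"}),
                               frozenset({"at_work"}), OPERATORS)
print(plan)  # ['withdraw_cash', 'repair_car', 'drive_to_work']
```

Operators are tried in list order and applied steps are never undone, so the sketch shares the limitation noted above for heuristics in general: it offers no guarantee of success and simply gives up when no listed operator reduces the chosen difference.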
Self-modifying systems of condition-action rules (adaptive production systems) have been shown capable, under classroom conditions, of accounting for students’ success in learning such subjects as algebra and physics from worked-out examples (Zhu et al. 1996).

See also DECISION MAKING; KNOWLEDGE REPRESENTATION; PRODUCTION SYSTEMS; NEWELL, ALLEN

—Herbert A. Simon

References

Akin, O. (1986). Psychology of Architectural Design. London: Pion.
Chi, M. T. H., R. Glaser, and M. Farr. (1988). The Nature of Expertise. Hillsdale, NJ: Erlbaum.
De Groot, A. (1965). Thought and Choice in Chess. The Hague: Mouton.
Ericsson, K. A., and H. A. Simon. (1993). Protocol Analysis. Rev. ed. Cambridge, MA: MIT Press.
Glasgow, J., N. H. Narayanan, and B. Chandrasekaran, Eds. (1995). Diagrammatic Reasoning: Computational and Cognitive Perspectives. Menlo Park, CA: AAAI/MIT Press, pp. 403–434.
Hayes, J. R. (1989). The Complete Problem Solver. 2nd ed. Hillsdale, NJ: Erlbaum.
Kaplan, C., and H. A. Simon. (1990). In search of insight. Cognitive Psychology 22: 374–419.
Kieras, D. E., and S. Bovair. (1986). The acquisition of procedures from text. Journal of Memory and Language 25: 507–524.
Kim, J., J. Lerch, and H. A. Simon. (1995). Internal representation and rule development in object-oriented design. ACM Transactions on Computer-Human Interaction 2 (4): 357–390.
Langley, P., H. A. Simon, G. L. Bradshaw, and J. M. Zytkow. (1987). Scientific Discovery: Computational Explorations of the Creative Processes. Cambridge, MA: MIT Press.
Newell, A. (1990). Unified Theories of Cognition. Cambridge, MA: Harvard University Press.
Newell, A., and H. A. Simon. (1972). Human Problem Solving. Englewood Cliffs, NJ: Prentice-Hall.
Shen, W. (1994). Autonomous Learning from the Environment. New York: W. H. Freeman.
Simon, H. A. (1995). Explaining the ineffable: AI on the topics of intuition, insight and inspiration. Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence 1: 939–948.
VanLehn, K. (1989). Problem solving and cognitive skill acquisition. In M. I. Posner, Ed., Foundations of Cognitive Science. Cambridge, MA: MIT Press.
Zhu, X., Y. Lee, H. A. Simon, and D. Zhu. (1996). Cue recognition and cue elaboration in learning from examples. Proceedings of the National Academy of Sciences 93: 1346–1351.

Further Readings

Larkin, J. H. (1983). The role of problem representation in physics. In D. Gentner and A. Collins, Eds., Mental Models. Hillsdale, NJ: Erlbaum.
McCorduck, P. (1979). Machines Who Think. San Francisco: Freeman.
Polya, G. (1957). How to Solve It. Garden City, NY: Doubleday-Anchor.
Simon, H. A. (1996). The Sciences of the Artificial. 3rd ed. Cambridge, MA: MIT Press.
Wertheimer, M. (1945). Productive Thinking. New York: Harper and Row.

Procedural Semantics

See FUNCTIONAL ROLE SEMANTICS

Production Systems

Production systems are computer languages that are widely employed for representing the processes that operate in models of cognitive systems (NEWELL and Simon 1972). In a production system, all of the instructions (called productions) take the form:

IF <conditions>, THEN <actions>

That is to say, “if certain conditions are satisfied, then take the specified actions” (abbreviated C → A). Production system languages have great generality: they can possess the full power and generality of a Turing machine (see TURING). They have an obvious affinity to the classical stimulus-response (S → R) connections in psychology, but greater complexity and flexibility, for, in production systems, both the conditions and the actions may, and generally do, contain variables that are instantiated to the appropriate values in each separate application.

The conditions of a production are propositions that state properties of, or relations among, the components of the system being modeled, in its current state. In implementing production systems the conditions are usually stored in a WORKING MEMORY, which may represent short-term memory or current sensory information, or an activated portion of semantic memory (Anderson 1993). To activate a production, all of the conditions specified in its “IF” clause must be satisfied by one or more elements in working memory. The actions that are then initiated may include actions on the system’s environment (e.g., motor actions) or actions that alter its memories, including erasing and creating working memory elements.

The operation of a production system can be illustrated by a simple algebraic example that solves linear equations in one unknown:

1. IF the expression has the form X = N, where N is a number, THEN halt and check by substituting N in the original equation.
2. IF there is a term in X on the right-hand side, THEN subtract it from both sides, and collect terms.
3. IF there is a numerical term on the left-hand side, THEN subtract it from both sides, and collect terms.
4. IF the equation has the form NX = M, N ≠ 1, THEN divide both sides by N.

If, for instance, the equation were 7X + 6 = 4X + 12, the condition of the second production would be satisfied, and 4X would be subtracted from both sides, yielding 3X + 6 = 12. Now the condition of the third production is satisfied, and 6 is subtracted from both sides, yielding 3X = 6. Next, the condition of the fourth production is satisfied and both sides are divided by 3, yielding X = 2. Finally, the condition of the first production is satisfied, and substituting 2 for X in the original equation and simplifying gives 14 + 6 = 8 + 12, or 20 = 20.

Notice that at the outset, the conditions of both productions 2 and 3 were satisfied. A production system must contain precedence rules that select which production will be executed when the conditions of more than one are satisfied (Brownston et al. 1985). One way in which the set of productions that are executable at any time can be limited is by including goals among their conditions. A goal is simply a symbol that must be present in working memory in order for a production containing that goal among its conditions to execute. Goals are set (i.e., goal symbols are placed in working memory) as part of the actions of other productions. Goals establish contexts so that only productions relevant to the current context will be executed. For example, we could add to the condition side of each production in the system for algebra described above “IF the goal is to solve an equation &.” Then, even if the other conditions of these productions were satisfied, if that goal symbol were not in working memory, the productions would not execute.

Production systems were first invented by the logician Emil Post (1943) to provide a simple, clean language for the investigation of questions in the foundations of LOGIC and mathematics. They were borrowed, sometimes under the label of “Markov productions,” for use in computer systems programming (languages for compiling compilers). In about the mid-1960s, they were introduced into cognitive science at Carnegie Mellon University, some of their early uses being in the General Problem Solver (GPS; Newell, Shaw, and Simon 1960), and in Tom Williams’ thesis (Williams 1972) on a general game-playing program (see GAME-PLAYING SYSTEMS). They also found early use as languages for FORMAL GRAMMARS of natural language. Among production systems widely used in cognitive simulation are OPS5 (Brownston et al. 1985; Cooper and Wogrin 1988), Prolog (Clocksin and Mellish 1994), and ACT-R (Anderson 1993).

Adaptive production systems are production systems that contain a learning component that is capable of modifying productions already in the system and of creating new productions that can be added to the system (Neves 1978; see LEARNING SYSTEMS). Neves showed how this could be done for algebra by the method of learning from worked-out examples. Consider our earlier example:

7X + 6 = 4X + 12
3X + 6 = 12
3X = 6
X = 2

Assume that the adaptive production system had learned previously that the allowable actions include adding or subtracting the same numbers from both sides of an equation, or multiplying or dividing both sides by the same number, and simplifying by combining similar terms, but did not know when these actions should be applied to solve a problem. It would now examine the first two lines above, discovering that the unwanted 4X (“unwanted” because there is no such term in the last line) had been removed from the right-hand side, and that the action was to subtract this term. In the same way it would find that the condition “unwanted numerical term on left-hand side” characterized the second change, and “unwanted numerical coefficient of X,” the third. It would create three new productions (the ones we have already seen above) and add them to its production system. Thenceforth, it would be capable of solving equations of this general kind.

This method, widely applied in adaptive production systems, of noting and eliminating from an equation expressions that are absent from the final result, is essentially the method of means-ends analysis incorporated in the General Problem Solver (Newell and Simon 1972; see also HEURISTIC SEARCH and PROBLEM SOLVING). The key steps in means-ends analysis are (1) to detect differences between the current and goal situations and (2) to find actions capable of removing these differences. The set of productions shown earlier is simply the “Table of Connections” between differences and operations in the GPS system.

Adaptive production systems, using the method of worked-out examples, today provide the theoretical foundation for a number of computer tutoring systems and mathematics textbooks and workbooks employed successfully in the United States and China (Anderson et al. 1995; Zhu et al. 1996).

See also COGNITIVE ARCHITECTURE; COGNITIVE MODELING, SYMBOLIC; CHURCH-TURING THESIS

—Herbert A. Simon

References

Anderson, J. R. (1993). Rules of the Mind. Hillsdale, NJ: Erlbaum.
Anderson, J. R., A. T. Corbett, K. R. Koedinger, and R. Pelletier. (1995). Cognitive tutors: Lessons learned. Journal of the Learning Sciences 4: 167–207.
Brownston, L., R. Ferrell, E. Kant, and N. Martin. (1985). Programming Expert Systems in OPS5: An Introduction to Rule-based Programming. Reading, MA: Addison-Wesley.
Clocksin, W. F., and C. S. Mellish. (1994). Programming in Prolog. 4th ed. New York: Springer.
Cooper, T., and N. Wogrin. (1988). Rule-Based Programming with OPS5. San Mateo, CA: Kaufmann.
Neves, D. M. (1978). A computer program that learns algebraic procedures by examining examples and working problems in a textbook. Proceedings of the Second Conference of Computational Studies of Intelligence. Toronto: Canadian Society for Computational Studies of Intelligence, pp. 191–195.
Newell, A., and H. A. Simon. (1972). Human Problem Solving. Englewood Cliffs, NJ: Prentice-Hall.
Newell, A., J. C. Shaw, and H. A. Simon. (1960). Report on a general problem solving program for a computer. Proceedings of the International Conference on Information Processing. Paris: UNESCO, pp. 256–264.
Post, E. L. (1943). Formal reductions of the general combinatorial decision problem. American Journal of Mathematics 65: 197–215.
Williams, T. (1972). Some studies in game playing with a digital computer. In H. A. Simon and L. Siklossy, Eds., Representation and Meaning: Experiments with Information Processing Systems. Englewood Cliffs, NJ: Prentice-Hall.
Zhu, X., Y. Lee, H. A. Simon, and D. Zhu. (1996). Cue recognition and cue elaboration in learning from examples. Proceedings of the National Academy of Sciences 93: 1346–1351.

Further Readings

Langley, P., H. A. Simon, G. L. Bradshaw, and J. M. Zytkow. (1987). Scientific Discovery: Computational Explorations of the Creative Processes. Cambridge, MA: MIT Press.
Newell, A. (1973). Production systems: Models of control structures. In W. G. Chase, Ed., Visual Information Processing. New York: Academic Press.
Newell, A. (1990). Unified Theories of Cognition. Cambridge, MA: Harvard University Press.
Quinlan, J. R. (1993). C4.5: Programs for Machine Learning. San Francisco: Kaufmann.
Shen, W. M. (1994). Autonomous Learning from the Environment. New York: W. H. Freeman and Co.
Simon, H. A. (1996). The Sciences of the Artificial. 3rd ed. Cambridge, MA: MIT Press.
Tabachneck-Schijf, H. J. M., A. M. Leonardo, and H. A. Simon. (1997). CaMeRa: a computational model of multiple representations. Cognitive Science 21: 305–330.
Waterman, D. A., and F. Hayes-Roth, Eds. (1978). Pattern-Directed Inference Systems. New York: Academic Press.

tions of the representation. A third alternative is to take propositional contents—the referents of that-clauses—to be the truth conditions themselves. In this account, the proposition that Booth killed Lincoln can be identified with the set of possible circumstances in which Booth killed Lincoln. This conception of a proposition—call it an informational content—is the most coarse-grained conception of representational content. Any structured proposition will determine a unique informational content, but different structured propositions may have the same informational content.

The choice between different accounts of content depends on the role of content in determining the propositional attitudes—the way in which the properties such as believing that Booth killed Lincoln are determined as a function of the content that Booth killed Lincoln.
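The relation between structured propositions and informational content can be made concrete with a toy model. The following Python sketch rests on strong simplifying assumptions that are not part of the entry: a possible circumstance is modeled as the set of atomic facts true in it, a structured proposition as a nested tuple, and the second atomic fact and all function names are hypothetical. It shows that each structured proposition determines one informational content, while two distinct structured propositions can determine the same one.

```python
from itertools import combinations

# Hypothetical atomic facts; each is a (relation, individual, individual) tuple.
BOOTH = ("kill", "Booth", "Lincoln")
BRUTUS = ("kill", "Brutus", "Caesar")
ATOMS = (BOOTH, BRUTUS)

# A possible circumstance ("world") is modeled as the set of atomic facts true in it.
WORLDS = [frozenset(c) for r in range(len(ATOMS) + 1)
          for c in combinations(ATOMS, r)]


def true_in(prop, world):
    """Evaluate a structured proposition (a nested tuple) in one world."""
    head = prop[0]
    if head == "not":
        return not true_in(prop[1], world)
    if head == "and":
        return true_in(prop[1], world) and true_in(prop[2], world)
    if head == "or":
        return true_in(prop[1], world) or true_in(prop[2], world)
    return prop in world               # atomic case


def informational_content(prop):
    """The set of possible circumstances in which the proposition is true."""
    return frozenset(w for w in WORLDS if true_in(prop, w))


double_negation = ("not", ("not", BOOTH))
print(BOOTH == double_negation)                                                # False
print(informational_content(BOOTH) == informational_content(double_negation))  # True
```

In this model the structured objects differ while their informational contents coincide, which is the sense in which informational content is the more coarse-grained notion.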
Propositional Attitudes

Propositional attitudes are mental states with representational content. Belief is the most prominent example of a propositional attitude. Others include intention, wishing and wanting, hope and fear, seeming and appearing, and tacit presupposition. Verbs of propositional attitude express a relation between an agent and some kind of abstract object—the content of the attitude, the object that is denoted by a nominalized sentence. So a statement such as “Fred believes that fleas have wings” says that Fred stands in the believes relation to that fleas have wings. The predicate “believes that fleas have wings” expresses a property that is ascribed to Fred. A philosophical account of propositional attitudes must answer two interrelated kinds of questions: first, what kind of thing is the content of an attitude (what is the object denoted by the that-clause)? Second, how can the states of mind of agents relate them to such objects? The problem of explaining how mental states can have representational content is the problem of INTENTIONALITY.

A propositional attitude—more generally, any state or act that can be said to be representational—represents the world as being a certain way, and the content of the attitude is what determines the way the world is represented (see MENTAL REPRESENTATION). So propositions must be objects that have truth conditions that must be satisfied for a representational state with that content to correctly represent the world.

Many different accounts of the contents of propositional attitudes have been proposed. Often, propositions are assumed to be complex objects, ordered sequences with ordered sequences as parts that reflect the recursive semantic structure of the sentences that express the proposition. In one kind of account of structured propositions, the primitive constituents are taken to be Fregean senses, modes of presentation (see SENSE AND REFERENCE), or CONCEPTS. In an alternative account of structured propositions, the constituents are individuals, properties, and relations. So, for example, the proposition that Booth killed Lincoln might contain, as constituents, the senses of the names “Booth” and “Lincoln” and of the verb “kill,” or alternatively, the men Booth and Lincoln and the killing relation.

Both of these kinds of account treat the content of propositional attitudes as a recipe for determining the truth condi-

The more fine-grained conceptions of content will be justifiable only if the distinctions between different fine-grained contents with the same informational content play a role in distinguishing different states. We also need a conception of content that can account for intuitive judgments about attitudes. If it is intuitively obvious that believing that P is different from believing that Q, then we need a notion of content according to which “that P” and “that Q” denote different propositions. The defender of the coarse-grained conception of content needs to reconcile this account with apparently conflicting intuitions (see LOGICAL OMNISCIENCE).

The problem of propositional attitudes is often discussed in the context of a problem of semantics: the problem of giving the compositional semantics for propositional attitude reports. The focus of attention in such discussions has been on the role of pronouns and other context dependent expressions, and of quantification in belief contexts. The problems are closely connected: obviously, the question, “what kind of object is the content of a belief?” cannot be answered independently of the question, “what kind of object is the referent of a that-clause?” But the question about the semantics of attitude reports needs to be distinguished from the philosophical problem of explaining what propositional attitude properties are, and what it is that gives them their content. There are a number of alternative strategies that have been developed for giving such explanations.

First, in one familiar kind of account of propositional attitudes, belief and desire are correlative dispositions that are displayed in rational action (see RATIONAL AGENCY and INTENTIONAL STANCE). Roughly, to believe that P is to be disposed to act in ways that would satisfy one’s desires if P (along with one’s other beliefs) were true, and to desire that P is to be disposed to act in ways that would tend to bring it about in situations in which one’s beliefs were true. This is only a very rough sketch of a strategy of locating belief and desire in a general theory of action. Because belief and desire are explained in terms of each other, the strategy does not offer the promise of a reductive analysis of propositional attitudes, and must be supplemented with additional constraints if the contents of attitudes are not to be wholly indeterminate.

A second strategy that may supplement the first—the information theoretic strategy—is to explain representational states in terms of causal and counterfactual dependencies between the agent and the world (see INFORMATIONAL SEMANTICS). One may explain the content of the representational states of an agent in terms of the way those states tend to vary systematically in response to states of the environment. A state of a person or thing carries the information that P if the person or thing is in the state because of the fact that P, and would not be in the state if it were not the case that P. The strategy is to explain the content of representational states in terms of the information that the states tend to carry, or the information that the states would carry if they were functioning properly, or if conditions were normal. The information theoretic strategy will give determinate content to representational states only relative to some specification of the relevant normal conditions. A central task of the development and defense of the information-theoretic strategy is to give an account of these conditions.

A third strategy—the linguistic strategy—is to begin with linguistic representation, and to explain the content of mental states in terms of the content of sentences that realize the mental states, or of sentences to which the agent is disposed to assent. One version of this strategy, defended by Jerry Fodor (1987) among others, assumes that propositional attitudes are realized by the storage (in the “belief box,” to use the popular metaphor) of sentences of a LANGUAGE OF THOUGHT. Another version takes a social practice of speech as primary, accounting for the contents of beliefs in terms of the contents of the sentences of the public language that the agent “holds true,” or to which the agent is disposed to assent. Donald Davidson (1984) has defended this kind of strategy (see RADICAL INTERPRETATION). The first version of the linguistic strategy needs a distinction between explicit or “core” beliefs and implicit beliefs, because it would not be plausible to say that everything believed is explicitly stored. The second version has a problem explaining attitudes with content that is not easily expressed in linguistic form (for example, perceptual states), and it seems to conflict with the intuition that thought without the capacity for speech is at least a possibility. Both accounts need to be supplemented with some account of what it is in virtue of which the relevant kind of linguistic representations have content.

See also COMPOSITIONALITY; FOLK PSYCHOLOGY; GRICE; INTENTIONAL STANCE; MEANING

—Robert Stalnaker

References and Further Readings

Barwise, J., and J. Perry. (1983). Situations and Attitudes. Cambridge, MA: MIT Press.
Burge, T. (1978). Individualism and the mental. Midwest Studies in Philosophy, 4, Studies in Metaphysics. Minneapolis: University of Minnesota Press.
Crimmins, M. (1992). Talk about Belief. Cambridge, MA: MIT Press.
Davidson, D. (1984). Inquiries into Truth and Interpretation. Oxford: Oxford University Press.
Dennett, D. (1987). The Intentional Stance. Cambridge, MA: MIT Press.
Dretske, F. (1988). Explaining Behavior. Cambridge, MA: MIT Press.
Field, H. (1978). Mental representation. Erkenntnis 13: 9–61.
Fodor, J. A. (1987). Psychosemantics: The Problem of Meaning in the Philosophy of Mind. Cambridge, MA: MIT Press.
Quine, W. V. (1956). Quantifiers and propositional attitudes. Journal of Philosophy 53: 177–187.
Richard, M. (1990). Propositional Attitudes. Cambridge: Cambridge University Press.
Salmon, N., and S. Soames, Eds. (1982). Propositional Attitudes. Oxford: Oxford University Press.
Stalnaker, R. (1984). Inquiry. Cambridge, MA: MIT Press.
Stich, S. (1983). From Folk Psychology to Cognitive Science: The Case Against Belief. Cambridge, MA: MIT Press.

Proprioception

See AFFORDANCES; IMITATION; SELF; SENSATIONS

Prosody and Intonation

The term prosody refers to the grouping and relative prominence of the elements making up the speech signal. One reflex of prosody is the perceived rhythm of the speech. Prosodic structure may be described formally by a hierarchical structure in which the smallest units are the internal components of the syllable and the largest is the intonation phrase. Units of intermediate scale include the syllable, the metrical foot, and the prosodic word (Selkirk 1984; Hayes 1995).

Intonation refers to phrase-level characteristics of the melody of the voice. Intonation is used by speakers to mark the pragmatic force of the information in an utterance. The alignment of the intonation contour with the words is constrained by the prosody, with intonational events falling on the most prominent elements of the prosodic structure and at the edges. As a result, intonational events can often provide information to the listener about the prosodic structure, in addition to carrying a pragmatic message. The term intonation is often used, by extension, to refer to systematic characteristics of the voice melody at larger scales, such as the discourse segment or the paragraph (Beckman and Pierrehumbert 1986; Pierrehumbert and Hirschberg 1990; Ladd 1996).

The primary phonetic correlate of intonation is the fundamental frequency of the voice (F0), which is perceived as pitch and which arises from the rate of vibration of the vocal folds. The F0 is determined by the configuration of the larynx, the subglottal pressure, and the degree of oral closure (Clark and Yallop 1990; Titze 1994). Articulatory maneuvers that change the rate of vibration of the vocal folds also affect the exact shape of the glottal waveform and hence the voice timbre (or voice quality). Perceived voice quality is probably used in perception to assist in the identification of intonation patterns (Pierrehumbert 1997). Intonation is not the only source of F0 variation. Speech segments also have systematic effects on F0. However, the largest segmental effects are on the time-frequency scale of the smaller intonational effects. Thus, F0 contours can be roughly viewed as a superposition of segmental factors on the intonationally determined contour.

Many experimental studies show that prosody affects all aspects of the speech signal (see Papers in Laboratory Phonology and references cited there). In general, elements found in prosodically prominent positions are more forcefully and fully articulated than elements in prosodically weak positions. The space of acoustic contrasts is therefore expanded in strong positions compared to that in weak positions. Edges of prosodic units also affect phonetic outcomes. Consonantal articulations tend to be strengthened at initial edges of prosodic words and intonation phrases. Final syllables of words and intonation phrases are regularly lengthened. An extensive literature on isochrony addresses the possibility that speech has a steady beat with a constant interval between the stresses. This literature has established that interstress intervals in fact vary widely as a function of the material comprising the interval. However, when the principal determinants of duration are controlled for, evidence of a tendency towards isochrony is reported in some studies.

Contextual effects related to prosody are substantial and rank with speech style and speaker characteristics as sources of variation in the realization of phonemes and DISTINCTIVE FEATURES. The variation is great enough that a token of one phoneme in one prosodic position can be identical to a token of a different phoneme in some other prosodic position. For example, in American English, a phrase-final /z/ is virtually identical to a medial /s/. Similarly, a 20-story building in Evanston, Illinois, provides an example of a “tall building,” but it would be an example of a “short building” if it were in downtown Chicago. That is, the context-dependence of the phonetic realizations of phonemes is similar in character to context-dependence found in other domains, and it provides an example of the abstractness and adaptability of human cognition.

Because of intense research activity over the last two decades, the phonological theory of prosody and intonation is now well developed. It characterizes the cognitive structures that must be viewed as implicitly present in the minds of speakers in order to explain their use of prosody and intonation in both speech production and SPEECH PERCEPTION.

The central concepts of prosodic theory are the prosodic units (the syllable, the foot, the intonation phrase, etc.) and the relations defined among these units. The units are temporally ordered. Bigger units dominate smaller ones. Within each unit, a relationship of strength is available that singles out one element as more prominent than the other elements of the same type in the group. Strength is inherited through the hierarchy; the head syllable of an intonation phrase can be defined as the strongest syllable in the strongest foot in the strongest word in that phrase. Although it is generally agreed that prosodic structures are hierarchical, they contrast with syntactic structures in making much less use of recursion. In SYNTAX, we find clauses embedded within other clauses, but in PHONOLOGY, we do not find syllables embedded within other syllables. The only serious candidate for a recursive node in prosody is the prosodic word, and scholars do not agree about whether this node is recursive or not. As a consequence of the relative flatness of the prosodic structure, syntactic structures are flattened when the prosodic phrasing is computed. For example, in sentence (1), a recursive syntactic structure corresponds to a prosodic structure in which three intonation phrases are on a par with each other.

(1) This is the cat % that ate the rat % that stole the cheese.

The intonation system of English has been extensively studied. Points of agreement among many researchers in the English-speaking countries have recently been codified in the ToBI transcription standard, for which on-line training materials are available (Pitrelli, Beckman, and Hirschberg 1994; Beckman and Ayers Elam 1994/97). According to this standard, intonation contours may be “spelled” using three basic tonal elements: low tone (L), high tone (H), and downstepped high tone (!H). !H represents the combination of high tone with a compression and lowering of the pitch range; a sequence of !Hs generates an F0 contour with a descending staircase. Pitch accents, which mark prominently stressed syllables in the phrase, are made up of these elements. The nuclear accent is defined as the accent on the main stress of the entire phrase. The prenuclear accents fall on prominent syllables earlier in the phrase. Every complete utterance must have (at least one) nuclear accent, but some utterances lack prenuclear accents. In addition to the pitch accents, each contour has boundary tones which mark the edges of the intonation phrase.

All languages have prosody and intonation, but there are many important differences among the systems found in various languages. They differ in the total inventory of intonational patterns and in the pragmatic meanings assigned to particular patterns. Languages with lexical tone (whether tone languages such as Mandarin or classic pitch accent languages such as Japanese) tend to have somewhat simpler intonational systems than English, presumably because much of the F0 contour is taken up with providing phonetic expression of the tones in the words (see Pierrehumbert and Beckman 1988; Hayes and Lahiri 1991; Hayes 1995; Myers 1996).

In the prosodic domain, languages differ in the constraints they impose on the composition of the various units. At the phrasal level, they differ in how they set up the correspondence between intonational phrases and syntactic and semantic structures. Some languages tend to locate prosodic breaks after a syntactic head, whereas others tend to locate breaks before. Some languages (such as English) permit the main prominence to be located anywhere in the phrase (for the purpose of highlighting or foregrounding particular words). Other languages make little or no use of variable placement of prominence within the phrase, instead moving new information to fixed prosodically prominent positions. Turning to smaller prosodic units, some languages permit syllables with complicated consonant clusters and others do not (Goldsmith 1990). Languages also differ in foot structure (Hayes 1995) and in the salience or importance of the different prosodic units (Beckman 1995). For example, in English the foot structure conspicuously shapes the lexical inventory and greatly affects how phonemes are pronounced. Foot structure exists in Japanese but smaller units (the syllable and the mora) vary much less with position in the foot and, as a result, exhibit a robustness that they lack in English.

In considering the contribution of prosody to interpretation, it is useful to separate prosodic structure within the word from prosodic structure above the word level. Prosodic structure within the word (i.e., syllable and foot structure) is an important factor in lexical access, shaping the segmentation strategy in each language and the set of active competitors for any given word at any given time (Cutler 1995). Prosodic structure above the word level (phrasing and phrasal prominence) reflects syntax, SEMANTICS, and DISCOURSE structure. As a result, it has repercussions for syntactic parsing, for the scope of operators such as “only,” “even,” and “not,” for the understood reference of pronouns, and for the topic/comment structure of the discourse (Jackendoff 1972; Terken and Nooteboom 1987; Hirschberg and Ward 1991).

Intonation contours function as independent pragmatic morphemes. According to Pierrehumbert and Hirschberg (1990), the contour indicates the relationship of each utterance to the mutual beliefs that are developed and modified in the course of a conversation. For example, an H accent marks an intended addition to the mutual beliefs, whereas L accents mark information that is marked as salient but not to be added. The tremendous variety of understood meanings of patterns in context arises from the interplay of these factors with the goals and assumptions of the interlocutors. (For other treatments of the pragmatic meaning of intonational morphemes, see Gussenhoven 1983, Ward and Hirschberg 1985, and Morel 1995.)

Intonation and prosody are obligatory. Every single utterance has a prosodic analysis and represents some choice of intonation pattern, just as it represents some choice of phonemes and syllables. In experimental studies with aural stimuli, it is not possible to avoid or omit the contribution of

understood force of patterns as they are presented in context. With a careful eye to the discourse context, experimental work on intonational meaning is one of the more feasible and promising areas for experimental work in PRAGMATICS.

See also PHONETICS; PRESUPPOSITION; PROSODY AND INTONATION, PROCESSING ISSUES; SPOKEN WORD RECOGNITION; STRESS, LINGUISTIC; TONE

—Janet Pierrehumbert

References

Beckman, M. E. (1995). On blending and the mora. Papers in Laboratory Phonology 4: 157–167.
Beckman, M. E., and G. Ayers Elam. (1994/1997). Guide to ToBI Labelling. Electronic text and accompanying audio example files available at http://ling.ohio-state.edu/Phonetics/E_ToBI/etobi_homepage.html.
Beckman, M. E., and J. Pierrehumbert. (1986). Intonational structure in Japanese and English. Phonology Yearbook 3: 15–70.
Clark, J. E., and C. Yallop. (1990). An Introduction to Phonetics and Phonology. Oxford: Blackwell.
Cutler, A. (1995). Spoken word recognition and production. In J. Miller and P. Eimas, Eds., Speech, Language, and Communication. New York: Academic Press, pp. 97–136.
Goldsmith, J. (1990). Autosegmental and Metrical Phonology. Oxford: Blackwell.
Gussenhoven, G. (1983). On the Grammar and Semantics of Sentence Accents. Publications in the Linguistics Sciences 16. Dordrecht: Foris.
Hayes, B. (1995). Metrical Stress Theory: Principles and Case Studies. Chicago: University of Chicago Press.
Hayes, B., and A. Lahiri. (1991). Bengali intonational phonology. Natural Language and Linguistic Theory 9: 47–96.
intonation by using a monotone F0 contour. Similarly, Hirschberg, J., and G. Ward. (1991). Accent and bound anaphora. experiments on words “in isolation” are in fact using words Cognitive Linguistics 2: 101–121. which are phrase-initial, phrase-final, and under main stress Jackendoff, R. (1972). Semantic Interpretation in Generative in the phrase (if the stimuli are well formed), because lin- Grammar. Cambridge, MA: MIT Press. guistic structure requires that every utterance no matter how Ladd, D. R. (1996). Intonational Phonology. Cambridge: Cam- short be a full intonation phrase. As a result, words pro- bridge University Press. duced “in isolation” also carry a complete phrasal melody. Morel, M.-A. (1995). Valeur énonciative des variations de hauteur Results of experiments on words in isolation often show mélodique en français. French Language Studies 5: 189–202. artifacts of this prosodic positioning and fail to generalize to Cambridge: Cambridge University Press. Myers, S. (1996). Boundary tones and the phonetic implementation words in running speech, which most often constitute only a of tone in Chichewa. Studies in African Linguistics 25: 29–60. part of a full intonation phrase. Papers in Laboratory Phonology. Cambridge: Cambridge Univer- The outcomes of experiments on syntactic processing, sity Press. Vol. 1, (1990). J. Kingston and M. E. Beckman, Eds.; scope, and reference resolution in running speech are likely Vol. 2, (1992). G. Docherty and D. R. Ladd, Eds.; Vol. 3, to be affected by the phrasal prosody of the stimuli. It is (1994). P. Keating, Ed.; Vol. 4, (1995). B. Connell and A. therefore desirable to control for this factor and to use an Arvaniti, Eds.; Vol. 5, Forthcoming, Broe and J. Pierrehumbert, established transcriptional standard to report the prosody of Eds.; Vol 6, Forthcoming, Ogden and Local, Eds. the stimuli actually used. Orthogonal variation of the word Pierrehumbert, J., (1997). 
Consequences of intonation for the string and the prosodic pattern may be used to factor out the voice Source. In S. Kiritani, H. Hirose, and H. Fujisaki, Eds., prosodic and nonprosodic factors in the domain under inves- Speech Production and Language, Speech Research 13. Berlin: Mouton, pp. 111–131. tigation. Pierrehumbert, J., and M. E. Beckman. (1988). Japanese Tone Experimental work on intonational meaning is challeng- Structure. Cambridge, MA: MIT Press. ing because the meanings by their very nature are highly Pierrehumbert, J., and J. Hirschberg. (1990). The meaning of into- variable with context. Judgments of intonational meaning nation contours in the interpretation of discourse. In P. Cohen, obtained for materials out of context are variable and diffi- J. Morgan, and M. Pollack, Eds., Plans and Intentions in Com- cult to interpret because they are affected by the subjects’ munication. Cambridge, MA: MIT Press, pp. 271–312. uncontrolled imaginings of what the context might be. How- Pitrelli, J. F., M. E. Beckman, and J. Hirschberg. (1994). Evalua- ever, very good results have been achieved with experimen- tion of prosodic transcription labeling reliability in the ToBI tal studies in which subjects evaluate the felicity of framework. Proceedings of the International Conference on particular patterns for specified discourse contexts or the Spoken Language Processing, Yokohama, Japan. 682 Prosody and Intonation, Processing Issues Whether prosodic and intonational information play a Selkirk E. O. (1984). Phonology and Syntax. Cambridge, MA: MIT Press. role in the processing of word forms per se—for instance, in Terken, J. M. B., and S. G. Nooteboom. (1987). Opposite effects of the activation of lexical entries—is as yet unresolved. The accentuation and deaccentuation on verification latencies for structure of spoken words again differs across languages in given and new information. Language and Cognitive Processes ways that affect this issue. 
Explicit suprasegmental distinc- 2: 145–163. tions between words—for example, TONE in languages such Titze, I. (1994). Principles of Voice Production. Englewood Cliffs, as Thai, pitch accent in languages such as Japanese—con- NJ: Prentice-Hall. strain word activation and thus show that suprasegmental Ward, G., and J. Hirschberg. (1985). Implicating uncertainty: The information can play a role at this level. Suprasegmental pragmatics of fall-rise. Language 61: 747–776. cues to LINGUISTIC STRESS in English nevertheless appear to play no part in word activation (Cutler 1986): two words that Further Readings differ solely in suprasegmental structure, such as foregoing Bird, S. (1995). Computational Phonology: A Constraint-Based (primary stress on the first syllable) and forgoing (primary Approach. Cambridge: Cambridge University Press. stress on the second syllable), are both activated when listen- Grice, M., and R. Benzmueller. (1997). Transcribing German ers hear either one, just as is the case with two words pro- intonation with GToBI. http://www.coli.uni-sb.de/phonetik/ nounced in every respect identically (such as sale and sail). projects/Tobi/gtobi.html. However, stress in English is, except in rare pairs such as Horne, M., Ed. (1998). Prosody: Theory and Experiment. Studies foregoing/forgoing, expressed segmentally (in vowel qual- Presented to Gosta Bruce. Dordrecht: Kluwer. ity) as well as suprasegmentally, so that segmental informa- Ladd, D. R. (1980). The Structure of Intonational Meaning. tion alone may in practice suffice for lexical activation in Bloomington, IN: University of Indiana Press. Levelt, W. J. M. (1989). Speaking: From Intention to Articulation. this language. This is not necessarily the case in other stress Cambridge, MA: MIT Press. languages (Cutler, Dahan, and van Donselaar 1997). Pierrehumbert, J., and S. Steele. (1990). Categories of tonal align- In syntactic processing, the relevant questions have been: ment in English. 
Phonetica 46: 181–196. do prosody and intonation serve to divide the input into Venditti, J. (1995). Japanese ToBI Labelling Guidelines. http:// major syntactically motivated chunks? Does such informa- ling.ohio-state.edu/Phonetics/J_ToBI/jtobi_homepage.html. tion help to resolve ambiguity, such that sentences which Also in K. Ainsworth-Darnell and M. D’Imperio, Eds., Ohio allow more than one interpretation when they are written— State Working Papers in Linguistics. 50: 127–162. for example, I read about the repayment with interest—are effectively unambiguous when spoken? And does prosodic Prosody and Intonation, Processing Issues and intonational information determine selection between alternative syntactic analyses that present themselves, albeit PROSODY AND INTONATION determine much of the form of temporarily, during the processing even of an unambiguous spoken language. An account of the processing of language— sentence—for example, between possible continuations of the production and comprehension of words and sentences— The horse raced by the—(—gate;—Queen won)? The evi- must therefore pay attention to prosodic (rhythmic, grouping) dence is mixed (see special issues of Language and Cogni- and intonational (melodic) structure. The fact that more tive Processes and Journal of Psycholinguistic Research in research in PSYCHOLINGUISTICS has involved written language 1996 for overviews) but in general offers little support for than spoken language, however, means that the role of pros- direct availability of syntactic information in prosodic and ody and intonation in processing is not yet fully described. intonational structure. Prosodic hierarchies, after all, encode In studies of language comprehension, prosody and into- specifically prosodic, not syntactic relationships (Shattuck- nation have figured in research on SPOKEN WORD RECOGNI- Hufnagel and Turk 1996; Beckman 1996). 
Prosody may sig- TION, on SENTENCE PROCESSING (the computation of nal syntactic cohesion (Tyler and Warren 1987), and the syntactic structure) and on DISCOURSE processing. One role presence of a sentence accent or of prosodic correlates of a of prosody and intonation in word recognition is to aid in syntactic boundary can have an effect on syntactic analysis the operation of segmenting a continuous input into its com- in that it can, for example, lead the listener to prefer analy- ponent words. Studies in many languages (see, for example, ses that are consistent with the prosody (Nespor and Vogel the summaries in Otake and Cutler 1996) have shown that 1983). But no evidence suggests that syntactic analysis is listeners can use the rhythmic structure of utterances to directly derived from prosodic or intonational cues. determine where word boundaries are most likely to fall. In the comprehension of discourse structure, prosodic Because rhythmic structure differs across languages, this salience appears most important; speakers highlight via means that the processes involved in segmenting utterances accent the words that are semantically more central to a into words can also be language-specific—stress-based in message (Bolinger 1978; Ladd 1996), and listeners actively English (Cutler and Norris 1988), syllable-based in French search for accented words because of their central semantic (Mehler et al. 1981), and mora-based in Japanese (Otake et role (Cutler 1982; Sedivy et al. 1995). Furthermore, pro- al. 1993). This language specificity can result in inappropri- cessing is facilitated by the placement of accent on new ate application of native-language segmentation procedures information, and the deaccenting of old information (Bock to foreign languages with a different rhythmic structure and Mazzella 1983). Experimental evidence suggests that (Otake and Cutler 1996). 
Young infants can discriminate the processing of deaccented words involves integration between rhythmically dissimilar but not rhythmically simi- with an existing discourse model (Fowler and Housum lar languages (Nazzi, Bertoncini, and Mehler 1998). 1987), but it is unclear whether the facilitation observed Psychoanalysis, Contemporary Views 683 with deaccentuation reflects direct exploitation of accent Fowler, C. A., and J. Housum. (1987). Talkers’ signaling of “new” information in discourse-structure decisions or arises indi- and “old” words in speech and listeners’ perception and use of rectly via reference to an existing discourse model in the the distinction. Journal of Memory and Language 26: 489–504. Grabe, E., C. Gussenhoven, J. Haan, E. Marsi, and B. Post. (1998). course of decoding the poorer acoustic information avail- Preaccentual pitch and speaker attitude in Dutch. Language and able from deaccented speech. Finally, listeners can interpret Speech 41, 63–85. prosodic information to derive cues to topic and turn-taking Hirschberg, J., and J. Pierrehumbert. (1986). The intonational structure in discourse (Hirschberg and Pierrehumbert 1986). structuring of discourse. Proceedings of Twentyfourth Associa- Intonational structure is also important for the interpretation tion Computational Linguistics 134–144. of discourse (Pierrehumbert and Hirschberg 1990); the deri- Kuijpers, C., and W. van Donselaar. (1998). The influence of vation of meaning from intonation involves simultaneous rhythmic context on schwa epenthesis and schwa deletion. Lan- consideration of contours and of the sentential and dis- guage and Speech 41: 87–108. course context in which they appear (Grabe et al. 1997). Ladd, D. R. (1996). Intonational Phonology. Cambridge: Cam- The computation of both prosodic and intonational form bridge University Press. Levelt, W. J. M. (1989). Speaking: From Intention to Articulation. 
must of course likewise play a role in speakers’ utterance Cambridge, MA: MIT Press. production (Levelt 1989), with prosodic generation refer- Mehler, J., J.-Y. Dommergues, U. Frauenfelder, and J. Segui. ring both to the lexical items and the syntactic structure (1981). The syllable’s role in speech segmentation. Journal of selected (Ferreira 1993; Meyer 1994), and intonational gen- Verbal Learning and Verbal Behavior 20: 298–305. eration referring to both the sentence to be spoken and its Meyer, A. S. (1994). Timing in sentence production. Journal of context (Ladd 1996). Production of phonologically alterna- Memory and Language 33: 471–492. tive forms of words (e.g., via deletion or addition of sounds, Nazzi, T., J. Bertoncini, and J. Mehler. (1998). Language discrimi- as when the middle vowel of family is deleted, or a vowel is nation by newborns: Towards an understanding of the role of inserted between the last two sounds of film) can be rhythm. Journal of Experimental Psychology: Human Percep- prompted by the prosodic pattern in which the word is tion and Performance 24: 756–766. Nespor, M., and I. Vogel. (1983). Prosodic structure above the uttered (Kuijpers and van Donselaar 1998). word. In A. Cutler and D. R. Ladd, Eds., Prosody: Models and Because, as noted above, there has been less psycholin- Measurements. Heidelberg: Springer, pp. 123–140. guistic research on issues specific to spoken language than Otake, T., and A. Cutler, Eds. (1996). Phonological Structure and on the processing of written language, and because it is in Language Processing: Cross-Linguistic Studies. Berlin: Mouton. addition true that LANGUAGE PRODUCTION has so far Otake, T., G. Hatano, A. Cutler, and J. Mehler. (1993). Mora or attracted far less experimental research than language com- syllable? Speech segmentation in Japanese. Journal of Memory prehension, it will be obvious that the production of prosody and Language 32: 358–378. 
and intonation is a research area very much in need of wider Pierrehumbert, J., and J. Hirschberg. (1990). The meaning of into- empirical attention. national contours in the interpretation of discourse. In P. R. Cohen, J. Morgan, and M. E. Pollack, Eds., Intentions in Com- See also COMPUTATIONAL PSYCHOLINGUISTICS; INNATE- munication. Cambridge, MA: MIT Press, pp. 271–323. NESS OF LANGUAGE; PHONOLOGY; SPEECH PERCEPTION; Sedivy, J., M. Tanenhaus, M. Spivey-Knowlton, K. Eberhard, and SYNTAX G. Carlson. (1995). Using intonationally marked presupposi- —Anne Cutler tional information in on-line language processing: Evidence from eye movements to a visual model. Proceedings of the Sev- References enteenth Annual Conference of the Cognitive Science Society. Hillsdale, NJ: Erlbaum, pp. 375–380. Beckman, M. E. (1996). The parsing of prosody. Language and Shattuck-Hufnagel, S., and A. E. Turk. (1996). A prosody tutorial Cognitive Processes 11: 17–67. for investigators of auditory sentence processing. Journal of Bock, J. K., and J. R. Mazzella. (1983). Intonational marking of Psycholinguistic Research 25: 193–247. given and new information: Some consequences for compre- Tyler, L. K., and P. Warren. (1987). Local and global structure in hension. Memory and Cognition 11: 64–76. spoken language comprehension. Journal of Memory and Lan- Bolinger, D. L. (1978). Intonation across languages. In J. Green- guage 26: 638–657. berg, Ed., Universals of Human Language, vol. 2, Phonology. Palo Alto, CA: Stanford University Press, pp. 471–524. Further Readings Cutler, A. (1982). Prosody and sentence perception in English. In Friederici, A., Ed. (1998). Language Comprehension: A Biological J. Mehler, E. C. T. Walker, and M. F. Garrett, Eds., Perspectives Perspective. Heidelberg: Springer. on Mental Representation: Experimental and Theoretical Stud- Journal of Psycholinguistic Research. (1996). Special Issue on ies of Cognitive Processes and Capacities. 
Hillsdale, NJ: Prosodic Effects on Parsing 25(2). Erlbaum, pp. 201–216. Language and Cognitive Processes. (1996). Special Issue on Pros- Cutler, A. (1986). Forbear is a homophone: Lexical prosody does ody and Parsing 11(½). not constrain lexical access. Language and Speech 29: 201–220. Cutler, A., D. Dahan, and W. van Donselaar. (1997). Prosody in the comprehension of spoken language: A literature review. Lan- Psychoanalysis, Contemporary Views guage and Speech 40: 141–201. Cutler, A., and D. G. Norris. (1988). The role of strong syllables in segmentation for lexical access. Journal of Experimental Psy- Though a number of key issues have been clarified, there is chology: Human Perception and Performance 14: 113–121. no more agreement now than there was half a century ago Ferreira, F. (1993). Creation of prosody during sentence produc- concerning the status and objectivity of psychoanalysis. tion. Psychological Review 100: 233–253. 684 Psychoanalysis, Contemporary Views of them positive (see MacDonald 1954: pt. VI). In response Controversy in the understanding and evaluation of psycho- to the perceived problem of mechanism in Freudian meta- analysis has its origin in the multifaceted character of Sig- psychology, Schafer (1976) undertook to translate its terms mund FREUD’s theorizing—the plurality of other disciplines into those of “action language,” an approach found attractive with which Freud allied it, and the mix of methodologies by many psychoanalytic theorists. that he employed—but it is also a function of several other Contemporary developments in this vein bear witness to variables, including the diversity of schools (Kleinian, Jun- the subsequent explosion of work in the philosophy of gian, ego-psychological, etc.) 
within the psychoanalytic mind, particularly to the influence of Donald Davidson’s movement itself, the uncertainty as to whether psychoanaly- compatibilism of reasons and causes, and his ANOMALOUS sis is fundamentally a theory or a practice, and the variety of philosophical outlooks that have had an interest in either MONISM. The significance of Davidson’s work is to allow a assimilating or repudiating psychoanalytic ideas. reading of psychoanalysis that is consistent with physical- It is helpful to distinguish in the diversity of schools two ism and does justice simultaneously to psychoanalysis’ modes of approach to psychoanalysis: those that locate dis- commitment to search for meaningful connections among cussion of psychoanalysis firmly in the context of scientific mental phenomena and its claim to provide causal explana- methodology, and those that give priority to issues in the tion, while freeing it from the obligation to come up with philosophy of mind. There is a tendency for this distinction strict causal laws. Against this background it becomes pos- to be correlated with contrasting estimates—respectively sible to argue that psychoanalysis is an extension of com- negative and positive—of the objectivity of psychoanalysis. monsense FOLK PSYCHOLOGY, arrived at by modifying the In the first group, the two landmark writings are Karl familiar “belief-desire” schema of practical reason explana- Popper 1963 (ch. 1) and Adolf Grünbaum 1984. Popper’s tion, for which it substitutes the concepts of wish and fan- enormously influential attack on psychoanalysis, in the con- tasy. (Melanie Klein’s development of Freud’s theories text of a general rejection of inductivism in the philosophy assumes, in this approach, special importance.) 
In such a of science, consists of the claim that psychoanalysis fails to view, psychoanalysis does not, contra Grünbaum, repose open itself to refutation and so does not satisfy the condition logically on therapeutic claims, and the specific inductive of falsifiability that (in his account) supplies the only alter- canons of the natural sciences are inappropriate to its evalu- native to inductive support. On account of its alleged immu- ation; the grounds for psychoanalysis lie instead in its offer- nity to counter-evidence, psychoanalysis is classified as a ing a unified explanation for phenomena (DREAMING, “pseudoscience.” Popper’s criticisms (which in part stand psychopathology, mental conflict, sexuality, and so on) that independently of his own philosophy of science) had the commonsense psychology is unable, or poorly equipped, to effect of making untenable the naive view of psychoanalysis explain. (Defending this broadly circumscribed approach, as a set of hypotheses unproblematically grounded in expe- see above all Wollheim 1984, 1991, and 1993, Hopkins rience, and of provoking attempts—the results of which 1988, 1991, and 1992; and also Davidson 1982; Lear 1990; have been markedly inconclusive—to test psychoanalytic Cavell 1993; and Gardner 1993.) This approach can be vin- hypotheses experimentally in controlled, extra-clinical con- dicated only if there is a determinate interpretative path texts (see Eysenck and Wilson 1973). from the attributions of commonsense psychology to those In direct opposition to Popper, Grünbaum maintains that of psychoanalysis, a matter that can be decided only by psychoanalysis can be evaluated scientifically, and has elab- examining clinical material. 
The central philosophical diffi- orated a highly detailed critique of psychoanalysis centered culty facing this approach is to show that psychoanalysis on Freud’s avowed aspiration to provide a theory of the can extend commonsense psychology at the same time as mind that is successful by the canons of natural science, revising it, that is, that the modifications psychoanalysis these being inductive, in Grünbaum’s view. Grünbaum makes to commonsense psychology are not so radical as to argues that Freudian theory reposes on claims that only psy- effectively cut it loose from the latter. Thus two important choanalysis can give correct insight into the cause of neuro- questions for this approach, which continue to generate con- sis, and that such insight is causally necessary for a durable troversy, concern the intelligibility of postulating mental cure. Grünbaum then proceeds to underline the empirical states that are unconscious (see Searle 1992: ch. 7) and weakness of psychoanalysis’ claim to causal efficacy, and mental content that is prelinguistic, unconceptualized, or presses the familiar objection that the therapeutic effects of nonpropositional (see Cavell 1993). psychoanalysis may be due to suggestion. Furthermore, The ascent of cognitive science has encouraged the for- Grünbaum argues that even if the clinical data is taken at mulation of a further set of positions on psychoanalysis, face value, the inferences that Freud draws are unwarranted. which stand midway between the two groups just described. (For further discussion of psychoanalysis’ scientificity, see Freud’s very early “Project for a scientific psychology” Hook 1964 and Sachs 1991.) 
(1950/1895), an attempt at a general neurological theory of Discussion of psychoanalysis was initiated by philoso- mental functioning, allows itself to be recast in more con- phers sensitized by Wittgenstein’s work in the philosophy of temporary, computational terms (see Gill and Pribram psychology to a set of issues independent from scientific 1976), and subpersonal reconstructions of Freud’s properly methodology. They addressed the more basic conceptual psychoanalytic theories, implying their fundamental conti- question of whether psychoanalytic explanations are causal nuity with the “Project,” have been offered (see Erdelyi or rationalizing in form—the common assumption being that 1985). Kitcher (1992) has made a detailed case for the these are exclusive modes of explanation—and on this basis stronger thesis that Freud should be interpreted as seeking formulated different views of psychoanalysis’ cogency, some to establish an interdisciplinary science of the mind of the Psychoanalysis, History of 685 sort that cognitive science now aims at. Cummins (1983: C. Wright, Eds., Mind, Psychoanalysis and Science. Oxford: Blackwell, pp. 33–60. ch. 4) offers an understanding of psychoanalysis as striving Hopkins, J. (1991). The interpretation of dreams. In J. Neu, Ed., to coordinate an interpretive level of description of the mind The Cambridge Companion to Freud. Cambridge: Cambridge with an underlying functional story. Assuming psychoanal- University Press, pp. 86–135. ysis and cognitive science to be both empirically well Hopkins, J. (1992). Psychoanalysis, interpretation, and science. grounded, some degree of fit between their functional delin- In J. Hopkins and A. Savile, Eds., Psychoanalysis, Mind and eations of the mind is almost certain. Whether any substan- Art: Perspectives on Richard Wollheim. Oxford: Blackwell, tial theoretical integration of psychoanalysis with cognitive pp. 3–34. science can reasonably be expected is moot, however, and Kitcher, P. (1992). 
Freud’s Dream: A Complete Interdisciplinary Science of Mind. Cambridge, MA: MIT Press.

arguably stands or falls with the success of cognitive science in analyzing higher-level, propositional attitude-involving cognitive capacities.

The positions indicated above are far from exhaustive, and a comprehensive survey would include also Continental developments. One particularly influential early contribution is Jürgen Habermas’s (1971/1968) hermeneutic reading, which seeks to separate psychoanalysis wholly from the natural sciences—this association being attributed to a naturalistic and scientific misconception of psychoanalysis on Freud’s part—and integrate it with communication theory. Later Continental writers, having Lacanian psychoanalysis as a model, have tended to develop theories of psychoanalysis strongly oriented toward purely philosophical themes of representation and subjectivity.

Collections discussing psychoanalysis from various philosophical angles include Wollheim and Hopkins 1982, Clark and Wright 1988, and Neu 1991.

See also PSYCHOANALYSIS, HISTORY OF

—Sebastian Gardner

References

Cavell, M. (1993). The Psychoanalytic Mind: From Freud to Philosophy. Cambridge, MA: Harvard University Press.
Clark, P., and C. Wright, Eds. (1988). Mind, Psychoanalysis and Science. Oxford: Blackwell.
Cummins, R. (1983). The Nature of Psychological Explanation. Cambridge, MA: MIT Press.
Davidson, D. (1982). Paradoxes of irrationality. In R. Wollheim and J. Hopkins, Eds., Philosophical Essays on Freud. Cambridge: Cambridge University Press, pp. 289–305.
Erdelyi, M. (1985). Psychoanalysis: Freud’s Cognitive Psychology. New York: Freeman.
Eysenck, H., and G. Wilson, Eds. (1973). The Experimental Study of Freudian Theories. London: Methuen.
Freud, S. (1950/1895). Project for a scientific psychology. In The Standard Edition of the Complete Psychological Works of Sigmund Freud. 24 vols. Translated by J. Strachey, Ed., in collaboration with A. Freud, assisted by A. Strachey and A. Tyson. London: Hogarth (1953–74), vol. 1, pp. 281–397.
Gardner, S. (1993). Irrationality and the Philosophy of Psychoanalysis. Cambridge: Cambridge University Press.
Gill, M., and K. Pribram. (1976). Freud’s “Project” Re-Assessed. London: Hutchinson.
Grünbaum, A. (1984). The Foundations of Psychoanalysis. Berkeley: University of California Press.
Habermas, J. (1971/1968). Knowledge and Human Interests. Translated by J. Shapiro. Boston: Beacon.
Hook, S., Ed. (1964). Psychoanalysis, Scientific Method and Philosophy. New York: New York University Press.
Hopkins, J. (1988). Epistemology and depth psychology: Critical notes on “The Foundations of Psychoanalysis.” In P. Clark and
Lear, J. (1990). Love and its Place in Nature: A Philosophical Reconstruction of Psychoanalysis. London: Faber.
MacDonald, M., Ed. (1954). Philosophy and Analysis. Oxford: Blackwell.
Neu, J., Ed. (1991). The Cambridge Companion to Freud. Cambridge: Cambridge University Press.
Popper, K. (1963). Conjectures and Refutations: The Growth of Scientific Knowledge. London: Routledge and Kegan Paul.
Sachs, D. (1991). In fairness to Freud: A critical notice of “The Foundations of Psychoanalysis,” by Adolf Grünbaum. In J. Neu, Ed., The Cambridge Companion to Freud. Cambridge: Cambridge University Press, pp. 309–338.
Schafer, R. (1976). A New Language for Psychoanalysis. New Haven: Yale University Press.
Searle, J. (1992). The Rediscovery of the Mind. Cambridge, MA: MIT Press.
Wollheim, R. (1984). The Thread of Life. Cambridge: Cambridge University Press.
Wollheim, R. (1991). Freud. 2nd ed. London: Fontana Collins.
Wollheim, R. (1993). Desire, belief and Professor Grünbaum’s Freud. In The Mind and Its Depths. Cambridge, MA: Harvard University Press, pp. 91–111.
Wollheim, R., and J. Hopkins, Eds. (1982). Philosophical Essays on Freud. Cambridge: Cambridge University Press.

Psychoanalysis, History of

One of Sigmund Freud’s basic psychoanalytic claims was that dreams and symptoms were wish fulfillments (Freud 1900; for italicized terms see Laplanche and Pontalis 1973). A particularly simple example is that in which a thirsty person dreams of drinking, and thereby temporarily pacifies the underlying desire. Schematically, in the case of real satisfaction, a desire that P (that I get a drink) brings about a real situation that P (I get a drink), and this in turn brings about an experience or belief that P which pacifies the desire, that is, causes it to cease to operate. In Freudian wish fulfillment, by contrast, a desire operates to bring about an experience- or belief-like representation of satisfaction (I dream of drinking) and so pacifies the desire in the absence of the real thing. Freud hypothesized that this process was effected by the activation of neural prototypes of past desire-satisfaction sequences, and he took this to be the mind/brain’s earliest and most basic way of coping with desire (1895).

FREUD found this pattern of representational pacification in more complex instances, and was thus able to see dreams, symptoms, and many other depictive phenomena as representing the satisfaction of unfulfilled wishes or desires, which could be traced back to childhood and bodily experience. Analysis indicated that little children attached great and formative emotional significance to very early interactions with their parents in such basic proto-social activities as feeding and the expulsion and management of waste. These involved the first use of the mouth, genitals, and anus, and the early stimulus of these organs apparently roused feelings continuous with their later uses in normal and abnormal sexuality (1905). Little children’s motives thus included desires to harm or displace each parent, envied and hated as a rival for the love of the other, as well as to preserve and protect that same parent, loved both sensually and as a caretaker, helper, and model. Because these desires were subject to particularly radical conflict they were characteristically repressed, and thus rendered unconscious, and kept from everyday planning and thought.

Repression entailed that such desires could enter consciousness only in a symbolic form, and so could be expressed in intentional action only via symbol-forming processes such as sublimation (1908). Symbolizing incestuous desires in terms of ploughing and planting mother earth, for example, could render such activities meaningful as expressions of wish-fulfilling phantasy. Thus, according to Freud, the activities of everyday life acquired the kind of unconscious representational significance he had found in dreams and symptoms, and were accordingly subject to unconscious reinforcement, inhibition, etc. Infantile desires (or the contents of infantile neural prototypes) were not lost, but were continually rearticulated through symbolism so as to direct action toward their pacification throughout life. This was thus the primary process through which desire was regulated.

Particular phantasies also realize many of the mechanisms described in psychoanalytic theory. Thus phantasies of projection assign (usually undesirable) traits from the self to another, whereas those of introjection assign (usually desired) traits of the other to the self. The “good in/bad out” operation of these mechanisms, and the processes of identification that they effect, are significant for both individual development and social organization.

The young child achieves self-control partly by forming phantasy images derived from the parents as regulators of its early bodily activities. Because these “earliest parental imagoes” (1940) embody the child’s infantile aggression in a projected form, they are introjected as a super-ego far more threatening and punitive than the actual parents. Later the child identifies with its parents in their role as agents, that is, as satisfiers and pacifiers of their own desires, and these identifications form the ego. The members of many groups identify with one another by introjecting a common idealized leader or cause (1921: 67ff), or by projecting their destructive motives into a common locus that thereby becomes a legitimated focus of collective hate. Those who find such a common good or bad object feel united, purified, and able to validate destructive motives by common ideals. The processes that establish the individual conscience thus also create a pattern of “good us/bad them” that enlists its ferocity in the service of group aggression.

After the Nazi occupation many analysts left Europe, and post-Freudian psychoanalysis evolved in distinct ways in different countries, often in response to analysts who settled there. In England, Anna Freud and Melanie Klein developed techniques for analyzing the play of children. This made it possible for child analysts to confirm and revise Freud’s descriptions of childhood, and to propose further hypotheses about infancy.

Klein (1975) noted that the uninhibited play of children in analysis showed that their conflicts were rooted in unconscious images that represented versions of their parents as both unrealistically good and extremely bad and malevolent. She explained these as resulting from a process of projection that represented the other as containing disowned aspects of the self, which in turn was fragmented and depleted by the projective loss. Klein called this projective identification, and hypothesized that it operated most forcefully before the child gained a working grasp of the concept of identity, and hence before it recognized that the parental figures it felt as bad were the same as those it felt as good. Klein called this preobjectual phase of development the paranoid-schizoid position, the term “paranoid” marking the extremity of the baby’s potential for anxiety, and “schizoid” the fragmentary way it represents both itself and its objects. As the infant starts unifying its images, this phase yields to the depressive position, so named because unification entails liability to depression about harming or losing the object (principally the feeding, caring mother), now seen as complex (frustrating as well as gratifying) and liable to be misconceived, but also as unique and irreplaceable.

Klein’s discussions of the child’s relation to phantasied objects, characteristically different from those in the actual environment, inaugurated the object-relations approach to psychoanalysis, continued by Ronald Fairbairn (1954), Donald Winnicott (1958), and a number of others of the so-called British School. Accounts in these terms have now been elaborated by all schools (see Kernberg 1995). Klein’s ideas were applied to groups by Wilfred Bion (1961), and by Bion (1989) and Hanna Segal (1990) to the infantile origins of symbolism and thought. They also influenced John Bowlby (1980), whose work fostered extensive study of child-parent attachment (Ainsworth 1985; Karen 1994).

Freud introduced the ego and super-ego both as functional systems mediating between the individual’s innate drives and the external world, and as modelled on persons. In this he attempted to combine functional explanation with the empirical claim that the way persons function depends upon their internal representations of themselves and others. This mode of explanation, called ego-psychology, was elaborated by Anna Freud (1936), and by Heinz Hartmann (1958) and his colleagues in the United States. Hartmann focused particularly upon the attainment of autonomy in object-relations, which he took to be dependent upon object constancy, the ability to represent self and other consistently, despite absence and changes in emotion. This was carried into empirical research on children by René Spitz (1965), and developed further by Edith Jacobson (1964) and Margaret Mahler (1968) and her associates, who sought to describe the process of individuation that issued in object constancy.

More recently, Heinz Kohut (1977) has argued that the pathology of the “fragmented” or “depleted” SELF requires a new “self-psychology” for its conceptualization. He introduces the notion of “self-object,” that is, another who is experienced as performing essential psychological functions for the self, and so felt part of it. When the parents fail in essential self-object functions the child—or the analytic patient in whom such needs have been re-activated—responds with narcissistic rage, and may become convinced that the environment is fundamentally hostile. Kohut compares this situation to Klein’s paranoid-schizoid position, and the fragmentation and depletion with which he is concerned is evidently linked to that which Klein describes as consequent on infantile projective identification (for a recent synthesis see Kumin 1996).

Psychoanalysis in France has been particularly influenced by Jacques Lacan, whose resonant formulations (1977) link analytic ideas with themes in French philosophy, linguistics, and anthropology. A baby who joyfully identifies itself in a mirror, according to Lacan, thereby represents itself as having a wholeness, permanence, and unity that anticipates and facilitates its ability to move and relate to others. But this identification is also an alienation, for the infant now regards itself as something it does not actually feel itself to be, and which it may yet fail to become. Identifications with others are simultaneously enabling and alienating in the same way, so that the external images by which the self is constituted always threaten to confront it as reminders of its own lack of being.

Lacan assigns these images to an order of representations that he describes as the imaginary, and contrasts with the symbolic order of personal and social sign-systems whose elements are constrained by rules of combination and substitution comparable to those of natural language. The combinations/substitutions of (representations of) objects in dreams and symptoms, or again in the course of development, can be seen as constrained by such rules, and so as instances of metaphor, metonymy, and other linguistic forms. So, Lacan argues, the unconscious is structured like a language. Comparable structuring holds for social phenomena. Thus the resolution of the Oedipus Complex is a development in which the boy forgoes an imaginary relation with the mother to occupy a place in the social order that is symbolic, social, and constitutive of human culture. As in the prior instance of the mirror, the child secures a potentially fulfilling identity via the enabling but alienating assumption of an image, this time of the symbolic father, who embodies the social laws regulating sexual desire and providing for its procreative satisfaction.

See also DREAMING; EMOTIONS; PSYCHOANALYSIS, CONTEMPORARY VIEWS; SELF-KNOWLEDGE

—James Hopkins

References

Ainsworth, M. D. S. (1985). 1: Patterns of infant-mother attachment: Antecedents and effects on development. 2. Attachment across the life span. Bull. N.Y. Acad. Med. 6: 771–812.
Bion, W. R. (1961). Experiences in Groups. New York: Basic Books.
Bion, W. R. (1989). Second Thoughts: Selected Papers on Psychoanalysis. London: Heinemann.
Bowlby, J. (1980). Attachment and Loss. Vols. 1–3. New York: Basic Books.
Fairbairn, W. R. D. (1954). An Object-Relations Theory of the Personality. New York: Basic Books.
Freud, A. (1936). The Ego and the Mechanisms of Defense. London: Hogarth Press.
Freud, S. (1895). Project for a Scientific Psychology. In Freud (1974) vol. 1.
Freud, S. (1900). The Interpretation of Dreams. In Freud (1974) vols. 4, 5.
Freud, S. (1905). Three Essays on the Theory of Sexuality. In Freud (1974) vol. 6.
Freud, S. (1908). Civilized Sexual Morality and Modern Nervous Illness. In Freud (1974) vol. 9.
Freud, S. (1921). Group Psychology and the Analysis of the Ego. In Freud (1974) vol. 18.
Freud, S. (1940). A Short Outline of Psycho-Analysis. In Freud (1974) vol. 22.
Freud, S. (1974). The Standard Edition of the Collected Psychological Works of Sigmund Freud. Translated by J. Strachey, Ed. London: Hogarth Press.
Hartmann, H. (1958). Ego Psychology and the Problem of Adaptation. New York: International Universities Press.
Jacobson, E. (1964). The Self and the Object World. New York: International Universities Press.
Karen, R. (1994). Becoming Attached: Unfolding the Mystery of the Mother-Infant Bond and its Impact on Later Life. New York: Warner Books.
Kernberg, O. (1995). Psychoanalytic object relations theories. In B. Moore and B. Fine, Eds., Psychoanalysis: The Major Concepts. New Haven: Yale University Press.
Klein, M. (1975). The Writings of Melanie Klein. London: Karnac Books and the Institute of Psychoanalysis.
Kohut, H. (1977). The Restoration of the Self. New York: International Universities Press.
Kumin, I. (1996). Pre-Object Relatedness: Early Attachment and the Psychoanalytic Situation. New York: Guilford Press.
Lacan, J. (1977). Écrits. New York: Norton.
Laplanche, J., and J. B. Pontalis. (1973). The Language of Psycho-Analysis. London: Hogarth Press.
Mahler, M. S. (1968). On Human Symbiosis and the Vicissitudes of Individuation. New York: International Universities Press.
Moore, B., and B. Fine. (1995). Psychoanalysis: The Major Concepts. New Haven: Yale University Press.
Segal, H. (1990). Dream, Phantasy, and Art. London: Tavistock/Routledge.
Spitz, R. (1965). The First Year of Life: A Psychoanalytic Study of Normal and Deviant Development of Object Relations. New York: International Universities Press.
Winnicott, D. (1958). Collected Papers: Through Pediatrics to Psychoanalysis. New York: Basic Books.

Further Readings

Cavell, M. (1993). The Psychoanalytic Mind. Cambridge, MA: Harvard University Press.
Clark, P., and C. Wright, Eds. (1988). Mind, Psychoanalysis, and Science. Oxford: Blackwell.
Erwin, E. (1996). A Final Accounting: Philosophical and Empirical Issues in Freudian Psychology. Cambridge, MA: MIT Press.
Fink, B. (1997). A Clinical Introduction to Lacanian Psychoanalysis. Cambridge, MA: Harvard University Press.
Gardner, S. (1993). Irrationality and the Philosophy of Psychoanalysis. Cambridge: Cambridge University Press.
Gay, P. (1988). Freud, A Life for Our Time. London: J. M. Dent.
Gill, M., and K. Pribram. (1976). Freud’s Project Re-assessed. London: Hutchinson.
Glymour, C. (1992). Freud’s androids. In J. Neu, Ed., The Cambridge Companion to Freud. Cambridge: Cambridge University Press.
Grünbaum, A. (1984). The Foundations of Psychoanalysis: A Philosophical Critique. Berkeley: University of California Press.
Grünbaum, A. (1993). Validation in the Clinical Theory of Psychoanalysis. Madison, CT: International Universities Press.
Hinshelwood, R. D. (1990). A Dictionary of Kleinian Thought. London: Free Association Books.
Hopkins, J. (1997). Psychoanalysis, post-Freudian. In E. Craig, Ed., The Routledge Encyclopaedia of Philosophy. London: Routledge.
Hopkins, J. (1998). Freud and the science of mind. In S. Glendinning, Ed., The Edinburgh Encyclopaedia of Continental Philosophy. Edinburgh: Edinburgh University Press.
Kitcher, P. (1992). Freud’s Dream: A Complete Interdisciplinary Science of Mind. Cambridge, MA: MIT Press.
Kline, P. (1984). Psychology and Freudian Theory: An Introduction. London: Methuen.
Lear, J. (1990). Love and Its Place in Nature. New York: Farrar, Straus, and Giroux.
MacDonald, C., and D. MacDonald, Eds. (1995). Philosophy of Psychology: Debates on Psychological Explanation. Oxford: Blackwell.
Masson, J., Ed. (1985). The Complete Letters of Sigmund Freud to Wilhelm Fliess 1887–1904. Cambridge, MA: Harvard University Press.
Moore, B., and B. Fine. (1990). Psychoanalytic Terms and Concepts. New Haven: Yale University Press.
Neu, J. (1992). The Cambridge Companion to Freud. Cambridge: Cambridge University Press.
Wollheim, R. (1991). Freud. 2nd ed. London: Fontana.
Wollheim, R. (1994). The Mind and its Depths. Cambridge, MA: Harvard University Press.

Psycholinguistics

Psycholinguistics is the study of people’s actions and mental processes as they use language. At its core are speaking and listening, which have been studied in domains as different as LANGUAGE ACQUISITION and language disorders. Yet the primary domain of psycholinguistics is everyday language use.

Speaking and listening have several levels. At the bottom are the perceptible sounds and gestures of language: how speakers produce them, and how listeners hear, see, and identify them (see PHONETICS, PHONOLOGY, SIGN LANGUAGES). One level up are the words, gestural signals, and syntactic arrangement of what is uttered: how speakers formulate utterances, and how listeners identify them (see SENTENCE PROCESSING). At the next level up are communicative acts: what speakers do with their utterances, and how listeners understand what they mean (see PRAGMATICS). At the highest level is DISCOURSE, the joint activities people engage in as they use language. At each level, speakers and listeners have to coordinate their actions.

Speakers plan what they say more than one word at a time. In conversation and spontaneous narratives, they tend to plan in intonation units, generally a single major clause or phrase delivered under a unifying intonation contour (Chafe 1980). Intonation units take time to plan, so they often begin with pauses and disfluencies (uh or um, elongated words, repeated words). For example, one speaker recounting a film said: “[1.0 sec pause] A--nd u--m [2.6 sec pause] you see him taking . . . picking the pears off the leaves.”

Planning such units generally proceeds from the top level of language down—from intention to ARTICULATION (Levelt 1989). Speakers decide on a message, then choose constructions for expressing it, and finally program the phonetic segments for articulating it. They do this in overlapping stages.

Formulation starts at a functional level. Consider a woman planning “Take the steaks out of the freezer.” First she chooses the subject, verb, direct object, and source she wants to express, roughly “the addressee is to get meat from a freezer.” Then she chooses an appropriate syntactic frame, an imperative construction with a verb, object, and source location. She then finds the noun and verbs she needs, take, steak, and freeze. Finally, she fills in the necessary syntactic elements—the article the, the preposition out of, and the suffixes -s and -er. Formulation then proceeds to a positional level. She creates a phonetic plan for what she has formulated so far. She uses the plan to program her articulatory organs (tongue, lips, glottis) to produce the actual sounds, “Take the steaks out of the freezer.” Processing at these levels overlaps as she plans later phrases while articulating earlier ones.

Much of the evidence for these stages comes from slips of the tongue collected over the past century (Fromkin 1973; Garrett 1980). Suppose that the speaker of the last example had, by mistake, transposed steak and freeze as she introduced them. She would then have added -s to freeze and -er to steak and produced “Take the freezes out of the steaker.” Other slips occur at the positional level, as when the initial sounds in left hemisphere are switched to form heft lemisphere.

Listeners are often thought to work from the bottom up. They are assumed to start with the sounds they hear, infer the words and syntax of an utterance, and, finally, infer what the speakers meant. The actual picture is more complicated. In everyday conversation, listeners have a good idea of what speakers are trying to do, and working top down, they use this information to help them identify and understand what they hear (Tanenhaus and Trueswell 1995).

Spoken utterances are identified from left to right by an incremental process of elimination (Marslen-Wilson 1987). As listeners take in the sounds of “elephant,” for example, they narrow down the words it might be. They start with an initial cohort of all words beginning with “e” (roughly 1000 words), narrow that to the cohort of all words beginning with “el” (roughly 100 words), and so on. By the sound “f” the cohort contains only one word, allowing them to identify the word as “elephant.” This way listeners often identify a word before it is complete. Evidence also suggests that listeners access all of the meanings of the words in these cohorts (Swinney 1979). For example, the moment they identify “bugs” in “He found several bugs in the corner of his room” they activate the two meanings “insects” and “hidden microphones.” Remarkably, they activate the same two meanings in “He found several spiders, roaches, and other bugs in the corner of his room,” even though the context rules out microphones. But after only .2 to .4 seconds “hidden microphones” gets suppressed in favor of “insects.”

Still, listeners do use top-down information in identifying words and constructions (Tanenhaus et al. 1995). When people are placed at a table with many objects on it and are asked, “Pick up the candle,” they move their gaze to the candle before they reach for it. Indeed, they start to move their eyes toward the candle about 50 msec before the end of “candle.” But if there is candy on the table along with the candle, they do not start to move their eyes until 30 msec after the end of “candle.” As a sentence, “Put the apple on the towel in the box” may mean either (1) an apple is to go on a towel that is in a box, or (2) an apple on a towel is to go into a box. Without context, listeners strongly prefer interpretation 1. But when people are placed at a table with two apples, one on a towel and another on a napkin, their eye movements show that they infer interpretation 2 from the beginning. In identifying utterances, then, listeners are flexible in the information they exploit—auditory information, knowledge of syntax, and the context.

Speaking and listening aren’t autonomous processes. People talk in order to do things together, and to accomplish that they have to coordinate speaking with listening at many levels (Clark 1996).

One way people coordinate in conversation is with adjacency pairs. An adjacency pair consists of two turns, the first of which projects the second, as in questions and answers:

Sam: And what are you then?
Duncan: I’m on the academic council.

In his first turn Sam proposes a simple joint project, that he and Duncan exchange information about what Duncan is. In the next turn Duncan takes up his proposal, completing the joint project, by giving the information Sam wanted. People use adjacency pairs for establishing joint commitments throughout conversations. They use them for openings (as in the exchange “Hey, Barbara” “Yes?”) and closings (“Bye” “Bye”). They use them for setting up narratives (“Tell you who I met yesterday—” “Who?”), elaborate questions (“Oh there’s one thing I wanted to ask you” “Mhm”), and other extended joint projects.

Speakers use their utterances to perform illocutionary acts—assertions, questions, requests, offers, promises, apologies, and the like—acts that differ in the uptake they project. Most constructions (e.g., “Sit down”) can be used for more than one illocutionary act (e.g., a command, a request, an advisory), so speakers and listeners have to coordinate on what is intended. One way they coordinate is by treating each utterance as a contribution to a larger joint project. For example, when restaurant managers were asked on the telephone, “Do you accept American Express cards?” they inferred that the caller had an American Express card and wanted a “yes” or “no” answer. But when they were asked “Do you accept any kinds of credit cards?” they inferred the caller had more than one credit card and wanted a list of the cards they accepted (“Visa and Mastercard”). Listeners draw such inferences more quickly when the construction is conventionally used for the intended action. “Can you tell me the time?” is a conventional way to ask for the time, making it harder to construe as a question about ability (Gibbs 1994).

People work hard in conversation to establish that each utterance has been understood as intended (Clark 1996). To do that, speakers monitor their speech for problems and repair them as quickly as reasonable (Levelt 1983; Schegloff, Jefferson, and Sacks 1977). In “if she’d been—he’d been alive,” the speaker discovers that “she” is wrong, replaces it with “he,” and continues. Listeners also monitor and, on finding problems, often ask for repairs, as Barbara does here:

Alan: Now,—um do you and your husband have a j- car?
Barbara: Have a car?
Alan: Yeah.
Barbara: No.

People monitor at all levels of speaking and listening. Speakers, for example, monitor their addressees for lapses of attention, mishearings, and misunderstandings. They also monitor for positive evidence of attention, hearing, and understanding, evidence that addressees provide. Addressees, for example, systematically signal their attention with eye gaze and acknowledge hearing and understanding with “yeah” and “uh huh.”

Speaking and listening are not the same in all circumstances. They vary with the language (English, Japanese, etc.), with the medium (print, telephones, video, etc.), with age (infants, adults, etc.), with the genre (fiction, parody, etc.), with the trope (irony, metaphor, etc.), and with the joint activity (gossip, court trials, etc.). Accounting for these variations remains a major challenge for psycholinguistics.

See also FIGURATIVE LANGUAGE; LANGUAGE AND COMMUNICATION; LANGUAGE PRODUCTION; LEXICON; METAPHOR; SPOKEN WORD RECOGNITION; VISUAL WORD RECOGNITION

—Herbert H. Clark

References

Chafe, W. (1980). The deployment of consciousness in the production of a narrative. In W. Chafe, Ed., The Pear Stories. Norwood, NJ: Ablex, pp. 9–50.
Clark, H. H. (1996). Using Language. Cambridge: Cambridge University Press.
Fromkin, V. A., Ed. (1973). Speech Errors as Linguistic Evidence. The Hague: Mouton.
Garrett, M. F. (1980). Syntactic processes in sentence production. In B. Butterworth, Ed., Speech Production. New York: Academic Press, pp. 170–220.
Gibbs, R. W., Jr. (1994). The Poetics of Mind: Figurative Thought, Language, and Understanding. Cambridge: Cambridge University Press.
Levelt, W. J. M. (1983). Monitoring and self-repair in speech. Cognition 14: 41–104.
Levelt, W. J. M. (1989). Speaking. Cambridge, MA: MIT Press.
Marslen-Wilson, W. (1987). Functional parallelism in spoken word recognition. Cognition 25: 71–102.
Schegloff, E. A., G. Jefferson, and H. Sacks. (1977). The preference for self-correction in the organization of repair in conversation. Language 53: 361–382.
Swinney, D. A. (1979). Lexical access during sentence comprehension: (Re)consideration of context effects. Journal of Verbal Learning and Verbal Behavior 18: 645–660.
Tanenhaus, M. K., M. J. Spivey-Knowlton, K. M. Eberhard, and J. C. Sedivy. (1995). Integration of visual and linguistic information in spoken language comprehension. Science 268: 1632–1634.
Tanenhaus, M. K., and J. C. Trueswell. (1995). Sentence comprehension. In J. L. Miller and P. D. Eimas, Eds., Handbook of Perception and Cognition: Speech, Language, and Communication. 2nd ed. San Diego: Academic Press, pp. 217–262.

Psychological Laws

Psychology is a science. Sciences are supposed to feature laws, that is, laws of nature, generalizations that are strongly projectible past the actual data that confirm them. Yet in general, psychologists are reluctant to dub their generalizations “laws,” even when they have great confidence in those generalizations; some are even uncomfortable in talking of psychological laws at all.

Over the decades, a few generalizations or regularities have explicitly been called laws, in GESTALT PSYCHOLOGY (the laws of organization), in the theory of conditioning (the law of effect, the law of exercise, Jost’s law, the law of generalization), in neuropsychology (Bowditch’s laws, Hebb’s law), and of course in psychophysics (Weber, Fechner, Thurstone, Stevens, Bloch, Ricco, Bunsen-Roscoe, Ferry-Porter, Grassmann, Yerkes-Dodson, Schachter). For the most part—the law of effect being arguably an exception—these are empirical generalizations.

Most such laws are equations relating one measurable magnitude to one or more other, independently measurable, magnitudes. Laws of this particular type are of course found throughout the natural and social sciences. But there are psychological generalizations of other kinds that are laws or lawlike as well, even if they are not commonly called by that name. For example, some of the empirical laws are thought to be explained by higher-level theoretical principles or hypotheses, which would themselves have to be considered lawlike in order to play that explanatory role. And surely there are qualitative rather than quantitative generalizations, truths of the form “Whenever organism S is in state A and X occurs, S goes into state B,” that are lawlike.

Such qualitative laws would be derived from a standard explanatory pattern in psychology: the function-analytical explanation in the sense of Cummins (1983), as found in much of current cognitive theory, perceptual psychology, and PSYCHOLINGUISTICS. (The pattern is ubiquitous in biology, in computer science, and in electronics as well.) Some psychological questions take the form, “How does S Φ?,” where “Φ” designates some accomplishment carried out by the organism S (e.g., “How do speakers of English understand novel sentences?” or “How does an experimental subject estimate the distance in miles from her present location to the place she was born?” or “How do dogs recognize individual human smells?”). In answer to such a question, the theorist appeals to a functional or componential analysis of S. S’s performance is explained as being the joint product of several constituent operations, individually less demanding, by components or subagencies of the subject acting in concert. The components’ individual functions are specified first, and then the explanation details the ways in which they cooperate in producing the more sophisticated corporate explanandum activity.

For example, to explain language understanding, a psycholinguist posits a phonological segmenter, a parser, a SYNTAX, a LEXICON, and (notoriously) a store of real-world knowledge, and starts to tell a story of how those components interact. Of any functionary figuring in that story, we might want to ask in turn how it performs its own particular job. This is another psychological question of just the same form as the first, only this time about the functional organization of one of the subagencies. It is important to see that we can go on asking our function-analytical questions at considerable length. (What neural structures ultimately realize the lexicon?)

Thus we can see human beings and other psychological subjects as being simply functionally organized systems, corporate entities that have myriad behavioral capacities by virtue of their internal bureaucratic organizations. An organism’s complete psychological description would consist of a flow chart depicting the organism’s immediate subagencies and their routes of cooperative access to each other, followed by a set of lower-level flow charts—“detail” maps or “blowups”—for each of the main components, and so on down. At any given level, the flow charts show how the components depicted at that level cooperate to realize the capacities of the single agency whose functional analysis they corporately constitute.

Function-analytical laws of two kinds could be read off such diagrams. First there would be qualitative laws of the sort mentioned above; a given flow chart would show what a creature would do, given that it is in such-and-such a state and that (say) it received a stimulus of a certain sort. These “horizontal” laws would be of direct use in the prediction and control of behavior. Second, there would be “vertical” laws, relating lower-level states to the higher-level states that they constitute at a time. Of course, all such laws would be qualified by normalcy or “ceteris paribus” clauses because exceptions to them can be created by hardware breakdown or by perturbation of the system by some external agent.

Philosophers have raised several deeper, skeptical questions about putative laws that are more distinctive to psychology than to the natural sciences, though none of these questions is directed at the acknowledged empirical laws with which we began. The questions stem from the widely shared assumption that many psychological states are representational (see MENTAL REPRESENTATION). One might suggest that what distinguishes psychological laws from those of other natural sciences is that they concern representational states of organisms, but that would seem to rule out laws of conditioning.

Contemporary cognitive and perceptual psychology do traffic in representations, information-carrying states produced and computationally manipulated by psychological mechanisms. By its nature, a representation represents something. That is, it has a content or, as the medieval philosophers called it, an “intentional object.” And, remarkably, that content or object need not exist in reality; for instance, through deceptive environment or malfunction, a visual edge detector may signal an edge that is not really there (see INTENTIONALITY).

A first skeptical question is this: Representation of a possibly inexistent object is a property not found in physics or chemistry—and it is a relational property, ostensibly a relation that its subject bears to an external thing, determined in part by factors external to the organism. Yet natural laws are causal, and some philosophers (notably Fodor 1987 and Searle 1983) have argued that because an entity’s causal powers are intrinsic to the entity itself, either representational properties cannot properly figure in psychological laws or there must be a kind of “narrow” representational content that is intrinsic to the subject and independent of environmental factors. (See INDIVIDUALISM. Against that view, see Burge 1986; McClamrock 1995; and Wilson 1995.)

A second set of questions concerns commonsense mental notions, such as those of believing, desiring, seeing, feeling pain, and the like (see PROPOSITIONAL ATTITUDES). Jerry Fodor (1975, 1981) argues that only slightly cleaned-up versions of those concepts have a home in scientific psychology and indeed will figure explicitly in laws; certainly some current psychological experiments make unabashed reference to subjects’ beliefs, desires, memories, etc.

However, Donald Davidson (1970, 1974) and Daniel Dennett (1978, 1987) contend that such commonsense concepts correspond to no NATURAL KINDS, not even within psychology. This is in part because they are ascribed to subjects on grounds that are in large part normative rather than

Fodor, J. A. (1987). Psychosemantics. Cambridge, MA: MIT Press.
McClamrock, R. (1995). Existential Cognition. Chicago: University of Chicago Press.
Searle, J. (1983). Intentionality. Cambridge: Cambridge University Press.
Wilson, R. A. (1995). Cartesian Psychology and Physical Minds. Cambridge: Cambridge University Press.

Further Readings

Fodor, J. A. (1974). Special sciences. Synthese 28: 77–115. (Reprinted in Fodor 1981.)
Kim, J. (1996). Philosophy of Mind. Boulder, CO: Westview Press.
Lycan, W. (1981). Psychological laws. Philosophical Topics 12: 9–38.

Psychology, History of

See INTRODUCTION: PSYCHOLOGY; EBBINGHAUS, HERMANN; HELMHOLTZ, HERMANN LUDWIG FERDINAND VON; WUNDT, WILHELM

Psychophysics

Psychophysics is the scientific study of relationships between physical stimuli and perceptual phenomena.
For empirical, and in part because (Davidson alleges) the “cet- example, in the case of vision, one can quantify the influ- eris paribus” clauses needed as qualifying such common- ence of the physical intensity of a spot of light on its detect- sense generalizations as “If a man wants to eat an acorn ability, or the influence of its wavelength on its perceived omelette, then he generally will if the opportunity exists and hue. Examples could as well be selected from the study of no other desire overrides” (1974: 45) are open-ended and AUDITION or other sensory systems. unverifiable in a way that such clauses are not when they In a typical psychophysical experiment, subjects are occur in biology or chemistry. For somewhat different rea- tested in an experimental environment intended to maximize sons, Paul Churchland (1989) agrees that commonsense the control of stimulus variations over variations in the sub- concepts correspond to nothing real in psychology or biol- ject’s responses. Stimuli are carefully controlled, often vary- ogy and that they will very probably be dropped from sci- ing along only a single physical dimension (e.g., intensity). ence altogether (see ELIMINATIVE MATERIALISM). The subject’s responses are highly constrained (e.g., “Yes, I See also ANOMALOUS MONISM; FOLK PSYCHOLOGY; PSY- see the stimulus,” or “No, I don’t see it”). Small numbers of CHOPHYSICS subjects are tested with extensive within-subject designs. —William Lycan Individual differences are small in the normal population. Experiments are routinely replicated and replicable across References laboratories. Theoretical treatments of the data consist of computa- Burge, T. (1986). Individualism and psychology. Philosophical tional or physiological models, which attempt to provide an Review 95: 3–45. account of the transformation between stimulus and percep- Churchland, P. M. (1989). A Neurocomputational Perspective. tion within the sensory neural system. 
The classic modeling Cambridge, MA: Bradford Books/MIT Press. approach is SIGNAL DETECTION THEORY. Many other exper- Cummins, R. (1983). The Nature of Psychological Explanation. imental uses and theoretical treatments are illustrated in the Cambridge, MA: MIT Press/Bradford Books. related entries listed below. Davidson, D. (1970). Mental events. In L. Foster and J. W. Swan- son, Eds., Experience and Theory. Amherst, MA: University of A detection threshold is the smallest amount of physical Massachusetts Press, pp. 79–101. energy required for the subject to detect a stimulus. The Davidson, D. (1974). Psychology as philosophy. In S. C. Brown, example of the intensity required to detect a spot of light Ed., Philosophy of Psychology. London: Macmillan, pp. 41–52. will be used throughout the following paragraphs. Dennett, D. C. (1978). Brainstorms. Montgomery, VT: Bradford There are a variety of formalized methods for measuring Books. detection thresholds (Gescheider 1997). In the method of Dennett, D. C. (1987). The Intentional Stance. Cambridge, MA: adjustment, the subject is given control of the intensity of MIT Press. the stimulus and asked to vary it until her perception reaches Fodor, J. A. (1975). The Language of Thought. Hassocks, England: some criterion value (e.g., the light is just barely visible). In Harvester Press. the method of constant stimuli, a set of intensities of the Fodor, J. A. (1981). RePresentations. Cambridge, MA: MIT Press. 692 Psychophysics light is preselected and presented many times in a random mechanisms that detect the stimuli also code the properties series. The result is a psychometric function, describing the required for identification. percent of a particular response (e.g., “Yes”) as a function of Much psychophysical research is guided and united by light intensity. In staircase or iterative methods, the stimu- sophisticated mathematical modeling. 
For example, spa- lus in each trial is chosen based on the accumulated data tiotemporal aspects of vision have been treated in extensive from previous trials, using a rule designed to optimize the theories centered around the concept of multiple, more or less efficiency of threshold estimation. independent processing mechanisms (e.g., Graham 1989; see When a series of trials is used, there are two major also SPATIAL PERCEPTION), and COLOR VISION has been simi- options concerning the subject’s task and responses. In Yes/ larly unified by models of the encoding and recoding of No techniques, the subject reports whether she did or did not wavelength and intensity information (e.g., Wandell 1995). see the stimulus in each trial. In forced-choice techniques, Models of psychophysical data are often heavily influ- the stimulus is presented in one of two spatial locations or enced by the known anatomy and physiology of sensory temporal intervals, and the subject’s task is to judge in systems, and advances in each field importantly influence which location or interval the stimulus occurred. the experiments done in the other. For example, the psy- Briefly, the method of adjustment has the advantage of chophysical trichromacy of color vision provided the first maximal efficiency if the effects of stimulus history and evidence for the presence of three and only three channels subject bias are small. The method of constant stimuli underlying color vision, and provided the impetus for a reduces the influence of stimulus history. Forced-choice century of anatomical, physical, physiological, and genetic techniques have the advantage of minimizing the influence as well as psychophysical attempts to identify the three of subject bias, and forced-choice iterative techniques often cone types. As a converse example, anatomically and phys- provide an optimal balance of efficiency and bias minimiza- iologically based models of parallel processing of color tion. 
and motion have importantly influenced the psychophysi- A discrimination threshold is the smallest physical dif- cal investigation of losses of motion perception for purely ference between two stimuli required for the subject to dis- chromatic (isoluminant) stimuli (see MOTION, PERCEPTION criminate between them. Measurement techniques are OF). Treatments are also available of the need for special analogous to those used for detection thresholds. bridge laws, or linking propositions, in arguments that In supra-threshold experiments, the subject views attempt to explain perceptual events on the basis of physio- readily visible stimuli and is asked to report the quantity or logical events (Teller 1984). quality of her own SENSATIONS. For example, in a scaling In sum, psychophysics underlies the accumulation of experiment, a subject could be shown lights of different knowledge in many parts of perception. It has many tools to intensities. The task would be to describe the perceived offer to cognitive scientists. Its empirical successes illustrate brightness of each stimulus by assigning a number to it. The the value of tight experimental control of stimuli and result is a description of how brightness grows with inten- responses. It provides experimental paradigms that can be sity. In a color-naming experiment, a subject could be generalized successfully to higher level perceptual and cog- shown lights of different wavelengths. The task would be to nitive problems (e.g., Palmer 1995). The extensive modeling describe the perceived hues using a constrained set of color in the field provides successes that might be worth emulat- names (e.g., red, yellow, green, and blue), and partitioning ing, and perhaps blind alleys that might be worth avoiding. the perceived hue among them (e.g., “10% Green, 90% Finally, and most importantly, in combination with direct Blue”). 
The result is a description of the variations in hue studies of the neural substrates of sensory processing, psy- across the spectrum. chophysics provides an important example of interdiscipli- Psychophysics is also marked by a variety of experimen- nary research efforts that illuminate the relationship tal paradigms with established theoretical interpretations between mind and brain. (Graham 1989; Wandell 1995). For example, in the summa- See also DEPTH PERCEPTION; HIGH-LEVEL VISION; LIGHT- tion-at-threshold paradigm, thresholds are measured for two NESS PERCEPTION; MID-LEVEL VISION; SURFACE PERCEP- component stimuli and for a compound made by superim- TION; TOP-DOWN PROCESSING IN VISION posing the two. The goal is to quantify the energy required —Davida Teller and John Palmer for detection of the compound stimulus with respect to the energy required for detection of each of the components. Outcomes vary widely, from facilitation to linear summa- References tion to independent detection to interference, and lead to Gescheider, G. A. (1997). Psychophysics: The Fundamentals. 3rd different theoretical conclusions concerning the degree and ed. Hillsdale, NJ: Erlbaum. form of interaction of the mechanisms detecting the two Graham, N. V. S. (1989). Visual Pattern Analysers. New York: components. Oxford. Similarly, selective adaptation paradigms examine the Palmer, J. (1995). Attention in visual search: Distinguishing four extent to which adaptation to one stimulus affects the causes of a set-size effect. Current Directions in Psychological threshold for another, and are used to argue for independent Science 4: 118–123. or nonindependent processing of the two stimuli. Identifica- Teller, D. Y. (1984). Linking propositions. Vision Research 24: tion/detection paradigms examine whether or not two differ- 1233–1246. ent stimuli can be identified (discriminated) at detection Wandell, B. A. (1995). Foundations of Vision Science. 
Sunderland, threshold, and are used to determine the extent to which the MA: Sinauer. Qualia 693 cepts, the best candidates being causal or functional Qualia concepts that have claim to being part of our commonsense understanding of mental states (see FUNCTIONALISM). Many The terms quale and qualia (pl.) are most commonly used to have doubted, however, that commonsense characterizations characterize the qualitative, experiential, or felt properties of could be necessary or sufficient to capture qualitative con- mental states. Some philosophers take qualia to be essential cepts (Block 1978). In response, some physicalists argue features of all conscious mental states; others only of SEN- that there are ways to broaden the scope of commonsense SATIONS and perceptions. In either case, qualia provide a characterization (Levin 1991), others that any knowledge particularly vexing example of the MIND-BODY PROBLEM, we gain uniquely from experience is merely a kind of prac- because it has been argued that their existence is incompati- tical knowledge—the acquisition of new imaginative or rec- ble with a physicalistic theory of the mind (see PHYSICAL- ognitional abilities, rather than concepts that one previously lacked (Nemirow 1990; Lewis 1990). Yet others suggest ISM). Three recent antiphysicalist arguments have been espe- that, despite appearances to the contrary, there is no deter- cially influential. The first claims that one can conceive of minate, coherent content to our qualitative concepts over the qualitative features of one’s pains or perceptions in the and above that which can be explicated by functional or absence of any specific physical or functional properties causal characterizations (Dennett 1991). (and vice versa), and that properties that can be so con- Another physicalist strategy is to reject thesis (b) and ceived must be distinct (Kripke 1980). 
The second argu- argue that the irreducibility of qualitative to physicalistic ment claims that one cannot know, even in principle, concepts does not entail the irreducibility of qualitative to WHAT-IT’S-LIKE to be in pain or see a color before actually physicalistic properties. Some have argued that there can be having these (or similar) experiences, and that no physical plausible non-Fregean, direct accounts of how qualitative or functional properties can afford this perspectival or sub- concepts denote physical states—on the model of INDEXI- jective knowledge (Nagel 1974; Jackson 1982). The third CALS AND DEMONSTRATIVES—that do not require the intro- states that no physical or functional characterization of duction of further irreducible properties (Loar 1990; Tye mental states can explain what it’s like to have them, and 1995). Others have argued that the lack of conceptual con- that such an EXPLANATORY GAP raises doubts about nection between qualitative and physicalistic concepts is not whether qualia can be identified with such properties unique, but occurs in many cases of successful intertheoretic (Levine 1983). They conclude that qualia cannot be (or, at reduction (Block and Stalnaker forthcoming). least, cannot be easily believed to be) identical with physi- It is also commonly thought that (sincere) beliefs cal or functional properties. about our own qualia have special authority (that is, nec- These arguments are linked in that each first premise essarily are for the most part true), and are also self-inti- assumes (a) that there is no conceptual connection between mating (that is, will necessarily produce, in individuals qualitative and physicalistic terms or concepts. Otherwise, it with adequate conceptual sophistication, accurate beliefs would be impossible to conceive (for example) of pain qua- about their nature). 
Insofar as they have these special lia existing apart from the relevant physical or functional epistemic features, qualia are importantly different from properties, and it would be possible to know all there is to physical properties such as shape, temperature, and know about pain without ever having experienced pain one- length, about which beliefs may be both fallible and self; it would also be easy to explain why it feels painful to uncompelled. Can they nonetheless be physical or func- have the associated physical or functional property. The sec- tional properties? ond premise also depends upon a common thesis, (b) that Functionalists can claim that qualia have these features given this lack of connection, the use of qualitative terms or as a matter of conceptual necessity, because according to (at concepts requires (or at least suggests) the existence of irre- least some versions of) this doctrine, states with qualitative ducibly qualitative properties. This thesis is supported, at properties and the beliefs they produce are interdefined least implicitly, by a theory of reference, deriving from Got- (Shoemaker 1990). Physicalists who deny such definitional tlob FREGE, that permits nonequivalent concepts to denote connections have argued instead that sufficient introspective the same item only by picking out different properties accuracy is insured by the proper operation of our cognitive (modes of presentation) of it; in this view, if pain is not faculties; thus, as a matter of law, we cannot be mistaken equivalent to any physical or functional concept, then even about (or fail to notice certain properties of) our mental if pain denotes the same property as some physicalistic con- states (Hill 1991). In such a view, these epistemic features cept, this can only be by introducing a mode of presentation of our mental states will be nomologically necessary, but not that is distinct from anything physical or functional. 
In addi- necessary in any stronger sense (see INTROSPECTION and tion, anti-physicalists have cited inductive evidence for this SELF-KNOWLEDGE). premise, namely, that in all other cases of intertheoretic There are many other interesting issues regarding qualia, reduction, there have been successful analyses, using terms among them whether qualia, if physical, are to be identified constructed from those of the reducing theory, of the con- with neural or narrow functional properties of the individual cepts of the theory to be reduced (Jackson 1993; Chalmers who has them (Block 1990), or whether, to have qualia, one 1996). must also be related to properties of objects in the external Physicalists, in turn, have attempted to deny both theses. world (Lycan 1996; Dretske 1995; cf. INTENTIONALITY). Yet Those denying (a) have argued that there is in fact a concep- another question is whether (and if so, how) the myriad qua- tual connection between qualitative and physicalistic con- lia we seem to experience at any given time are bound 694 Quantifiers together at a given moment, and are continuous with our ties of individuals, and we interpret the S as True in a situa- experiences at previous and subsequent times, or whether tion s if the individuals with those properties (in s) stand in the commonsense view that we enjoy a unity of conscious- the relation expressed by the quantifier. Different quantifiers ness, and a stream of consciousness, is rather an illusion to typically denote different relations. ALL (we write denota- be dispelled (Dennett 1991). tions in upper case) says that the individuals that have the noun property (CAT) are included in those with the predi- See also CONSCIOUSNESS; SELF cate property (GREY). SOME says that the individuals with —Janet Levin the CAT property overlap with those that are GREY; NO says there is no overlap. EXACTLY TEN says the overlap References has exactly ten members. 
MOST, in the sense of MORE THAN HALF, expresses a proportion: The overlap between Block, N. (1978). Troubles with Functionalism. In C. W. Savage, CAT and GREY is larger than that between CAT and NON- Ed., Perception and Cognition: Issues in the Foundations of GREY; that is, the number of grey cats is larger than the Psychology. Minneapolis: University of Minnesota Press. number of non-grey ones. LESS THAN HALF makes the Reprinted in N. Block, Ed., Readings in Philosophy and Psy- chology, vol. 1. Cambridge, MA: Harvard University Press, opposite claim. 1980. The role of the noun is crucial. Syntactically it forms an Block, N. (1990). Inverted Earth. In J. Tomberlin, Ed., Philosophi- NP constituent (all cats) with the quantifier. We interpret cal Perspectives, no. 4. Atascadero, CA: Ridgeview Publishing. this NP as a function, called a generalized quantifier, that Block, N., and R. Stalnaker. (Forthcoming). Conceptual analysis maps the predicate property to True or False (in a situation). and the explanatory gap. Philosophical Review. So we interpret All cats are grey by (ALL CAT)(GREY), Chalmers, D. (1996). The Conscious Mind: In Search of a Funda- where ALL CAT is a function mapping a property P to True mental Theory. Oxford: Oxford University Press. in a situation s if the cats in s are a subset of the objects with Dennett, D. (1991). Consciousness Explained. Boston: Little, P in s. More generally, Ss of the form [[Det+N]+Predicate] Brown. denote the truth value given by (D(N))(P), where P is the Dretske, F. (1995). Naturalizing the Mind. Cambridge, MA: MIT Press. property denoted by the predicate, N that denoted by the Hill, C. (1991). Sensations: A Defense of Type Materialism. New noun, D the denotation of the Det, and D(N) the denotation York: Cambridge University Press. of Det+noun NP. Jackson, F. (1982). Epiphenomenal qualia. Philosophical Quar- Semantically the noun property N serves to restrict the terly 32: 127–136. class of things we are quantifying over. 
The literature cap- Jackson, F. (1993). Armchair metaphysics. In J. O’Leary-Haw- tures this intuition with two very general constraints on pos- thorne and M. Michael, Eds., Philosophy in Mind. Dordrecht: sible Det denotations. These constraints limit both the Kluwer. logical expressive power of natural languages and the Kripke, S. (1980). Naming and Necessity. Cambridge, MA: Har- hypotheses children need consider in learning the meanings vard University Press. of the Dets in their language (Clark 1996; see SEMANTICS, Levin, J. (1991). Analytic functionalism and the reduction of phe- nomenal state. Philosophical Studies 61. ACQUISITION OF). Levine, J. (1983). Materialism and qualia: The explanatory gap. One condition is extensions (Van Benthem 1984), which Pacific Philosophical Quarterly 64 (4). says in effect that the truth of Det Ns are Ps cannot depend Lewis, D. (1990). What experience teaches. In W. G. Lycan, Ed., on which individuals are non-Ns. For example, scouring Old Mind and Cognition. Cambridge, MA: Blackwell. English texts, you will never stumble upon a Det blik that Loar, B. (1990). Phenomenal states. In J. Tomberlin, Ed., Philo- guarantees that Blik cats are grey is true if and only if the sophical Perspectives 4. Atascadero, CA: Ridgeview. number of non-cats that are grey is ten. Lycan, W. G. (1996). Consciousness and Experience. Cambridge, The second condition is conservativity (Keenan 1981; MA: MIT Press. Barwise and Cooper 1981; Higginbotham and May 1981; Nagel, T. (1974). What is it like to be a bat? Philosophical Review Keenan and Stavi 1986), which says that the truth of Det Ns 82: 435–456. Nemirow, L. (1990). Physicalism and the cognitive role of are Ps cannot depend on Ps that lack N. So Det Ns are Ps acquaintance. In W. G. Lycan, Ed., Mind and Cognition. Cam- must have the same truth value as Det Ns are Ns that are Ps. bridge, MA: Blackwell. For instance, Most cats are grey is equivalent to Most cats Shoemaker, S. (1990). First-person access. 
In J. Tomberlin, Ed., are cats that are grey. And most can be replaced by any Det, Philosophical Perspectives 4. Atascadero, CA: Ridgeview. including syntactically complex ones: most but not all, Tye, M. (1995). Ten Problems of Consciousness: A Representa- every child’s, or more male than female. Despite appear- tional Theory of the Phenomenal Mind. Cambridge, MA: MIT ances this semantic equivalence is not trivial. Keenan and Press. Stavi show that in a situation with only two individuals there are 65,536 logically possible Det denotations (functions Quantifiers from pairs of properties to truth values). Only 512 of them are conservative! Sentences (Ss) such as All cats are grey consist of a predi- The restricting role of the noun property distinguishes cate are grey and a noun phrase (NP) all cats, itself consist- natural languages from first-order LOGIC (FOL). FOL essen- ing of a noun cats and a determiner (Det), of which the tially limits its quantifiers to (∀x) every object and (∃x) quantifiers all, some, and no are special cases. Semantically some object and forces logical forms to vary considerably we treat both the noun and the predicate as denoting proper- and nonuniformly from the English Ss they represent. All Quantifiers 695 weren’t exactly ten cats/more cats than dogs in the yard is cats are grey becomes “For all objects x, if x is a cat then x is natural but becomes ungrammatical when exactly ten is grey”; Some cats are grey becomes “For some object x, x is replaced by most or all. a cat and x is grey.” Now proportionality quantifiers (most, Finally, we can isolate the purely “quantitative” or “logi- less than half, a third of the, ten percent of the) are inher- cal” Dets as those whose denotations are invariant under ently restricted (Keenan 1993): there is no Boolean com- permutations p of the domain of objects under discussion. 
pound S of cat(x) and grey(x) such that (for most x)S is True So they satisfy D(N)(P)) = D(pN)(pP), where p is a permu- if and only if the grey cats outnumber the non-grey ones. tation and p(A) is {p(x)|x ∈ A}. All, most but not all, just Indeed the proper proportionality Dets are not even defin- finitely many always denote permutation invariant functions, able in FOL (see Barwise and Cooper (1981) for most), but no student’s, every . . . but John, more male than female whence the logical expressive power of natural languages don’t. properly extends that of FOL. Cognitive and logical complexity increases with Ss built English presents subclasses of Dets of both semantic and from transitive verbs (P2s) and two NPs. For example Some syntactic interest. We note two such: First, simplex (= single editor reads every manuscript has two interpretations: One, word) Dets satisfy stronger conditions than conservativity there is at least one editor who has the property that he reads and extension. We say that an NP X is increasing (↑) if and every manuscript; and two, for each manuscript there is at only if X is a P (or X are Ps) and all Ps are Qs entails that X is a Q. Proper names are ↑: If all cats are grey and Felix is a least one editor who reads it (possibly different editors read cat, then Felix is grey. An NP of the form [Det+N] is ↑ different manuscripts). Cognitively, language acquisition studies (Lee 1986; Philip 1992) support that children when Det is every, some, both, most, more than half, at least acquire adult-level competence with such Ss years after they ten, infinitely many, or a possessive Det like some boy’s are competent on Ss built from one-place predicates (P1s). whose possessor NP (some boy) is itself increasing. X is And mathematically, whether a sentence is logically true is decreasing (↓) if all Ps are Qs and X is a Q entails X is a P. 
[Det+N] is ↓ when Det is no, neither, fewer than ten, less mechanically decidable in first-order languages with just than half, not more than five, at most five or NP’s, for NP ↓. P1s but loses this property once a single P2 is added (Boolos and Jeffrey 1989). X is monotonic if it is either increasing or decreasing. But some Ss built from P2s and two NPs cannot be ade- [Det+N] is non-monotonic if Det equals exactly five, quately represented by iterated application of generalized between five and ten, all but ten, more male than female, at quantifiers (Keenan 1987, 1992; van Benthem 1989): Dif- least two and not more than ten. Simplex Dets build mono- tonic NPs (usually increasing), a very proper subset of the ferent people like different things; No two students answered NPs of English. exactly the same questions; John criticized Bill but no one Syntactically note that ↓ NPs license negative polarity else criticized anyone else. Adequate intrepretations treat items in the predicate, ↑ ones do not (Ladusaw 1983). Thus the pair of NPs in each S as a function mapping the binary relation denoted by the P2 to a truth value. ever is natural in No/Fewer than five students here have ever Lastly, quantification can also be expressed outside of been to Pinsk but not in Some/More than five students here Dets and NPs: Students rarely/never/always/often/usually have ever been to Pinsk. Often, as here, grammatical prop- take naps after lunch (Lewis 1975; Heim 1982; de Swart erties of NPs are determined by their Dets. 1996). Bach et al. 1995 contains several articles discussing Second, many English Dets are intersective, in that we languages in which non-Det quantification is prominent: determine the truth of Det Ns are Ps just by checking which Eskimo (Bittner), Mayali (Evans), or possibly even absent: Ns are Ps, ignoring Ns that aren’t Ps. Most intersective Dets Straits Salish (Jelinek) and Asurini do Trocará (Vieira). 
are cardinal, in that they just depend on how many Ns are Recent overviews of Det type quantification are Keenan Ps. Some is cardinal: whether Some Ns are Ps is decided (1996) and the more technical Keenan and Westerståhl just by checking that the number of Ns that are Ps is greater (1997). than 0. Some other cardinal Dets are no, (not) more than See also AMBIGUITY; LOGICAL FORM IN LINGUISTICS; ten, fewer than/exactly/at most ten, between five and ten, about twenty, infinitely many and just finitely many. No . . . LOGICAL FORM, ORIGINS OF; POSSIBLE WORLDS SEMANTICS but John (as in no student but John) is intersective but not —Edward L. Keenan cardinal. All and most are not intersective: if we are just given the set of Ns that are Ps we cannot decide if all or most Ns are Ps. References Intersectivity applies to two-place Dets like more . . . Bach, E., E. Jelinek, A. Kratzer, and B. Partee, Eds. (1995). Quan- than . . . that combine with two Ns to form an NP, as in tification in Natural Languages. Dordrecht: Kluwer. More boys than girls were drafted. It is intersective in that Barwise, J., and R. Cooper. (1981). Generalized quantifiers and the truth of the S is determined once we are given the inter- natural language. Linguistics and Philosophy 4: 159–219. section of the predicate property with each of the noun Boolos, G., and R. Jeffrey. (1989). Computability and Logic. 3rd properties. Other such Dets are fewer . . . than . . . , exactly ed. New York: Cambridge University Press. as many . . . as . . . , more than twice as many . . . as . . . , the Clark, R. (1996). Learning first order quantifier denotations, an same number of . . . as . . . . In general these Dets are also essay in semantic learnability. Technical Reports I.R.C.S.–96– not first-order definable (even on finite domains). Moreover, 19, University of Pennsylvania. of syntactic interest, it is the intersective Dets that build NPs de Swart, H. (1996). Quantification over time. In J. 
van der Does that occur naturally in existential There contexts: There and J. van Eijck 1996, pp. 311–336. 696 Radical Interpretation Gärdenfors, P., Ed. (1987). Generalized Quantifier: Linguistic and ter Meulen, Eds., Generalized Quantifiers. Dordrecht: Foris, Logical Approaches. Dordrecht: Reidel. pp. 73–124. Heim, I. R. (1982). The Semantics of Definite and Indefinite Lindström, P. (1966). First order predicate logic with generalized Noun Phrases. Ph.D. diss., University of Massachusetts, quantifiers. Theoria 35: 186–195. Amherst. Montague, R. (1969/1974). English as a formal language. In R. Higginbotham, J., and R. May. (1981). Questions, quantifiers and Thomason, Ed., Formal Philosophy. New Haven, CT: Yale Uni- crossing. The Linguistic Review 1: 41–79. versity Press. Keenan, E. L. (1981). A boolean approach to semantics. In J. Partee, B. H. (1995). Quantificational structures and composition- Groenendijk et al., Eds., Formal Methods in the Study of Lan- ality. In Bach et al. 1995, pp. 541–560. guage. Amsterdam: Math Centre, pp. 343–379. Reuland, E., and A. ter Meulen. (1987). The Representation of Keenan, E. L. (1987). Unreducible n-ary quantifiers in natural lan- (In)definiteness. Cambridge, MA: MIT Press. guage. In P. Gärdenfors (1987). van Benthem, J. (1986). Essays in Logical Semantics. Dordrecht: Keenan, E. L. (1992). Beyond the Frege boundary. Linguistics and Reidel. Philosophy 15: 199–221. (Augmented and reprinted in van der Westerståhl, D. (1989). Quantifiers in formal and natural lan- Does and van Eijck 1996). guages. In D. Gabbay and F. Guenthner, Eds., Handbook of Keenan, E. L. (1993). Natural language, sortal reducibility, and Philosophical Logic, vol. 4. Dordrecht: Reidel, pp. 1–131. generalized quantifiers. J. Symbolic Logic 58: 314–325. Keenan, E. L. (1996). The semantics of determiners. In S. Lappin, Radical Interpretation Ed., The Handbook of Contemporary Semantic Theory. Oxford: Blackwell, pp. 41–63. Keenan, E. L., and J. Stavi. (1986). 
A semantic characterization of Radical interpretation is interpretation from scratch—as natural language determiners. Linguistics and Philosophy 9: would be necessary if we found ourselves in a community 253–326. whose language and way of life were completely alien to us, Keenan, E. L., and D. Westerståhl. (1997). Generalized quantifiers without bilingual guides, dictionaries, grammars, or previ- in linguistics and logic. In J. van Benthem and A. ter Meulen ous knowledge of the culture. Recent interest in radical (1997), pp. 837–893. interpretation focuses on two related but significantly differ- Ladusaw, W. (1983). Logical form and conditions on grammatical- ent sets of investigations, the first initiated by Quine (1960), ity. Linguistics and Philosophy 6: 389–422. the second by Davidson (“Truth and Meaning,” 1967, col- Lee, T. (1986). Acquisition of quantificational scope in Mandarin Chinese. Papers and Reports on Child Language Development lected with other relevant papers in his 1984). It will be con- No. 25. Stanford University. venient to start with Davidson’s work which, though later, is Lewis, D. (1975). Adverbs of quantification. In E. L. Keenan, Ed., the more immediately intelligible (as will be seen when Formal Semantics of Natural Language. Cambridge: Cam- Quine’s views are presented). bridge University Press, pp. 3–15. If we knew what the foreigners meant by their sentences, Philip, W. (1992). Distributivity and logical form in the emergence we could discover their beliefs and desires; if we knew their of universal quantification. In Proceedings of the Second Con- beliefs and desires, we could discover what they meant. ference on Semantics and Linguistic Theory. Ohio State Univer- Knowing neither, “we must somehow deliver simulta- sity Dept. of Linguistics, pp. 327–345. neously a theory of belief and a theory of meaning” (David- Szabolcsi, A., Ed. (1997). Ways of Scope Taking. Dordrecht: Klu- son 1984: 144). 
Davidson offers suggestions about both the wer. Van Benthem, J. (1984). Questions about quantifiers. J. Symbolic form of a theory of meaning and how to make sure we have Logic 49: 443–466. the correct one for a given language. Van Benthem, J. (1989). Polyadic quantifiers. Linguistics and Phi- It seems reasonable to require a theory of meaning for losophy 12: 437–465. English to tell us things like “‘Snow is white’ means (in Van Benthem, J., and A. ter Meulen, Eds. (1997). The Handbook of English) that snow is white.” To do so, it must presumably Logic and Language. Amsterdam: Elsevier. explain how the meanings of whole sentences depend on the Van der Does, J., and J. van Eijck. (1996). Quantifiers, logic, and meanings of their parts. Davidson’s suggestion is that this language. CSLI Lecture Notes. Stanford. can be done by a theory of truth for the given language, of broadly the type proposed by Tarski (1936). Tarski aimed to Further Readings explain what it was for the sentences of a language to be true Beghelli, F. (1992). Comparative quantifiers. In P. Dekker and M. without introducing into the explanation either the notion of Stokhof, Eds., Proceedings of the Eighth Amsterdam Collo- truth itself or other problematic notions such as reference quium. ILLC, University of Amsterdam. (see SENSE AND REFERENCE; REFERENCE, THEORIES OF). To Cooper, R. (1983). Quantification and Syntactic Theory. Dor- get some idea of his approach, which in detail is technical, drecht: Reidel. notice that if we knew all the basic predicates of a given lan- Gil, D. (1995). Universal quantifiers and distributivity. In E. Bach guage, we could explain what it was for them to be true of et al. 1995, pp. 321–362. things by listing “axioms,” one for each basic predicate, on Kanazawa, M., and C. Pion, Eds. (1994). Dynamics, Polarity and the lines of “‘Snow’ is true of x (in English) if and only if x Quantification. Stanford, CA: CSLI Publications, pp. 119–145. 
is snow.” To ensure that such a theory genuinely explained Kanazawa, M., C. Pion, and H. de Swart, Eds. (1996). Quantifiers, Deduction and Context. Stanford, CA: CSLI Publications. what it is for the sentences of the given language to be true, Keenan, E. L., and L. Faltz. (1985). Boolean Semantics for Natural Tarski required it to entail all sentences such as: Language. Dordrecht: Reidel. (1) “Snow is white” is true-in-English if, and only if, snow Keenan, E. L., and L. Moss. (1985). Generalized quantifiers and is white. the expressive power of natural language. In van Benthem and Radical Interpretation 697 These are the famous T-sentences. In contrast to Tarski, not that we cannot know we have hit on the right interpre- Davidson assumes we start with an adequate understanding tation, but that there is nothing to be right or wrong about of truth: “Our outlook inverts Tarski’s: we want to achieve (Quine 1960: 73, 221; 1969: 30, 47). Nor will bilinguals an understanding of meaning or translation by assuming a help, since he thinks their translations are as much subject prior grasp of the concept of truth” (1984: 150). to the indeterminacy as others. Nor, finally, is he laboring He points out that the T-sentences may be regarded as the familiar point that in translation between relatively giving the meanings of the sentences named on the left-hand remote languages and cultures, nonequivalent sentences of side—provided the theory of truth satisfies sufficiently one language will often do equally well as rough transla- strong constraints. One powerful constraint is the circum- tions of a sentence of the other. This is made clear by his stances in which the foreigners hold true the sentences of application of the indeterminacy thesis to the “translation” their language. He invokes a “principle of charity”: optimize of sentences within one and the same language. 
We agreement with the foreigners unless disagreements are assume that for each sentence of our shared language, explicable on the basis of known error. what you mean by it is also what I mean by it. Quine One acknowledged difficulty is that Tarski’s theory as it thinks that if I were perverse and ingenious I could “scorn” stands applies only to formalized languages, which lack that scheme and devise an alternative that would attribute many important features such as INDEXICALS AND DEMON- to you “unimagined views” while still fitting all the rele- STRATIVES, proper names, and indirect speech (see David- vant objective facts. son 1984 and essays in LePore 1986). Quine’s indeterminacy thesis is highly contentious (see Davidson believes that although there will be some inde- Kirk 1986 for discussions). But if correct it has profound terminacy of interpretation, it will be superficial: just a mat- implications for psychology and philosophy of mind. If ter of stating the facts in alternative ways, as with beliefs and desires were matters of objective fact, those facts Fahrenheit and Celsius scales. In this and other respects, his would settle significant differences over translation. So if it position contrasts strongly with Quine’s, which has also is a mistake to think translation is determinate, it is also a proved more liable to be misunderstood. mistake to think our beliefs and desires are matters of fact. Suppose the radical interpreter has worked out a transla- Quine and others (e.g., Stich 1983) regard his arguments as tion manual that, given a foreign sentence, enables a transla- undermining intentional psychology and supporting ELIMI- tion to be constructed. Quine maintains we could devise a NATIVE MATERIALISM. rival manual so different that it would reject “countless” See also INTENTIONALITY; MEANING; SEMANTICS translations offered by the first. 
Both manuals would fit all the objective facts, yet competing versions of the same for- —Robert Kirk eign sentence would not even be loosely equivalent (Quine References 1960, 1969, 1990). Here is a famous line of supporting argument. Imagine Davidson, D. (1984). Inquiries into Truth and Interpretation. New the foreigners use the one-word sentence “Gavagai” in sit- York: Oxford University Press. uations where English speakers would use “Rabbit.” Davidson, D., and J. Hintikka, Eds. (1969). Words and Objections. Quine suggests that this does not establish that the native Dordrecht: Reidel. term “gavagai” refers to rabbits. It might refer instead to Kirk, R. (1986). Translation Determined. Oxford: Clarendon such radically different items as undetached rabbit-parts, Press. or rabbit-phases, or rabbithood. Nor could we rule out any LePore, E., Ed. (1986). Truth and Interpretation: Perspectives on of these alternatives by pointing, or by staging experi- the Philosophy of Donald Davidson. Oxford: Blackwell. Quine, W. V. (1960). Word and Object. Cambridge, MA and New ments, since those operations would depend on untestable York: MIT Press/Wiley. assumptions. Such “inscrutability of reference” appears to Quine, W. V. (1969). Ontological Relativity and Other Essays. involve indeterminacy of sentence translation (Quine New York: Columbia University Press. 1960, 1969). Quine, W. V. (1990). Pursuit of Truth. Cambridge, MA: Harvard All this is part of Quine’s wider campaign against our University Press. tendency to think of MEANING and synonymy as matters of Stich, S. (1983). From Folk Psychology to Cognitive Science: The fact. He argues that the notion of meaning is not genuinely Case Against Belief. Cambridge, MA: MIT Press. explanatory. It makes us think we can explain behavior, but Tarski, A. (1936). The concept of truth in formalized languages. In it is a sham. It fails to mesh in with matters of fact in an A. Tarski (1956), Logic, Semantics, Metamathematics. 
Oxford: explanatorily useful way. The indeterminacy thesis, if true, Clarendon Press, pp. 152–278 (translated from the German of 1936). would powerfully support that contention. Quine’s thesis is easily confused with others. It is not, Further Readings for example, an instance of the truism that there is no limit to the number of different ways to extrapolate beyond Dummett, M. (1978). The significance of Quine’s indeterminacy finite data. He claims that two schemes of translation thesis. In M. Dummett, Truth and Other Enigmas. London: could disagree with one another even when both fitted not Duckworth, pp. 375–419 (reprinted from 1973). just actual verbal and other behavior, but the totality of rel- Fodor, J., and E. LePore (1992). Holism: A Shopper’s Guide. evant facts. The indeterminacy “withstands even . . . the Oxford: Blackwell. whole truth about nature” (Davidson and Hintikka 1969: Hahn, L. E., and P. A. Schilpp, Eds. (1986). The Philosophy of W. 303). Nor is it any ordinary sort of scepticism. The idea is V. Quine. La Salle, IL: Open Court. 698 Rational Agency development of logic in this century would then cease even to Hookway, C. (1988). Quine: Language, Experience and Reality. Stanford, CA: Stanford University Press. be intelligible). For a COMPUTATIONAL THEORY OF MIND, Kripke, S. (1982). Wittgenstein on Rules and Private Language. where the agent’s deductive competence must be represented Oxford: Blackwell. as a finite algorithm, the ideal agent would in fact have to vio- Lewis, D. (1974). Radical interpretation. Synthese 27: 331–344. late Church’s undecidability theorem for first-order logic Massey, G. (1992). The indeterminacy of translation: A study in (Cherniak 1986). philosophical exegesis. Philosophical Topics 20: 317–345. The agent-idealizations—within the limits of their appli- Putnam, H. (1981). Reason, Truth and History. Cambridge: Cam- cability—of course have served very successfully as simpli- bridge University Press. 
fied approximations in economic, game, and decision Quine, W. V. (1981). Theories and Things. Cambridge, MA: Har- theory. Nonetheless, a sense of their psychological unreality vard University Press. has motivated two types of subsequent theorizing. One type Tarski, A. (1949). The semantic conception of truth. In H. Feigl and W. Sellars, Eds., Readings in Philosophical Analysis. New reinforces an eliminativist impulse, that the whole frame- York: Appleton Century Crofts, pp. 52–84. work of intentional psychology—with rationality at its core —ought to be cleared away as prescientific pseudotheory (see ELIMINATIVE MATERIALISM); a related response is a Rational Agency quasi-eliminativist instrumentalism (e.g., Dennett 1978), where the agent’s cognitive system and its rationality dimin- In philosophy of mind, rationality is conceived of as a ish to no more than convenient (but impossible) fictions of coherence requirement on personal identity: roughly, “No the theoretician that may help in predicting agent behavior, rationality, no agent.” The agent must have a means-ends but cannot be psychologically real. Ultimately, a sense of competence to fit its actions or decisions, according to its the unreality of ideal agent models can spur doubts about beliefs or knowledge-representation, to its desires or goal- the very possibility of a cognitive science. structure. That agents possess such rationality is more than The other type of response to troubles with the idealiza- an empirical hypothesis; for instance, as a putative set of tions is a via media strategy. After recognizing that nothing beliefs, desires, and decisions accumulated inconsistencies, could count as an agent or person that satisfied no rational- the set would cease even to qualify as containing beliefs, ity constraints, one stops to wonder whether one must jump etc., and disintegrate into a mere set of sentences. This to a conclusion that the agent has to be ideally rational. 
Is agent-constitutive rationality is distinguished from more rationality all or nothing, or is there some golden mean stringent normative rationality standards, for agents can and between unattainable, perfect unity of mind and utter, cha- often do possess cognitive systems that fall short of otic disintegration of personhood? The normative and epistemic uncriticizability (e.g., with respect to perfect con- empirical rationality models of Simon (1982) are among the sistency) without thereby ceasing to constitute agents. earliest of this less stringent sort: the central principle is Standard philosophical conceptions of rationality derive that, rather than optimizing or maximizing, the agent only from models of the rational agent in microeconomic, game, “satisfices” its expected utility, choosing decisions that are and decision theory earlier this century (e.g., Von Neumann good enough according to its belief-desire set, rather than and Morgenstern 1944; Hempel 1965). The underlying ide- perfect. Such modest coherence realistically is all that an alization is that the agent, given its belief-desire system, agent ought to attempt, and all that can in general be optimizes its choices. While this optimization model was expected. What amounts to a corresponding account for proposed as either a normative standard or an empirically agent-constitutive rationality appears in Cherniak (1981), predictive account (or both), the philosophical model con- with a requirement of minimal, rather than ideal, charity on cerns the idea that we cannot even make sense of agents that making sense of an agent’s actions. An even more latitudi- depart from such optimality. Related ideal-agent concepts narian conception can be found in Stich (1990). Related can be discerned in principles of charity for RADICAL INTER- limited-resource models are now also employed in artificial PRETATION of human behavior of W. V. Quine (1960) and of intelligence (see BOUNDED RATIONALITY). 
Donald Davidson (1980), and in standard epistemic logic Moderate rationality conceptions leave room for the above- (Hintikka 1962). To accomplish this perfection of appropri- mentioned widely observed phenomena of suboptimal human ate decisions in turn would require vast inferential insight: reasoning, rather than excluding them as unintelligible behav- for example, the ideal agent must possess a deductive com- ior. We are, after all, only human. Indeed, these more psycho- petence that includes a capacity to identify and eliminate logically realistic models can explain the departures from any and all inconsistencies arising in its cognitive system. correctness as symptoms of our having to use more efficient While such LOGICAL OMNISCIENCE might appropriately but formally imperfect “quick but dirty” heuristic procedures. characterize a deity, prima facie it seems at odds with the most Formally correct and complete inference procedures are typi- basic law of human psychology, that we are finite entities. A cally computationally complex, with surprisingly small-sized wide range of experimental studies since the 1970s indicate problem instances sometimes requiring vastly unfeasible time interesting and persistent patterns of our departures from ideal and memory resources. (To an extent, this practical intractabil- logician (Tversky and Kahneman 1974), for instance in har- ity parallels, and extends, classical absolute unsolvability; see boring inconsistent preferences. A more extreme departure GÖDEL’S THEOREMS.) Antinomies like Russell’s paradox lurk- from reality is that for such ideal agents, major portions of the ing at the core of our conceptual scheme can then be inter- deductive sciences would be trivial (e.g., the role of the dis- preted similarly as signs of our having to use heuristic covery of the semantic and set-theoretic paradoxes in the procedures to avoid computational paralysis. 
Rational Choice Theory 699 To conclude, some vigilance about unwarranted reifica- Eells 1982; Gauthier 1988/89; Gibbard and Harper 1985; tion of cognitive architecture remains advisable. Just as Kavka 1983; Lewis 1985; McClennen 1989; Nozick 1969; attention has turned to evaluation of uncritical idealizing, Rosenthal 1982) whose interpretation and resolution call for scope continues for scrutiny of tacit assumptions in rational- the return of the repressed: an explicit psychology of DECI- ity models about psychologically realistic representational SION MAKING and a full-blown theory of mind. No wonder format (if any)—for example, the discussions reviewed more and more cognitive scientists today (philosophers, above tend to presuppose agents as sentence-processors, artificial intelligence specialists, psychologists) participate, rather than as, say, quasi-picture processors. Finally, the along with economists and game theorists, in the debates familiar uneasy coexistence of the intentional framework— about RCT. having rationality at its core—with the scientific worldview It is ironic that Savage’s expected utility theory, in which is worth recalling. Yet probably much of the groundplan of most economists see the perfect embodiment of instrumen- our species’ model of an agent is innate (see AUTISM and tal rationality, is a set of axioms, admittedly purely syntactic THEORY OF MIND); the framework therefore may be a ladder in nature, that constrain the rational agent’s ends for the we cannot kick away. It is as if the scientific worldview can sake of consistency (see RATIONAL DECISION MAKING). For comfortably proceed neither with, nor without, an inten- instance, her preferences must be transitive: if she prefers x tional-cognitive paradigm. to y and y to z, she must prefer x to z. 
If, no matter the state of the world, she prefers x to y, she must prefer x to y even See also ANOMALOUS MONISM; COMPUTATIONAL COM- in the ignorance of the state of the world (sure-thing princi- PLEXITY; FOLK PSYCHOLOGY; INTENTIONAL STANCE ple). Savage proves that an agent whose preferences satisfy —Christopher Cherniak all the axioms of the theory chooses as if she were maximiz- ing her expected utility while assigning subjective probabil- References ities to the states of the world. It is not at all that her choices can be explained by her setting out to maximize her utility, Cherniak, C. (1981). Minimal rationality. Mind 90: 161–183. because it is tautological, by construction, that the utility of Cherniak, C. (1986). Minimal Rationality. Cambridge, MA: MIT Press. x is larger to her than that of y if she chooses x over y. The Davidson, D. (1980). Psychology as philosophy. In D. Davidson, claim is that agents whose preferences were not consistent Essays on Actions and Events. New York: Oxford University (i.e., violated the axioms) could not achieve the maximal Press. satisfaction of their ends. Dennett, D. (1978). Intentional systems. In Brainstorms. Cam- This removal of all psychological content and motiva- bridge, MA: MIT Press. tional assumptions from the theory of utility is untenable. Hempel, C. (1965). Aspects of scientific explanation. In Aspects of Consider the obvious possibility that preferences may Scientific Explanation. New York: Free Press. change over time. Which of one’s preferences should be Hintikka, J. (1962). Knowledge and Belief. Ithaca, NY: Cornell subjected to the coherence constraints set by the theory? University Press. Only the occurrent ones, because future preferences are not Quine, W. (1960). Word and Object. Cambridge, MA: MIT Press. Simon, H. (1982). Models of Bounded Rationality, vol. 2. Cam- motivationally efficacious now? Should we rather postulate bridge, MA: MIT Press. 
second-order preferences that weigh future versus occurrent Stich, S. (1990). The Fragmentation of Reason. Cambridge, MA: first-order preferences? Or are there (noninstrumental) MIT Press. external reasons that will do the weighing? Dispensing with Tversky, A., and D. Kahneman. (1974). Judgment under uncer- a theory of mind proves impossible (Hampton 1998; Hollis tainty: Heuristics and biases. Science 185: 1124–1131. and Sugden 1993). Von Neumann, J., and O. Morgenstern. (1944). Theory of Games According to RCT, an act is an assignment of conse- and Economic Behavior. New York: Wiley quences to states of the world, and the description of a con- sequence must include no reference to how that Rational Choice Theory consequence was brought about. The only legitimate moti- vations are forward-looking reasons: only the future mat- The theory of rational choice was developed within the dis- ters. Using an equipment just because one has invested a lot cipline of economics by JOHN VON NEUMANN and Oskar in it is taken to be irrational (“sunk cost fallacy”; see Nozick Morgenstern (1947) and Leonard Savage (1954). Although 1993). Experiments in cognitive psychology reveal that its roots date back as far as Thomas Hobbes’s denial that most of us commit that alleged fallacy most of the time, reason can fix our ends or desires (instrumental rationality), proving that we care about the consistency between past and and David HUME’s relegation of reason to the role of “slave present, maybe for the sake of personal identity (we violate of the passions,” having no motivating force, via the utilitar- as well Savage’s axioms, especially the sure-thing principle; ians’ definition of rationality as the maximization of “util- see Shafir and Tversky 1993; cf. also JUDGMENT HEURIS- ity” and the neoclassical school of economics’ theory of TICS). 
Does that mean that we are irrational, or just that our revealed preferences, rational choice theory (RCT) purports mind works differently from what RCT, in spite of its pro- to be neutral relative to all forms of psychological assump- claimed neutrality, presupposes? tions or philosophies of mind. In this respect, its relevance When RCT is applied to a strategic setting, leading to for the cognitive sciences is problematic. However, its most GAME THEORY, some of its implications are plainly para- recent developments have been marked by the discovery of doxical. In an ideal world where all agents are rational, this paradoxes (Binmore 1987a, b; Campbell and Sowdon 1985; fact being common knowledge (everyone knows it, knows 700 Rational Choice Theory that everyone knows it, etc.), rational behavior may be quite Gauthier, D. (1986). Morals by Agreement. Oxford: Oxford Uni- versity Press. unreasonable: the agents are unable to cooperate in a finitely Gauthier, D. (1988/89). In the neighbourhood of the Newcomb- repeated prisoner's dilemma setting (Kreps and Wilson Predictor (Reflections on Rationality). Proceedings of the Aris- 1982); they don’t make good on their promises when it goes totelian Society 89, part 3. against their current interest (assurance game; see Bratman Gibbard, A., and W. Harper. (1985). Counterfactuals and two kinds 1992); their threats are not credible (chain-store paradox; of expected utility. In R. Campbell and L. Sowden, Eds., Para- Selten 1978); trust proves impossible (centipede game and doxes of Rationality and Cooperation: Prisoner’s Dilemma and backward induction paradox; Reny 1992; Pettit and Sugden Newcomb’s Problem. Vancouver: University of British Colum- 1989), etc. A remarkable feature is that a small departure bia Press. pp. 133–158. Originally published in Hooker, Leach, from complete transparency is enough to bring back the and McClennen, Eds. Foundations and Applications of Deci- rational close to the reasonable. 
Imperfect or BOUNDED sion Theory vol. 1. Dordrecht: Reidel, 1978, pp. 125–162. Hampton, J. (1997). The Authority of Reasons. Cambridge, MA: RATIONALITY would be that which keeps the social world Cambridge University Press. moving. Hobbes, T. (1651). Leviathan. Cambridge: Cambridge University Philosophers have recently taken up these paradoxes. Press (1991). Although diverging, their conclusions make it clear that Hollis, M., and Sugden, R. (1993). Rationality in action. Mind 102, there is no way out without completing or amending RCT 405: 1–35. with theories of, among others, rational planning and inten- Hume, D. (1740). A Treatise of Human Nature. Oxford: Oxford tion-formation, belief revision, counterfactual and probabi- University Press (1978). listic reasoning in strategic settings, and even temporality Kavka, G. (1983). The toxin puzzle. Analysis 43: 1. and self-deception (Dupuy 1998). Some authors think it Kreps, D. M., and R. Wilson. (1982). Reputation and imperfect possible to ground a form of Kantian rationalism in such an information. Journal of Economic Theory 27: 253–279. Lewis, D. K. (1985). Prisoner's dilemma is a Newcomb problem. expanded or revised RCT, so that to choose rationally In R. Campbell and L. Sowden, Eds., Paradoxes of Rationality entails that one chooses morally (Gauthier 1986). and Cooperation: Prisoner’s Dilemma and Newcomb’s Prob- Take as an example the assurance game. A mutually ben- lem. Vancouver: University of British Columbia Press, pp. 251– eficial exchange is possible between you and me, but you 255. Originally published in Philosophy and Public Affairs have to take the first step and I will then decide whether I 8(3): 235–240. reciprocate or not. Is my proclaimed intention that I will McClennen, E. (1989). Rationality and Dynamic Choice: Founda- reciprocate a good enough assurance for you to engage in tional Explorations. Cambridge: Cambridge University Press. the deal, and can I rationally form this intention? 
Forming it has positive autonomous effects for me, independent of my carrying it out (it will provide an incentive for you to cooperate), and no cost. If it were an act of the will, it would be rational for me to form it, and we might be tempted to conclude that it would also be rational to execute it. However, some authors contend, one cannot will oneself to form an intention any more than a belief, and it is impossible to form the intention to do X if one knows that when the time comes it will be irrational to do so (Kavka 1983). Others maintain that it is possible to be "resolute" in this case, and rational not only to form the intention but to make good on it (McClennen 1989). Only a full-blown theory of the mind can adjudicate between these two positions.

See also ECONOMICS AND COGNITIVE SCIENCE

—Jean-Pierre Dupuy

References

Binmore, K. (1987a). Modeling rational players: part 1. Economics and Philosophy 3: 9–55.
Binmore, K. (1987b). Modeling rational players: part 2. Economics and Philosophy 4: 179–214.
Bratman, M. (1992). Planning and the stability of intention. Minds and Machines 2: 1–16.
Campbell, R., and L. Sowden, Eds. (1985). Paradoxes of Rationality and Cooperation: Prisoner's Dilemma and Newcomb’s Problem. Vancouver: University of British Columbia Press.
Dupuy, J.-P. (1998). Rationality and self-deception. In J.-P. Dupuy, Ed., Self-Deception and Paradoxes of Rationality. Stanford: CSLI Publications, 113–150.
Eells, E. (1982). Rational Decision and Causality. Cambridge: Cambridge University Press.
Nozick, R. (1969). Newcomb's problem and two principles of choice. In N. Rescher, Ed., Essays in Honor of Carl G. Hempel. Dordrecht: Reidel, pp. 114–146.
Nozick, R. (1993). The Nature of Rationality. Princeton: Princeton University Press.
Pettit, P., and R. Sugden. (1989). The backward induction paradox. Journal of Philosophy 86: 169–182.
Reny, P. J. (1992). Rationality in extensive-form games. Journal of Economic Perspectives 6: 103–118.
Rosenthal, R. (1982). Games of perfect information, predatory pricing, and the chain store paradox. Journal of Economic Theory 25: 92–100.
Savage, L. (1954). The Foundations of Statistics. New York: Wiley.
Selten, R. (1978). The chain store paradox. Theory and Decision 9: 127–159.
Shafir, E., and A. Tversky. (1993). Thinking through uncertainty: Nonconsequential reasoning and choice. Cognitive Psychology.
Von Neumann, J., and O. Morgenstern. (1947). Theory of Games and Economic Behavior. 2nd ed. Princeton: Princeton University Press.

Further Readings

Bratman, M. (1987). Intentions, Plans and Practical Reason. Cambridge, MA: Harvard University Press.
Davidson, D. (1980). Essays on Actions and Events. Oxford: Clarendon Press.
Davidson, D. (1982). Paradoxes of irrationality. In R. Wollheim, and J. Hopkins, Eds., Philosophical Essays on Freud. Cambridge: Cambridge University Press.
Dupuy, J.-P. (1992). Two temporalities, two rationalities: A new look at Newcomb's paradox. In P. Bourgine and B. Walliser, Eds., Economics and Cognitive Science. Pergamon Press.
Elster, J. (1979). Ulysses and the Sirens. Cambridge: Cambridge University Press.
Elster, J. (1986). The Multiple Self. Cambridge: Cambridge University Press.
Fischer, J. M., Ed. (1989). God, Foreknowledge, and Freedom. Stanford: Stanford University Press.
Frankfurt, H. (1971). Freedom of the will and the concept of a person. Journal of Philosophy 68: 5–20.
Gauthier, D. (1984). Deterrence, maximization and rationality. In D. MacLean, Ed., The Security Gamble: Deterrence Dilemmas in the Nuclear Age. Totowa, NJ: Rowman and Allanheld.
Gauthier, D., and R. Sugden, Eds. (1993). Rationality, Justice and the Social Contract. Hemel Hempstead, England: Harvester Wheatsheaf.
Hollis, M. (1987). The Cunning of Reason. Cambridge: Cambridge University Press.
Horwich, P. (1987). Asymmetries in Time: Problems in the Philosophy of Science. Cambridge, MA: MIT Press.
Hurley, S. (1989). Natural Reasons. Oxford: Oxford University Press.
Lewis, D. K. (1969). Convention: A Philosophical Study. Cambridge, MA: Harvard University Press.
Lewis, D. K. (1979). Counterfactual dependence and time’s arrow. Nous 13: 455–476.
Luce, R. D., and H. Raiffa. (1957). Games and Decisions. New York: Wiley.
Parfit, D. (1984). Reasons and Persons. Oxford: Oxford University Press.
Quattrone, G. A., and A. Tversky. (1987). Self-deception and the voter's illusion. In J. Elster, Ed., The Multiple Self.

among alternative actions from preference rankings of possible states of the world and beliefs or probability judgments about what states obtain as outcomes of different actions, as in the maximal expected utility criterion of decision theory and economics. UTILITY THEORY and the FOUNDATIONS OF PROBABILITY theory provide a base for its developments. Somewhat unrelated, but common, senses of the term refer to making decisions through reasoning (Baron 1985), especially reasoning satisfying conditions of logical consistency and deductive completeness (see DEDUCTIVE REASONING; LOGIC) or probabilistic soundness (see PROBABILISTIC REASONING). The basic elements of the theory were set in place by Bernoulli (1738), Bentham (1823), Pareto (1927), Ramsey (1926), de Finetti (1937), VON NEUMANN and Morgenstern (1953), and Savage (1972). Texts by Raiffa (1968), Keeney and Raiffa (1976), and Jeffrey (1983) offer good introductions.

The theory of rational choice begins by considering a set of alternatives facing the decision maker(s). Analysts of particular decision situations normally consider only a restricted set of abstract alternatives that capture the important or interesting differences among the alternatives. This often proves necessary because, particularly in problems of what to do, the full range of possible actions exceeds comprehension. The field of decision analysis (Raiffa 1968)
Cambridge: Cambridge University Press.
Schelling, T. C. (1960). The Strategy of Conflict. Cambridge, MA: Harvard University Press.
Simon, H. A. (1982). Models of Bounded Rationality. Cambridge, MA: MIT Press.
Sugden, R. (1991). Rational choice: A survey of contributions from economics and philosophy. Economic Journal 101: 751–785.
Williams, B. (1981). Internal and external reasons. In Moral Luck. Cambridge: Cambridge University Press, pp. 101–113.

Rational Decision Making

Rational decision making is choosing among alternatives in a way that “properly” accords with the preferences and beliefs of an individual decision maker or those of a group making a joint decision. The subject has been developed in decision theory (Luce and Raiffa 1957; see RATIONAL CHOICE THEORY), decision analysis (Raiffa 1968), GAME THEORY (von Neumann and Morgenstern 1953), political

addresses how to make such modeling choices and provides useful techniques and guidelines. Recent work on BAYESIAN NETWORKS (Pearl 1988) provides additional modeling techniques. These models and their associated inference mechanisms form the basis for a wide variety of successful KNOWLEDGE-BASED SYSTEMS (Wellman, Breese, and Goldman 1992).

The theory next considers a binary relation of preference among these alternatives. The notation x ≼ y means that alternative y is at least as desirable as alternative x, read as y is weakly preferred to x; “weakly” because x ≼ y permits x and y to be equally desirable. Decision analysis also provides a number of techniques for assessing or identifying the preferences of decision makers. Preference assessment may lead to reconsideration of the model of alternatives when the alternatives aggregate together things differing along some dimension on which preference depends.

Decision theory requires the weak preference relation ≼ to be a complete preorder, that is, reflexive (x ≼ x), transitive (x ≼ y and y ≼ z imply x ≼ z), and relating every pair of alternatives (either x ≼ y or y ≼ x).
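For a finite set of alternatives, the three conditions on a complete preorder (reflexivity, transitivity, and completeness) can be checked by brute force. The Python sketch below is only an illustration: the function name `is_complete_preorder`, the example alternatives, and the scores are assumptions introduced here, not part of the theory itself.

```python
from itertools import product


def is_complete_preorder(alternatives, weakly_prefers):
    """Return True if the relation is reflexive, transitive, and complete.

    weakly_prefers(x, y) encodes "x ≼ y": y is at least as desirable as x.
    """
    items = list(alternatives)
    reflexive = all(weakly_prefers(x, x) for x in items)
    transitive = all(
        weakly_prefers(x, z)
        for x, y, z in product(items, repeat=3)
        if weakly_prefers(x, y) and weakly_prefers(y, z)
    )
    complete = all(
        weakly_prefers(x, y) or weakly_prefers(y, x)
        for x, y in product(items, repeat=2)
    )
    return reflexive and transitive and complete


# Hypothetical alternatives ranked by a made-up desirability score.
SCORE = {"soup": 1, "salad": 1, "pasta": 3}


def by_score(x, y):
    return SCORE[x] <= SCORE[y]


# A cyclic relation: rock ≼ paper ≼ scissors ≼ rock, so transitivity fails.
CYCLE = {("rock", "paper"), ("paper", "scissors"), ("scissors", "rock")}


def cyclic(x, y):
    return x == y or (x, y) in CYCLE


print(is_complete_preorder(SCORE, by_score))                        # True
print(is_complete_preorder(["rock", "paper", "scissors"], cyclic))  # False
```

Any relation induced by comparing numerical scores passes the check, whereas the cyclic relation fails on transitivity alone. The check examines every triple of alternatives, so it is only practical for small sets.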
theory (Mueller 1989), psychology (Kahneman, Slovic, and TVERSKY 1982; see DECISION-MAKING), and economics (Debreu 1959; Henderson and Quandt 1980; see ECONOMICS AND COGNITIVE SCIENCE), in which it is the primary activity of homo economicus, “rational economic man.” The term refers to a variety of notions, with each conception of alternatives and proper accord with preferences and beliefs yielding a “rationality” criterion. At its most abstract, the subject concerns unanalyzed alternatives (choices, decisions) and preferences reflecting the desirability of the alternatives and rationality criteria such as maximal desirability of chosen alternatives with respect to the preference ranking. More concretely, one views the alternatives as actions in the world, and determines preferences

These requirements provide a formalization in accord with ordinary intuitions about simple decision situations in which one can readily distinguish different amounts, more is better, and one can always tell which is more. Various theoretical arguments have also been made in support of these requirements; for example, if someone’s preferences lack these properties, one may construct a wager against him that he is sure to lose.

Given a complete preordering of alternatives, decision theory requires choosing maximally desirable alternatives, that is, alternatives x such that y ≼ x for all alternatives y. There may be one, many, or no such maxima. Maximally preferred alternatives always exist within finite sets of alternatives. Preferences that linearly order the alternatives ensure that maxima are unique when they exist.

preference relation over outcomes. If we choose a numerical function U over outcomes to represent this preference relation, then the expected utility Û(x) of alternative x denotes the total utility of the consequences of x, weighting the utility of each outcome by its probability, that is

$$\hat{U}(x) = \sum_{\omega \in \Omega} U(\omega)\,\Pr(\omega \mid x).$$

Because the utilities of outcomes are added together in this definition, this utility function is called a cardinal utility function, indicating magnitude as well as order. We then define x ≼ y to hold just in case Û(x) ≤ Û(y). Constructing preferences over actions to represent comparisons of expected utility in this way transforms the abstract rational choice criterion into one of maximizing the expected utility of actions.

The identification of rational choice under UNCERTAINTY with maximization of expected utility also admits criticism (Machina 1987). Milnor (1954) examined a number of reasonable properties one might require of rational decisions, and proved no decision method satisfied all of them. In practice, the reasonability of the expected utility criterion depends critically on whether the modeler has incorporated all aspects of the decision into the utility function, for example, the decision maker’s attitudes toward risk.

The theory of rational choice may be developed in axiomatic fashion from the axioms above, in which philosophical justifications are given for each of the axioms. The complementary “revealed preference” approach uses the axioms instead as an analytical tool for interpreting actions. This approach, pioneered by Ramsey (1926) and de Finetti (1937) and developed into a useful mathematical and practical method by von Neumann (von Neumann and Morgenstern 1953) and Savage (1972), uses real or hypothesized

The rationality requirements of decision theory on preferences and choices constitute an ideal rarely observed but useful nonetheless (see Kahneman, Slovic, and Tversky 1982; DECISION MAKING, ECONOMICS AND COGNITIVE SCIENCE, JUDGMENT HEURISTICS). In practice, people apparently violate reflexivity (to the extent that they distinguish alternative statements of the same alternative), transitivity (comparisons based on aggregating subcomparisons may conflict), and completeness (having to adopt preferences among things never before considered). Indeed, human preferences change over time and through reasoning and action, which renders somewhat moot the usual requirements on instantaneous preferences. People also seem not to optimize their choices in the required way, more often seeming to choose alternatives that are not optimal but are nevertheless good enough. These “satisficing” (Simon 1955), rather than optimizing, decisions constitute a principal focus in the study of BOUNDED RATIONALITY, the rationality exhibited by agents of limited abilities (Horvitz 1987; Russell 1991; Simon 1982). Satisficing forms the basis of much of the study of PROBLEM SOLVING in artificial intelligence; indeed, NEWELL (1982: 102) identifies the method of problem solving via goals as the foundational (but weak) rationality criterion of the field (“If an agent has knowledge that one of its actions will lead to one of its goals, then the agent will select that action”). Such “heuristic” rationality lacks the coherence of the decision-theoretic notion because it downplays or ignores issues of comparison among alternative actions that all lead to a desired goal, as well as comparisons among independent goals. In spite of the failure of humans to live up to the requirements of ideal rationality, the ideal serves as a useful approximation, one that supports predictions, in economics and other fields, of surprisingly wide applicability (Becker 1976; Stigler and Becker 1977).
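The contrast between maximizing expected utility and satisficing can be made concrete with a small Python sketch. It computes Û(x) as the probability-weighted sum of outcome utilities, selects the maximizing actions, and compares that with a rule that accepts the first action that is good enough. Everything specific here is an assumption made for illustration: the function names, the actions and outcomes, the numbers, and the use of a fixed aspiration threshold to model "good enough".

```python
def expected_utility(action, outcome_probs, utility):
    """Sum U(w) * Pr(w | action) over the outcomes w that action can produce."""
    return sum(utility[w] * p for w, p in outcome_probs[action].items())


def optimize(actions, outcome_probs, utility):
    """Return every action whose expected utility is maximal."""
    scores = {a: expected_utility(a, outcome_probs, utility) for a in actions}
    best = max(scores.values())
    return [a for a in actions if scores[a] == best]


def satisfice(actions, outcome_probs, utility, aspiration):
    """Return the first action examined that meets the aspiration level, else None."""
    for a in actions:
        if expected_utility(a, outcome_probs, utility) >= aspiration:
            return a
    return None


# Hypothetical utilities of outcomes and outcome probabilities for each action.
utility = {"dry": 10, "damp": 6, "soaked": 0}
outcome_probs = {
    "walk": {"dry": 0.5, "soaked": 0.5},    # expected utility 5.0
    "umbrella": {"dry": 0.5, "damp": 0.5},  # expected utility 8.0
    "taxi": {"dry": 1.0},                   # expected utility 10.0
}
actions = ["walk", "umbrella", "taxi"]

print(optimize(actions, outcome_probs, utility))                  # ['taxi']
print(satisfice(actions, outcome_probs, utility, aspiration=7))   # umbrella
```

The satisficing rule never compares "umbrella" with "taxi": it stops at the first acceptable action, so its answer depends on the order in which actions are examined. The optimizing rule is order-independent but must evaluate every action.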
Though the notions of preference and optimal choice have qualitative foundations, most practical treatments of decision theory represent preference orders by means of numerical utility functions. We say that a function U that assigns numbers to alternatives represents the relation ≼ just in case U(x) ≤ U(y) whenever x ≼ y. Note that if a utility function represents a preference relation, then any monotone-increasing transform of the function represents the relation as well, and that such transformation does not change the set of maximally preferred alternatives. Such functions are called ordinal utility functions, as the numerical values only indicate order, not magnitude (so that U(x) = 2 U(y) does not mean that x is twice as desirable as y).

To formalize choosing among actions that may yield different outcomes with differing likelihoods, the theory moves beyond maximization of preferability of abstract alternatives to the criterion of maximizing expected utility,

sets of actions (or only observed actions in the case of Davidson, Suppes, and Siegel 1957) to construct probability and utility functions that would give rise to the set of actions.

When decisions are to be made by a group rather than an individual, the above model is applied to describing both the group members and the group decision. The focus in group decision making is the process by which the beliefs and preferences of the group members determine the beliefs and preferences of the group as a whole. Traditional methods for making these determinations, such as voting, suffer various problems, notably yielding intransitive group preferences. Arrow (1963) proved that there is no way, in general, to achieve group preferences satisfying the rationality criteria except by designating some group member as a “dictator,” and using that member’s preferences as those of the group. May (1954), Black (1963), and others proved good methods exist in a number of special cases (Sen 1977).
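The remark that voting can yield intransitive group preferences is easy to see in the classic three-voter cycle. In the Python sketch below, each ballot is a transitive individual ranking, yet pairwise majority voting prefers A to B, B to C, and C to A. The ballots and the helper name `majority_prefers` are assumptions chosen for illustration.

```python
def majority_prefers(ballots, x, y):
    """Return True if a strict majority of ballots rank x above y.

    Each ballot lists alternatives from most preferred to least preferred.
    """
    wins = sum(1 for ballot in ballots if ballot.index(x) < ballot.index(y))
    return wins > len(ballots) / 2


# Three hypothetical voters, each with a perfectly transitive ranking.
ballots = [
    ["A", "B", "C"],
    ["B", "C", "A"],
    ["C", "A", "B"],
]

for x, y in [("A", "B"), ("B", "C"), ("C", "A")]:
    print(x, "beats", y, ":", majority_prefers(ballots, x, y))
```

Each of the three comparisons is won by two votes to one, so the group relation cycles and has no maximally preferred alternative, even though every individual ranking does.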
which derives preferences among alternatives from preference orderings of the possible outcomes together with beliefs or expectations that indicate the probability of different consequences. Let Ω denote the set of possible outcomes or consequences of choices. The theory supposes that the beliefs of the agent determine a probability measure Pr, where Pr(ω|x) is the probability that outcome ω obtains as a result of taking action x. The theory further supposes a

When all preferences are well behaved and concern exchanges of economic goods in markets, the theory of general equilibrium (Arrow and Hahn 1971; Debreu 1959) proves the existence of optimal group decisions about allocations of these goods. Game theory considers more refined rationality criteria appropriate to multiagent settings in which decision makers interact. Artificial markets (Wellman 1993) and negotiation techniques based on game theory (Rosenschein and Zlotkin 1994) now form the basis for a number of techniques in MULTIAGENT SYSTEMS.

—Jon Doyle

References and Further Readings

Arrow, K. J. (1963). Social Choice and Individual Values. 2nd ed. New Haven: Yale University Press.
Arrow, K. J., and F. H. Hahn. (1971). General Competitive Analysis. Amsterdam: Elsevier.
Baron, J. (1985). Rationality and Intelligence. Cambridge: Cambridge University Press.
Becker, G. S. (1976). The Economic Approach to Human Behavior. Chicago: University of Chicago Press.
Bentham, J. (1823). Principles of Morals and Legislation. Oxford: Oxford University Press.
Bernoulli, D. (1738). Specimen theoriae novae de mensura sortis. Commentarii academiae scientiarum imperialis Petropolitanae, vol. 5 for 1730 and 1731, pp. 175–192.
Black, D. (1963). The Theory of Committees and Elections. Cambridge: Cambridge University Press.
de Finetti, B. (1937). La prévision: Ses lois logiques, ses sources subjectives. Annales de l’Institut Henri Poincaré 7.
Davidson, D., P. Suppes, and S. Siegel. (1957). Decision Making: An Experimental Approach. Stanford, CA: Stanford University Press.
Debreu, G. (1959). Theory of Value: An Axiomatic Analysis of Economic Equilibrium. New York: Wiley.
Henderson, J. M., and R. E. Quandt. (1980). Microeconomic Theory: A Mathematical Approach. 3rd ed. New York: McGraw-Hill.
Horvitz, E. J. (1987). Reasoning about beliefs and actions under computational resource constraints.
Ramsey, F. P. (1964). Truth and probability. In H. E. Kyburg, Jr. and H. E. Smokler, Eds., Studies in Subjective Probability. New York: Wiley. Originally published 1926.
Rosenschein, J. S., and G. Zlotkin. (1994). Rules of Encounter: Designing Conventions for Automated Negotiation among Computers. Cambridge, MA: MIT Press.
Russell, S. J. (1991). Do the Right Thing: Studies in Limited Rationality. Cambridge, MA: MIT Press.
Savage, L. J. (1972). The Foundations of Statistics. 2nd ed. New York: Dover Publications.
Sen, A. (1977). Social choice theory: A re-examination. Econometrica 45: 53–89.
Simon, H. A. (1955). A behavioral model of rational choice. Quarterly Journal of Economics 69: 99–118.
Simon, H. A. (1982). Models of Bounded Rationality: Behavioral Economics and Business Organization, vol. 2. Cambridge, MA: MIT Press.
Stigler, G. J., and G. S. Becker. (1977). De gustibus non est disputandum. American Economic Review 67: 76–90.
von Neumann, J., and O. Morgenstern. (1953). Theory of Games and Economic Behavior. 3rd ed. Princeton, NJ: Princeton University Press.
Wellman, M. P. (1993). A market-oriented programming environment and its application to distributed multicommodity flow problems. Journal of Artificial Intelligence Research 1: 1–23.
Wellman, M. P., J. S. Breese, and R. P. Goldman. (1992). From knowledge bases to decision models. The Knowledge Engineering Review 7(1): 35–53.

Rationalism vs. Empiricism

“Rationalism” and “empiricism” are best understood as names for two broad trends in philosophy rather than labels
for specific articulated theories. “Sensationalism,” “experientialism,” and “empirical theory” are among other terms that have been used to denote the latter doctrine, while “intuitionalism,” “intellectualism,” and “transcendentalism” have had currency in alluding to the former. In the traditional pantheon of philosophers, the classic rationalists are René DESCARTES, Gottfried Wilhelm Leibniz, and Baruch Spinoza; the classic empiricists are John Locke, George Berkeley, and David HUME. Immanuel Kant’s transcendental theses, although removed from empiricism, do not fit readily into the rationalist picture either. Theorists are usually said to be “rationalists” or “empiricists” in light of a discerned family resemblance between one of their positions and a position championed by members of one of the traditional schools (cf. KANT).

Roughly put, empiricism is the view that all knowledge of fact comes from experience. At birth the mind is a tabula rasa. Our senses not only provide the evidence available to

Proceedings of the Third AAAI Workshop on Uncertainty in Artificial Intelligence. Menlo Park, CA: AAAI Press.
Jeffrey, R. C. (1983). The Logic of Decision. 2nd ed. Chicago: University of Chicago Press.
Kahneman, D., P. Slovic, and A. Tversky, Eds. (1982). Judgment under Uncertainty: Heuristics and Biases. Cambridge: Cambridge University Press.
Keeney, R. L., and H. Raiffa. (1976). Decisions with Multiple Objectives: Preferences and Value Tradeoffs. New York: Wiley.
Luce, R. D., and H. Raiffa. (1957). Games and Decisions. New York: Wiley.
Machina, M. J. (1987). Choice under uncertainty: Problems solved and unsolved. Journal of Economic Perspectives 1 (1): 121–154.
May, K. O. (1954). Intransitivity, utility, and the aggregation of preference patterns. Econometrica 22: 1–13.
Milnor, J. (1954). Games against nature. In R. M. Thrall, C. H. Coombs, and R. L. Davis, Eds., Decision Processes.
New York: Wiley, pp. 49–59.
Mueller, D. C. (1989). Public Choice 2. 2nd ed. Cambridge: Cambridge University Press.
Newell, A. (1982). The knowledge level. Artificial Intelligence 18 (1): 87–127.
Pareto, V. (1927). Manuel d’économie politique, deuxième édition. Paris: M. Giard.
Pearl, J. (1988). Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. San Mateo, CA: Morgan Kaufmann.
Raiffa, H. (1968). Decision Analysis: Introductory Lectures on Choices Under Uncertainty. Reading, MA: Addison-Wesley.

justify beliefs, they are the initial source of the concepts constituting these thoughts. Innate biases and dispositions may influence the ideas experience leads us to acquire, but we do not come into the world equipped with anything deserving the title of an “idea.” Some ideas, the simple ones, are found directly in experience; others are derived from these by abstraction, analogy, and definition.

According to the empiricists, there are also no a priori truths, except for those analytic statements (for example, “All bachelors are unmarried,” or “Triangles have three sides”), which, being matters of meaning, can be depicted in terms of definitional, hence necessary, relations among ideas. The seemingly special status of mathematics might then be explained on the assumption that mathematical statements are analytic. Some prominent empiricists, notably John Stuart Mill (1956), argued even mathematics was empirically derived.

Knowledge of all matters of fact, however, rests on inductively gained experience.

representational than those of the rationalists (cf. BEHAVIORISM and MENTAL REPRESENTATION).

Empiricists did stress and give wider scope to the role associative processes were seen to play in the acquisition and manipulation of ideas, and experienced similarity had a more prominent place in their theories. But it is a mistake to think they allowed for no other kinds of mental transitions
In particular, scientific knowledge is not based on a priori or necessary principles. Reason does not supply the ultimate foundation for science, nor can it enable us to achieve certainty in these areas. Reason can help to organize and see the implications of what sense offers, and logic puts constraints on appropriate patterns of reasoning, but reason on its own cannot provide the wherewithal for understanding nature.

Rationalism may be given an equally rough description in terms of its denial of these central empiricist tenets. For the rationalist, experience is not the source of all knowledge. Some concepts are neither derived nor derivable from sense experience. The mind comes equipped with a set of innate ideas. What’s more, reason or intuition, when properly tapped, can provide true beliefs or principles, albeit not all or even most of those we entertain; determining the truth of mundane claims about the height of a tree or if the milk has gone bad requires sense experience. Inductive exploration will also play a role in discovering empirical regularities theoretical science incorporates and seeks to explain.

Reason has a higher calling. It furnishes a priori principles that are not only true, but are necessary, and are recognizably so. These principles are not stipulative definitional truths; rather they delineate the real essences of the ideas of “God,” “being,” “triangle,” and so on that they contain. In this way, they supply a bedrock of certainty upon which knowledge is built, and coherence with them is the sine qua non of acceptable hypotheses.

It is possible, then, to conceive of scientific theories along the lines of axiom systems in mathematics. The first principles (or axioms) of these systems of science are not established by the inductive amassing of evidence. They are intuited by reason as true and necessary. These principles provide the foundational certainty from which the rest of science can be seen to follow deductively.

and processing. In fact, Locke spends a chapter of his Essay Concerning Human Understanding (1975: book II, ch. 33) warning how too ready reliance on the happenstance of experiential co-occurrences will result in false beliefs.

At the same time, there is no reason to assume rationalists did not allow many transitions of thought and imagination were fueled by past associations or presently sensed similarities. For example, Descartes’ influential theory of the emotions (Passions of the Soul, from Descartes 1984–85), has elements that are not only sensationalist but are behaviorist and associationist. It is also true, especially in his theory of vision (Dioptrics, from Descartes 1984–85), that Descartes talks of more intellectual-like reasoning and calculating that is neither conscious nor involves conscious ideas. But these processing claims are in tension with the standard interpretation of Descartes as the foremost champion of the view that all mental states are conscious.

Although the rationalist/empiricist dichotomy, loosely characterized, does have its uses in limited contexts, care must be taken employing it to make specific claims about specific historical figures. Further caution is warranted when their work is cited or used to support contemporary doctrines. For example, many seventeenth- and eighteenth-century arguments for innateness hinge on the claim that ideas such as “God” or “triangle” could not be acquired, because no actual instances of these sorts of concepts could possibly be found in experience. Current-day proponents of innate ideas, stressing inductive indeterminacy, extend the claim to ideas for which there clearly are observed cases. Similarly, rationalists often argued apprehension of certain principles, like the principles of noncontradiction and identity, must be a priori, because they are a prerequisite for having any thought at all. Current proponents of the INNATENESS OF LANGUAGE, for example, tend not to make
comparable claims about the principles of universal grammar (cf. NATIVISM; NATIVISM, HISTORY OF; POVERTY OF THE STIMULUS ARGUMENTS).

The classic writers forged their theories in and against a background of assumptions about physics, mind, physiology, religion, and science in general that have been largely abandoned. Moreover, in the course of time, concepts and doctrines of consciousness, learning, innateness, mental states and processes versus nonmental states and processes, biological inheritance, and the like have come to be understood along new dimensions. This has led to further blurring the meaning and significance of the distinction between rationalist and empiricist positions.

In contemporary psychological literature, for instance, Hermann von HELMHOLTZ is frequently cited as the founding father of the cognitivist approach to perception. Since

The rationalist/empiricist distinction, as drawn above, is to be distinguished from the contrast between mentalistic versus behavioristic approaches to psychology and mind. Empiricists as well as rationalists were mentalists, and both placed heavy emphasis on the role of consciousness in cognition. None of these thinkers had qualms appealing to mental states, and none were committed to the view that human behavior was not mediated by and dependent upon such internal states. Descartes (Discourse on the Method, part 5, from Descartes 1984–85) perhaps stands out in his clear refusal to attribute CONSCIOUSNESS to animals and in his conviction that their behavior can be fully explained mechanistically without appeal to mental intermediaries. Hume, in his Treatise of Human Nature (1960: book I, sect. 16), forcefully argues that animals are endowed with thought and reason.
Everyone agreed, however, that the coin of the mental was ideas, and the empiricists’ ideas were no less

Helmholtz’s (1950) unconscious inference model postulates mental representations and processes, his theory is said to stand in opposition to Gestalt, behaviorist, and Gibsonian positions. Yet Helmholtz’s model is strikingly similar to the one Berkeley offers in his New Theory of Vision (from Berkeley 1948–57), and it explicitly mirrors Mill’s account of inductive inference. But whereas the staunch empiricist Mill (1973) allows that certain visual inferences may be instinctive, Helmholtz claims (1950) his main result has been to show how a range of phenomena, usually thought innate, can be explained in terms of learning and psychological processing. So, for example, on the important issue of the fusion of binocular images, Helmholtz offers a cognitive account, in opposition to the purely physiological, nativist explanation given by Descartes and other theorists (cf. GIBSON; GESTALT PERCEPTION; see also PARSIMONY AND SIMPLICITY).

See also DOMAIN SPECIFICITY; LINGUISTIC UNIVERSALS AND UNIVERSAL GRAMMAR; MORAL PSYCHOLOGY

—Robert Schwartz

References

Berkeley, G. (1948–1957). The Works of George Berkeley, Bishop of Cloyne, vol. 1. A. A. Luce and T. E. Jessop, Eds. Edinburgh: Thomas Nelson.
Descartes, R. (1984–1985). Philosophical Writings. 2 vols. Translated by J. Cottingham, R. Stoothoff, and D. Murdoch. Cambridge: Cambridge University Press. Vol. 3, The Correspondence, trans. by C. S. M. Kenny and A. Kenny. Cambridge: Cambridge University Press (1991).
Helmholtz, H. (1950). Treatise on Physiological Optics. 3 vols. J. Southall, Ed. New York: Dover.
Hume, D. (1960). Treatise of Human Nature. L. A. Selby-Bigge, Ed. Oxford: Oxford University Press.
Locke, J. (1975). An Essay Concerning Human Understanding. P. H. Nidditch, Ed. Oxford: Oxford University Press.
Mill, J. S. (1956). A System of Logic. London: Longmans, Green and Co.
Mill, J. S. (1973). Bailey on Berkeley’s theory of vision. In Dissertations and Discussions, vol. 2. New York: Haskell House, pp. 84–119.

Piattelli-Palmarini, M., Ed. (1980). Language and Learning. Cambridge, MA: Harvard University Press.
Schwartz, R. (1994). Vision: Variations on Some Berkeleian Themes. Oxford: Blackwell.
Smith, R. (1997). The Human Sciences. London: Fontana Press.
Stich, S., Ed. (1975). Innate Ideas. Berkeley: University of California Press.
Sully, J. (1878). The question of visual perception in Germany, pts. 1 and 2. Mind 9: 1–23, 167–195.

Rationality

See BOUNDED RATIONALITY; RATIONAL AGENCY; RATIONAL CHOICE THEORY; RATIONAL DECISION MAKING

Reading

At the close of the nineteenth century, the perceptual and cognitive processes involved in reading were central topics of theory and research (e.g., Cattell 1885; Pillsbury 1897). Yet, as learning theory came to dominate academic psychology, this interest waned. Across most of the twentieth century, reading was broadly viewed by research psychologists as the product of paired-associate learning and, thus, as largely understood at least in principle, despite the heated debate in the educational arena as to whether the effective stimulus in learning to read corresponded to letters or words (e.g., Chall 1967; Flesch 1955).

This attitude changed abruptly with the onset of the cognitive era. Text, after all, was language. If bottom-up learning theories were not adequate to explain the acquisition or comprehension of oral language (see LANGUAGE ACQUISITION), then neither, by extension, were they adequate to explain the acquisition or comprehension of written language (Smith 1971). Moreover, written as opposed to oral language had the amenably investigable property that its units were discrete, lending themselves to physical alteration, substitution, and rearrangement at every level, from letters to discourse structure.
Thus, according to Besner and Humphreys (1991), research on reading has filled more pages in books and journals than any other topic in cognitive psychology. In consequence, few subdomains of cognitive psychology have seen as much progress (empirical, theoretical, and applied) as has the field of reading research.

During the 1970s and 1980s, working collaboratively with the fields of linguistic science and artificial intelligence, cognitive research on reading focused on two issues: how higher-order knowledge is organized and how, by virtue of that organization, the partial and temporally messy information of text might be restructured and implemented into coherent events and images.

The results were a wealth of empirical work demonstrating that readers can interpret and evaluate an author’s message only to the extent that they possess and call forth the vocabulary, syntactic, rhetorical, topical, analytic, and social knowledge that the author has presumed, as well as a number of theories and models of the psychological structures

Further Readings

Barbanell, E., and D. Garrett, Eds. (1997). Encyclopedia of Empiricism. Westport, CT: Greenwood Press.
Boring, E. G. (1950). A History of Experimental Psychology. 2nd ed. New York: Appleton-Century-Crofts.
Brown, S., Ed. (1996). Routledge History of Philosophy. Vol. 5, British Philosophy and the Age of Enlightenment. New York: Routledge.
Chomsky, N. (1966). Cartesian Linguistics. New York: Harper and Row.
Herrnstein, R., and E. G. Boring, Eds. (1968). A Sourcebook in the History of Psychology. Cambridge, MA: Harvard University Press.
James, W. (1950). The Principles of Psychology. 2 vols. New York: Dover.
Leibniz, G. W. (1981). New Essays Concerning Human Understanding. Translated by P. Remnant and J. Bennett, Ed. Cambridge: Cambridge University Press.
Parkinson, G. H. R., Ed. (1993). Routledge History of Philosophy. Vol. 4, The Renaissance and 17th Century Rationalism.
New York: Routledge.
Piaget, J. (1970). Genetic Epistemology. New York: Norton.

and processes involved in bringing such knowledge to bear (for review, see Anderson and Pearson 1984; Sanford and Garrod 1981). Alongside, text was shown to differ from normal oral discourse in language, content, and communicative modes and purposes. The implications were, first, that beyond learning to listen or speak, LITERACY demands more knowledge in depth, breadth, and kind and, second, that unlike learning to listen or speak, the processes of becoming literate require reflective access to such knowledge at every level. Of applied relevance, researchers also demonstrated that among younger and poorer readers the requisite knowledge, inferential capabilities, or comprehension monitoring tendencies were generally underdeveloped to a greater or lesser extent (for review, see Baker and Brown 1984), and instructional implications of this work quickly found their way into classroom materials and practice (see METACOGNITION).

word of text, its spelling is registered with complete, letterwise precision even as it is instantly and automatically mapped to the speech patterns it represents (Rayner 1998; see also EYE MOVEMENTS AND VISUAL ATTENTION).

Although scientists are only beginning to understand the various roles of these spelling-to-speech translations, they are clearly of critical importance to the reading process. To the extent that knowledge of spelling-to-speech correspondences is underdeveloped (as evidenced, for example, by subnormal speed or accuracy in reading nonsense words), it is strongly and reliably associated with reading delay or disability. Moreover, given an alphabetic script such as English, research affirms that learning to recognize or spell an adequate number of words is essentially impossible except as children have internalized the spelling-to-speech
An equally important outcome of this work was that it correspondences of the language (Ehri 1992; Share and forced the field’s awareness of its explanatory limitations. Stanovich 1995). First, although this work helped make explicit the syntactic Although results of the 1992 and 1994 National Assess- and semantic infrastructure on which text comprehension ment of Educational Progress (NAEP) indicate that more depends, it begged the question of how such knowledge than 40 percent of U.S. fourth graders are unable to read might be accessed in process or acquired developmentally. grade-appropriate text with minimally acceptable levels of Second, there was the issue of words. Much of the research understanding or fluency (Campbell, Donahue, Clyde, and of this era had been designed to elucidate how skillful read- Philips 1996), research indicates that, with the exception of ers might use the higher-order constraints of text to reduce no more than 1–3 percent of children, reading disability can or finesse the demands of word recognition while reading. be prevented through well-designed early instruction (Vellu- Yet, as many times and as many ways as this question was tino et al. 1996). As in method comparison studies of past empirically probed, the results contradicted the premise. decades (e.g., Bond and Dykstra 1967; Chall 1967), contem- Instead, skillful readers’ recognition of printed words porary investigations (e.g., Foorman et al. 1998) affirm that proved itself almost wholly indifferent to the type or initial reading instruction is most effective if it includes strength of bias introduced by researchers; only among poor explicit, systematic attention to phonics as well as an active readers does the speed or accuracy of word recognition tend emphasis on practicing and using that knowledge both in iso- to be measurably influenced by context (Stanovich 1980). lation and in the context of meaningful reading and writing. 
Furthermore, poorly developed word recognition skills were In addition, building on the seminal work of A. Liber- shown to account for much of the difference in good and man, Cooper, Shankweiler, and Studdert-Kennedy (1967) poor readers’ comprehension of text (Perfetti 1985). and I. Liberman, Shankweiler, Fischer, and Carter (1974), The invention of parallel distributed processing models research has amply demonstrated that learning to read an of perception and memory have been key in the theoretical alphabetic script depends critically on the relatively diffi- reconciliation of the word recognition and comprehension cult insight that every word can be conceived as a sequence research (see COGNITIVE MODELING, CONNECTIONIST). of phonemes (see PHONOLOGY). Indeed, poorly developed These computational models have demonstrated that many phonemic awareness has asserted itself as the core and of the microphenomena of word recognition—including causal factor underlying most cases of severe reading dis- effects of word frequency, orthographic redundancy, spell- ability (Lyon 1995). Conversely, for normal as well as at- ing, sound irregularities (aisle) and inconsistencies (head, risk populations, activities designed to develop children’s bead), and sensitivity to syllabic and onset-rime boundaries awareness of the phonemic structure of words have been —reflect statistical properties of the language’s ortho- shown to ease and accelerate both reading and writing graphic and orthophonological structure and, as such, growth (see Adams, Treiman, and Pressley 1997; Torgesen emerge through associative learning (Seidenberg and 1997). The relationship between phonemic awareness and McClelland 1989; see VISUAL WORD RECOGNITION). 
More learning to read is bidirectional such that some basic critically, perhaps, in positing such associative learning not appreciation of the phonological structure of words just among letters but also between spellings and both pho- appears necessary for grasping the alphabetic principle, nology and meaning, the models provide means of under- while instruction and practice in decoding and spelling standing how, even as the print on the page is the raw data of serves reciprocally to advance the child’s phonemic sensi- reading, it might serve to activate and to reinforce and tivity. In terms of cognition and metacognition, the impor- extend learning of the language and meaning on which text tant lesson of this work is that, no less than for higher- comprehension depends (Adams 1990). order dimensions of literacy growth, productive learning With the help of a variety of new technologies, research about decoding and spelling necessarily builds on prior has now affirmed that for skillful readers, regardless of the knowledge and active understanding. difficulty of the text, the basic dynamic of reading is line by See also DYSLEXIA; WRITING SYSTEMS line, left-to-right, and word by word. Further, during that —Marilyn Adams fraction of a second while the eyes are paused on any given Realism and Antirealism 707 References Smith, F. (1971). Understanding Reading. New York: Holt, Rine- hart and Winston. Adams, M. J. (1990). Beginning to Read: Thinking and Learning Stanovich, K. E. (1980). Toward an interactive-compensatory about Print. Cambridge, MA: MIT Press. model of individual differences in the development of reading Adams, M. J., R.Treiman, and M. Pressley. (1997). Reading, writ- fluency. Reading Research Quarterly 16:32–71. ing and literacy. In I. Sigel and A. Renninger, Eds., Handbook Torgesen, J. K. (1997). The prevention and remediation of reading of Child Psychology, 5th ed., vol. 4, Child Psychology in Prac- disabilities: Evaluating what we know from research. 
Journal tice. New York: Wiley, pp. 275–357. of Academic Language Therapy 1: 11–47. Anderson, R. C., and P. D. Pearson. (1984). A schema-theoretic Vellutino, F. R., D. M. Scanlon, E. Sipay, S. Small, A. Pratt, R. view of basic processes in reading. In P. D. Pearson, Ed., Hand- Chen, and M. Denckla. (1996). Cognitive profiles of difficult- book of Reading Research. New York: Longman, pp. 255–292. to-remediate and readily remediated poor readers: Early inter- Baker, L., and A. L. Brown. (1984). Metacognitive skills and read- vention as a vehicle for distinguishing between cognitive and ing. In P. D. Pearson, R. Barr, M. Kamil, and P. Mosenthal, experiential deficits as basic causes of specific reading disabil- Eds., Handbook of Reading Research, vol. 1. New York: Long- ity. Journal of Educational Psychology 88: 601–638. man, pp. 353–394. Besner, D., and G. W. Humphreys. (1991). Introduction. In D. Further Readings Besner and G. W. Humphreys, Eds., Basic Processes in Read- ing: Visual Word Recognition. Hillsdale, NJ: Erlbaum, 1–9. P. B. Gough, L. C. Ehri, and R. Treiman, Eds. (1992). Reading Bond, G. L., and R. Dykstra. (1967). The cooperative research pro- Acquisition. Hillsdale, NJ: Erlbaum, pp. 107–143. gram in first-grade reading instruction. Reading Research Juel, C. (1994). Learning to Read and Write in One Elementary Quarterly 2: 5–142. School. New York: Springer. Campbell, J. R., P. L. I. Donahue, M. R. Clyde, and G. W. Phillips. Just, M. A., and P. A. Carpenter. (1987). The Psychology of Read- (1996). NAEP 1994 Reading Report Card for the Nation and ing and Language Comprehension. Boston: Allyn and Bacon. the States. Washington, DC: National Center for Educational National Research Council. (1998). Preventing reading difficulties Statistics, US Department of Education. in young children. Washington, DC: National Academy Press. Cattell, J. M. (1885). The inertia of the eye and brain. Brain 8: Olson, D. R. (1994). The World on Paper. New York: Cambridge 295–312. 
Reprinted in A. T. Poffenberger, Ed., James McKeen University Press. Cattell: Man of Science (1947). York, PA: Science Press. Plaut, D. C., J. L. McClelland, M. S. Seidenberg, and K. Patterson. Chall, J. S. (1967). Learning to Read: The Great Debate. New (1996). Understanding normal and impaired word reading: York: McGraw-Hill. Computational principles in quasi-regular domains. Psycholog- Ehri, L. C. (1992). Reconceptualizing the development of sight ical Review 103: 56–115. word reading and its relationship to recoding. In P. B. Gough, Rack, J. P., M. J. Snowling, and R. K. Olson. (1992). The nonword L. C. Ehri, and R. Treiman, Eds., Reading Acquisition. Hills- reading deficit in developmental dyslexia: A review. Reading dale, NJ: Erlbaum, pp.107–143. Research Quarterly 26: 28–53. Flesch, R. (1955). Why Johnny Can’t Read. New York: Harper and Rayner, K., and A. Pollatsek. (1989). The Psychology of Reading. Row. Hillsdale, NJ: Erlbaum. Foorman, B., D. J. Francis, J. M. Fletcher, C. Schatschneider, and Shankweiler, D., S. Crain, L. Katz, A. E. Fowler, A. M. Liberman, P. Mehta. (1998). The role of instruction in learning to read: S. A. Brady, R. Thornton, E. Lundquist, L. Dreyer, J. M. Preventing reading failure in at-risk children. Journal of Educa- Fletcher, K. K. Stuebing, S. E. Shaywitz, and B. A. Shaywitz. tional Psychology to appear. (1995). Cognitive profiles of reading-disabled children: Com- Liberman, A. M., F. Cooper, D. Shankweiler, and M. Studdert- parison of language skills in phonology, morphology and syn- Kennedy. (1967). Perception of the speech code. Psychological tax. Psychological Science 6: 149–156. Review 74: 431–461. Treiman, R. (1993). Beginning to Spell: A Study of First Grade Liberman, I. Y., D. Shankweiler, F. W. Fischer, and B. Carter. Children. New York: Oxford University Press. (1974). Reading and the awareness of linguistic segments. Journal of Experimental Child Psychology 18: 201–212. Realism and Antirealism (1995). 
Toward a definition of dyslexia. Annals of Dyslexia 45: 3– 27. Neisser, U. (1967). Cognitive Psychology. New York: Appleton- Realism is a blend of metaphysics and epistemology. Meta- Century-Crofts. physically, realism claims that there is an observer-indepen- Perfetti, C. A. (1985). Reading Ability. New York: Oxford Univer- dent world; epistemologically, it claims that we can gain sity Press. knowledge of that very world. In relation to science, realism Pillsbury, W. B. (1897). A study in apperception. American Jour- asserts that, independently of our representations, the enti- nal of Psychology 8: 315–393. ties described by our scientific theories exist and that the Rayner, K. (1998). Eye movements in reading and information theories themselves are objectively true (at least approxi- processing: Twenty years of research. Psychological Bulletin to appear. mately). Opposed to scientific realism (hereafter just “real- Sanford, A. J., and S. G. Garrod. (1981). Understanding Written ism”) are a variety of antirealisms; notably positivism, Language. New York: Wiley. empiricism, instrumentalism, and constructivism. Seidenberg, M. S., and J. L. McClelland. (1989). A distributed, Twentieth-century positivism regarded realism as a developmental model of word recognition and naming. Psycho- pseudo-question external to science. Difficulties over the logical Review 96: 523–568. very possibility of a realist interpretation for the quantum Share, D., and K. Stanovich. (1995). Cognitive processes in early theory of 1925–26 seemed to support this view (Fine 1996). reading development: Accommodating individual differences The situation changed in the 1960s with the emergence of into a mode of acquisition. Issues in Education: Contributions what came to be known as the “miracles” argument, namely, from Educational Psychology 1: 1–57. 
708 Realism and Antirealism (or useful) constructs—and so between realism and instru- that unless the theoretical entities employed by scientific mentalism (Fine 1986). theories actually existed and the theories themselves were at A number of fresh alternatives to realism have developed least approximately true of the world at large, the evident recently. Principal among them are Putnam’s “internal real- success of science (in terms of its applications and predic- ism” (1981, 1990), van Fraassen’s “constructive empiricism” tions) would surely be a miracle (Putnam 1975; Smart (1980), and what Fine calls the “natural ontological attitude,” 1963). During the next two decades versions of this argu- or NOA (1996). Internal realism is a perspectival position ment became so fashionable that realism was often identi- allowing that scientific claims are true from certain perspec- fied with science itself. tives but denying that science tells the whole story, or even Despite the fashion, the argument is inconclusive that there is a whole story to tell. There could be other ver- because, at best, scientific success can show only that some- sions of the truth—different stories about the world—each of thing is right about science. That could mean that science is which it may be proper to believe. Van Fraassen’s construc- actually getting at the truth, as the miracles argument urges, tive empiricism eschews belief in favor of what he calls com- or it could just mean that science is developing reliable tools mitment. In contrast with realism, constructive empiricism for organizing experience, perhaps using flawed representa- takes empirical adequacy (not truth) as the goal of science, tions of reality. Similar difficulties beset an influential and when it accepts a theory it accepts it only as empirically “explanationist” variant of the argument (Boyd 1992). This adequate. 
This involves commitment to working within the version asks us to explain the evident success of science and framework of the theory but not to believing in its literal truth. argues that realism, with its emphasis on the truth of our Fine’s NOA is a minimal attitude that urges critical attention theories, offers the best explanation. Among other prob- to local practice without imposing general interpretive agen- lems, this version suffers from the defect that the conclusion das on science, such as goals for science as a whole or blanket in support of realism depends on the principle (“inference to empiricist limitations on knowledge. NOA regards truth as the best explanation”) to accept as true that which explains basic but, seeing science as open, it challenges general pre- best (Lipton 1991). Antirealisms like instrumentalism and scriptions for scientific truth, including the perspectivalism empiricism would deny the inference. (After all, the best built into internal realism and the external-world correspon- may well be the best of a bad lot.) Thus the explanationist dence built into realism itself. Despite their differences, these version of the “miracles” argument uses a principle of infer- alternatives share with realism a basically positive attitude ence that begs a central question at issue between realism toward science. A contrary suspicion attaches to constructiv- and antirealism—whether truth, or some other merit, ism (Barnes, Bloor, and Henry 1996; Galison and Stump attaches to a good theory (Fine 1996; Laudan 1981). 1996; Latour 1987; Pickering 1984; Searle 1995). In addition to these logical difficulties, realism has a Constructivism opposes realism’s claim that in order to problem with the history of science, which shows our best understand science we must take scientists to be exploring a theories repeatedly overthrown. Inductively, this may sup- world not of their own making. 
Inspired by developments in port pessimism about the stability of current science (Psillos the history and sociology of science, it maintains instead 1996). It also has a problem with the underdetermination of that scientific knowledge is socially constituted and that theory by evidence, which suggests that theories may have “facts” are made by us. Constructivism emphasizes agency empirical equivalents between which no evidence can and (like NOA) sees unforced judgments throughout scien- decide (Earman 1993; Laudan and Leplin 1991). Both con- tific activity. In their studies constructivists bracket the siderations tend to undermine claims for the reality of the truth-claims of the activity under investigation and try to objects of scientific investigation and the truth of scientific address scientific practice using little more than common theories. sense psychology and an everyday pragmatism with respect In response, some philosophers have suggested that real- to the familiar objects of experience. To the extent to which ism confine itself to a doctrine about the independent exist- these studies succeed in understanding science they paint a ence of theoretical entities (“entity realism”) without picture quite different from realism’s, a dynamic and open commitment to the truth of the theories employing them. picture that challenges not only the arguments but also the There are several proposals of this sort concerning which intuitions on which scientific realism rests. entities to advance as real. We might promote only those entities that are used experimentally to generate new knowl- See also EPISTEMOLOGY AND COGNITION; NATURAL edge or, more generally, only those we regard as causal KINDS; RATIONALISM VS. EMPIRICISM agents (Cartwright 1983; Hacking 1983). 
We might take —Arthur Fine only those that prove fruitful enough to survive scientific revolutions (McMullin 1987), or only those essential in spe- References cific cases of explanatory or predictive success (Kitcher 1993) or only those entities that stand out as supported by Barnes, B., D. Bloor, and J. Henry. (1996). Scientific Knowledge: especially excellent scientific evidence (Newton-Smith A Sociological Analysis. Chicago: University of Chicago Press. 1989). Finally, we might just plead that surely some entities Boyd, R. (1992). Constructivism, realism and philosophical must be real, without specifying which ones (Devitt 1984). method. In J. Earman, Ed., Inference, Explanation and Other Unfortunately for entity realism, it is not clear that such cri- Frustrations. Berkeley: University of California Press, pp. 131– teria overcome the strategies that challenge realism in gen- 198. eral. In particular these proposals do not seem to Cartwright, N. (1983). How The Laws of Physics Lie. New York: discriminate effectively between real entities and reliable Clarendon Press. Recurrent Networks 709 Devitt, M. (1984). Realism and Truth. Princeton: Princeton Uni- Morrison, M. (1988). Reduction and realism. In A. Fine and J. versity Press. Leplin, Eds., PSA 1988, vol. 1. E. Lansing, MI: Philosophy of Earman, J. (1993). Underdetermination, realism and reason. Mid- Science Association, pp. 286–293. west Studies in Philosophy 18: 19–38. Papineau, D., Ed. (1996). The Philosophy of Science. Oxford: Fine, A. (1986). Unnatural attitudes: Realist and instrumentalist Oxford University Press. attachments to science. Mind 95: 149–179. Popper, K. (1972). Three views concerning human knowledge. Fine, A. (1996). The Shaky Game: Einstein, Realism and the Quan- Reprinted in Conjectures and Refutations. London: Routledge tum Theory. 2nd ed. Chicago: University of Chicago Press. and Kegan Paul, pp. 97–119. Galison, P., and D. Stump, Eds. (1996). The Disunity of Science: Rosen, G. (1994). 
What is constructive empiricism? Philosophical Boundaries, Contexts, and Power. Stanford: Stanford Univer- Studies 74: 143–178. sity Press. Rouse, J. (1996). Engaging Science: How To Understand Its Prac- Hacking, I. (1983). Representing and Intervening. Cambridge: tices Philosophically. Ithaca, NY: Cornell University Press. Cambridge University Press. Kitcher, P. (1993). The Advancement of Science. Oxford: Oxford Reasoning University Press. Latour, B. (1987). Science in Action. Cambridge, MA: Harvard University Press. See CAUSAL REASONING; DEDUCTIVE REASONING; INDUC- Laudan, L. (1981). A confutation of convergent realism. Philoso- TION; PROBABILISTIC REASONING phy of Science 48: 19–49. Laudan, L., and J. Leplin. (1991). Empirical equivalence and Recognition underdetermination. Journal of Philosophy 88: 449–472. Lipton, P. (1991). Inference to the Best Explanation. London: Rou- tledge. See FACE RECOGNITION; OBJECT RECOGNITION, ANIMAL McMullin, E. (1987). Explanatory success and the truth of theory. STUDIES; OBJECT RECOGNITION, HUMAN NEUROPSYCHOL- In N. Rescher, Ed., Scientific Inquiry in Philosophical Perspec- OGY; VISUAL OBJECT RECOGNITION, AI tive. Lanham: University Press of America, pp. 51–73. Newton-Smith, W. (1989). Modest realism. In A. Fine and J. Lep- Recording from Single Neurons lin, Eds., PSA 1988, vol. 2. E. Lansing, MI: Philosophy of Sci- ence Association, pp.179–189. Pickering, A. (1984). Constructing Quarks: A Sociological History See SINGLE-NEURON RECORDING of Particle Physics. Chicago: University of Chicago Press. Psillos, S. (1996). Scientific realism and the pessimistic induction. Philosophy of Science 63 Supplement: S306–S314. Recurrent Networks Putnam, H. (1975). Mathematics, Matter and Method, vol. 1. Cam- bridge: Cambridge University Press. Putnam, H. (1981). Reason, Truth and History. Cambridge: Cam- are generally broken down into two NEURAL NETWORKS bridge University Press. 
broad categories: feedforward networks and recurrent net- Putnam, H. (1990). Realism with A Human Face. Cambridge, MA: works. Roughly speaking, feedforward networks are net- Harvard University Press. works without cycles (see PATTERN RECOGNITION AND Searle, J. (1995). The Construction of Social Reality. New York: FEEDFORWARD NETWORKS) and recurrent networks are net- Free Press. works with one or more cycles. The presence of cycles in a Smart, J. C. C. (1963). Philosophy and Scientific Realism. London: network leads naturally to an analysis of the network as a Routledge and Kegan Paul. dynamic system, in which the state of the network at one van Fraassen, B. C. (1980). The Scientific Image. Oxford: Claren- moment in time depends on the state at the previous don Press. moment in time. In some cases, however, it is more natural Further Readings to view the cycles as providing a specification of simulta- neous constraints that the nodes of the network must satisfy, Blackburn, S. (1993). Essays in Quasi-Realism. Oxford: Oxford a point of view that need not involve any analysis of time- University Press. varying behavior. These two points of view can in principle Churchland, P. M., and C. A. Hooker, Eds. (1985). Images of Sci- be reconciled by thinking of the constraints as specifying ence. Chicago: University of Chicago Press. the equilibrium states of a dynamic system. Devitt, M. (1983). Realism and the renegade Putnam. Nous 17: Let us begin by considering recurrent networks which 291–301. admit an analysis in terms of equilibrium states. These net- Giere, R. (1987). Explaining Science: A Cognitive Approach. Chi- works, which include Hopfield networks and Boltzmann cago: University of Chicago Press. Hollis, M., and S. Lukes, Eds. (1982). Rationality and Relativism. machines, are generally specified as undirected graphs, that Oxford: Blackwell. is, graphs in which the presence of a connection from node Kukla, A. (1996). 
Antirealist explanations of the success of sci- Si to node Sj implies a connection from node Sj to node Si ence. Philosophy of Science 63 Supplement: S298–S305. (see figure 1). The graph may be completely connected or Leplin, J., Ed. (1984). Scientific Realism. Berkeley: University of partially connected; we will consider some of the implica- California Press. tions associated with particular choices of connectivity later Leplin, J. (1997). A Novel Defense of Scientific Realism. New in the article. York: Oxford University Press. A Hopfield network is an undirected graph in which each Miller, R. W. (1987). Fact and Method. Princeton: Princeton Uni- node is binary (i.e., for each i, Si ∈ {–1,1}), and each link is versity Press. 710 Recurrent Networks design so that the energy E is equal to the quantity that it is desired to optimize. The formulation of the Hopfield model in the early 1980s was followed by the development of the Boltzmann machine (Hinton and Sejnowski 1986), which is essentially a probabilistic version of the Hopfield network. The move to probabilities has turned out to be significant; it has led to the development of new algorithms for UNSUPERVISED LEARNING and SUPERVISED LEARNING, and to more efficient Figure 1. A generic undirected recurrent network, in which the algorithms for Hopfield networks. presence of a connection from node Si to node Sj implies a A Boltzmann machine is characterized by a probability connection from node Sj to node Si. The value of the weight Jij is distribution across the states of a Hopfield network. This equal to Jji by assumption. distribution, known as the Boltzmann distribution, is the exponential of the negative of the energy of the state: labeled with a real-valued weight Jij. Because the graph is undirected, Jij and Jji refer to the same link and thus are –E ⁄ T e equal by assumption. 
P ( S 1, S 2, …, S N ) = -------------- - (3) A Hopfield network also comes equipped with an energy Z function E, which can be viewed intuitively as a measure of where Z is a normalizing constant (the sum across all states the “consistency” of the nodes and the weights. The energy of the numerator), and T is a scaling constant. The Boltz- is defined as follows: mann distribution gives higher probability to states with lower energy, but does not rule out states with high energy. ∑ Jij SiSj E=– (1) As T goes to zero, the lower energy states become the only ones with significant probability and the Boltzmann i