9.66J | Fall 2004 | Undergraduate
Computational Cognitive Science

Readings

AIMA is short for the recommended course text: Russell, Stuart J., and Peter Norvig. Artificial Intelligence: A Modern Approach. 2nd ed. Upper Saddle River, NJ: Prentice Hall/Pearson Education, 2003. ISBN: 0137903952.

Lec # Topics Readings
1 Introduction

Goodman, Nelson. “The New Riddle of Induction.” In Fact, Fiction, and Forecast. Cambridge, MA: Harvard University Press, 1983. ISBN: 0674290712.

Fodor, Chomsky. “On the Impossibility of Acquiring more Powerful Structures,” and “The Inductivist Fallacy (including discussion).” In Language and Learning: The Debate between Jean Piaget and Noam Chomsky. Edited by Massimo Piattelli-Palmarini. Cambridge, MA: Harvard University Press, 1984, pp. 142-149. ISBN: 0674509412.

———. “The Inductivist Fallacy.” In Language and Learning: The Debate between Jean Piaget and Noam Chomsky. Edited by Massimo Piattelli-Palmarini. Cambridge, MA: Harvard University Press, 1984, pp. 259-269, including discussion. ISBN: 0674509412.

Optional Readings

Fodor, J. A., M. F. Garrett, E. C. T. Walker, and C. H. Parkes. “Against Definitions.” Cognition 8 (1980): 263-367.

Laurence, Stephen, and Eric Margolis. “Radical Concept Nativism.” Cognition 86 (2002): 22-55.

2 Foundations of Inductive Learning

AIMA. Sections 18.1-2, 18.5, and 19.1-2.

Berwick, R. C. “Learning From Positive-only Examples: The Subset Principle and Three Case Studies.” Machine Learning 2 (1986): 625-645. (The section on phonology can be skipped. Just read the applications to conceptual hierarchies and syntax.)

Bruner, Jerome S., Jacqueline J. Goodnow, and George Austin. A Study of Thinking. Somerset, NJ: Transaction Publishers, 1986. ISBN: 0887386563.

Mitchell, Thomas M. Machine Learning. New York, NY: McGraw-Hill, 1997, chapter 2. ISBN: 0070428077. 

Optional Readings

Feldman, J. “Minimization of Boolean Complexity in Human Concept Learning.” Nature 407 (2000): 630-633.

Kearns, Michael J., and Umesh V. Vazirani. An Introduction to Computational Learning Theory. Cambridge, MA: MIT Press, 1994, pp. 1-7. ISBN: 0262111934.

Winston, P. H., ed. “Learning Structural Descriptions from Examples.” In The Psychology of Computer Vision. New York, NY: McGraw-Hill, 1975, pp. 157-209. ISBN: 0070710481.

3 Knowledge Representation: Spaces, Trees, Features

Shepard, R. N. “Multidimensional Scaling, Tree-fitting, and Clustering.” Science 210 (1980): 390-398.

Landauer, T. K., and S. T. Dumais. “A Solution to Plato’s Problem: The Latent Semantic Analysis Theory of the Acquisition, Induction, and Representation of Knowledge.” Psychological Review 104 (1997): 211-240.

Goldstone, R. L., and J. Son. “Similarity.” In Cambridge Handbook of Thinking and Reasoning. Edited by K. Holyoak and R. Morrison. Cambridge, MA: Cambridge University Press, 2005, pp. 13-36.

4 Knowledge Representation: Language and Logic 1

AIMA. Sections 22.1-22.2, “Basics of Formal Grammars,” and Section 22.8, “Grammar Induction.”

———. Sections 8.1-8.3, “First Order Logic: See 7.1-7.4 if necessary for background on logic,” Section 10.6, “Using Logic to Represent Category Relations,” and Section 19.5, “Learning New Concepts in Logic: An Answer to Fodor’s Challenge?”

Chomsky, Noam. Syntactic Structures. Berlin, Germany: Walter De Gruyter, Inc., 1976, pp. 11-48. ISBN: 3110154129.

Markman, Arthur. Knowledge Representation. Mahwah, NJ: Lawrence Erlbaum Associates, 1998, pp. 118-146, and 188-216. ISBN: 0805824413.

Optional Readings

Nowak, M. A., N. L. Komarova, and P. Niyogi. “Computational and Evolutionary Aspects of Language.” Nature 417 (2002): 611-617.

Gentner, D., and A. B. Markman. “Structural Alignment in Analogy and Similarity.” American Psychologist 52, no. 1 (1997): 45-56.

5 Knowledge Representation: Language and Logic 2

At least one of the following three pairs of papers:

1. Rosch, E. “Principles of Categorization.” In Cognition and Categorisation. Edited by E. Rosch and B. Lloyd. Hillsdale, NJ: Erlbaum, 1978, pp. 27-48.

2. Armstrong S. L., L. R. Gleitman, and H. Gleitman. “What Some Concepts Might Not Be.” Cognition 13, no. 3 (May 1983): 263-308.

1. Pinker, S. “Why the Child Holded the Baby Rabbits: A Case Study in Language Acquisition.” In An Invitation to Cognitive Science: Language. Edited by L. Gleitman, and M. Liberman. 2nd ed. Vol. 1. Cambridge, MA: MIT Press, 1995. ISBN: 0262650444.

2. Rumelhart, D. E., and J. L. McClelland, eds. “On Learning the Past Tenses of English Verbs.” Chapter 18 in Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Vol. 2. Cambridge, MA: MIT Press, 1987. ISBN: 0262631121.

1. Collins, A. M., and M. R. Quillian. “Retrieval Time from Semantic Memory.” Journal of Verbal Learning and Verbal Behavior 8 (1969): 240-248.

2. McClelland, and Rogers. “The Parallel Distributed Processing Approach to Semantic Cognition.” Nature Reviews Neuroscience 4 (April 2003): 1-14.

Optional Readings

Rumelhart, D. E., and J. L. McClelland, eds. “Schemata and Sequential Thought Processes in PDP Models.” Chapter 14 in Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Vol. 2. Cambridge, MA: MIT Press, 1987. ISBN: 0262631121.

Goldstone, R. L., and A. Kersten. “Concepts and Categories.” In Comprehensive Handbook of Psychology. Edited by A. F. Healy, and R. W. Proctor. Vol. 4: Experimental Psychology. 2003, pp. 591-621.

Paccanaro, A., and G. E. Hinton. “Learning Distributed Representations of Concepts Using Linear Relational Embedding.” Technical Report: GCNU TR 2000-002, March 2000.

6 Knowledge Representation: Great Debates 1

AIMA. Chapter 13.

Jeffreys, and Berger. “Bayesian Occam’s Razor.” American Scientist 80 (1992): 64-72.

Tversky, A., and D. Kahneman. “Judgment under Uncertainty: Heuristics and Biases.” Science 185 (1974): 1124-1130.

Optional Readings

Sivia. Bayesian Data Analysis. New York, NY: Oxford University Press, 1996, pp. 1-23. ISBN: 0198518897.

7 Knowledge Representation: Great Debates 2

AIMA. Sections 14.1-14.3, and 14.5.

Charniak. “Bayesian Networks without Tears.” AI Magazine 12, no. 4 (1991): 50-63.

McClelland, J. L., D. E. Rumelhart, and G. Hinton. “The Appeal of Parallel Distributed Processing.” In Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Vol. 2. Edited by D. E. Rumelhart, and J. L. McClelland. Cambridge, MA: MIT Press, 1987. ISBN: 0262631121.

Johnson-Laird, P. N., and Fabien Savary. “Illusory Inferences about Probabilities.” Acta Psychologica 93 (1996): 69-90.

8 Basic Bayesian Inference

AIMA. Sections 20-20.2.

Gelman, Carlin, Stern, and Rubin. “Hierarchical Models.” Chapter 5 in Bayesian Data Analysis. 2nd ed. London, UK: Chapman & Hall, 1995, pp. 117-131. ISBN: 0412039915.

Griffiths, T. L., and M. Steyvers. “A Probabilistic Approach to Semantic Representation.” In Proceedings of the 24th Annual Conference of the Cognitive Science Society (2002).

Review: Goodman, Nelson. “The New Riddle of Induction.” In Fact, Fiction, and Forecast. Cambridge, MA: Harvard University Press, 1983. ISBN: 0674290712.

9 Graphical Models and Bayes Nets

For review: AIMA. Section 19.2.

Mitchell, Thomas M. “Bayesian Learning.” Chapter 6 in Machine Learning. New York, NY: McGraw-Hill, 1997, sections 6.1-6.3. ISBN: 0070428077.

Tenenbaum, J. B. “Rules and Similarity in Concept Learning.” In Advances in Neural Information Processing Systems 12. Edited by S. Solla, T. Leen, and K. Muller. Cambridge, MA: MIT Press, 2000, pp. 59-65. ISBN: 0262194503.

Posner, and Keele. “On the Genesis of Abstract Ideas.” Journal of Experimental Psychology 77 (1968): 353-363.

If necessary for background: Bishop, C. M. “Bayesian Classification.” Neural Networks for Pattern Recognition. New York, NY: Oxford University Press, 1995. ISBN: 0198538642.

10 Simple Bayesian Learning 1

AIMA. Section 20.3.

Fried, and Holyoak. “Induction of Category Distributions: A Framework for Classification Learning.” Journal of Experimental Psychology: Learning, Memory and Cognition 10 (1984): 234-257.

Ghahramani, Z., and M. I. Jordan. “Learning from Incomplete Data.” MIT Center for Biological and Computational Learning Technical Report 108 (1994).

Optional Readings

Zhu, Xiaojin, Zoubin Ghahramani, and John Lafferty. “Semi-supervised Learning using Gaussian Fields and Harmonic Functions.” The Twentieth International Conference on Machine Learning (ICML). Washington, DC: 2003.

Nigam, Kamal, Andrew McCallum, Sebastian Thrun, and Tom Mitchell. “Text Classification from Labeled and Unlabeled Documents using EM.” Machine Learning 39, no. 2/3 (2000): 103-134.

Seeger, M. “Learning with Labeled and Unlabeled Data.” University of Edinburgh Institute for Adaptive and Neural Computation Technical Report (2001).

11 Simple Bayesian Learning 2

AIMA. Sections 20.4-5.

Nosofsky, R. M. “Optimal Performance and Exemplar Models of Classification.” In Rational Models of Cognition. Edited by M. Oaksford and N. Chater. New York, NY: Oxford University Press, 1998, pp. 219-247.

Kruschke, J. K. “Human Category Learning: Implications for Backpropagation Models.” Connection Science 5, no. 1 (1993): 3-37.

Optional Readings

Silverman, B. W. “Maximum Penalized Likelihood Estimators.” In Density Estimation. London, UK: Chapman and Hall, 1986, pp. 110-119.

Hinton, G. E., P. Dayan, B. J. Frey, and R. M. Neal. “The Wake-sleep Algorithm for Unsupervised Neural Networks.” Science 268 (1995): 1158-1160.

12 Probabilistic Models for Concept Learning and Categorization 1

Anderson, J. R. “The Adaptive Nature of Human Categorization.” Psychological Review 98, no. 3 (1991): 409-429.

Smyth, P. “Model Selection for Probabilistic Clustering using Cross-validated Likelihood.” Statistics and Computing 10, no. 1 (2000): 63-72.

MacKay, D. J. C. “Probable Networks and Plausible Predictions - A Review of Practical Bayesian Methods for Supervised Neural Networks.” Network: Comput. Neural Syst. 6 (1995): 469-505.

13 Probabilistic Models for Concept Learning and Categorization 2  
14 Unsupervised and Semi-supervised Learning

Murphy, Gregory L., and Douglas L. Medin. “The Role of Theories in Conceptual Coherence.” Chapter 19 in Concepts: Core Readings. Edited by E. Margolis, and S. Laurence. Cambridge, MA: MIT Press, July 1999, pp. 425-458. ISBN: 9780262631938.

Gelman, Susan A. The Essential Child. New York, NY: Oxford University Press, March 1, 2003, chapters 1 and 3, pp. 3-18 and 60-88. ISBN: 0195154061.

Optional Readings

Courville, Aaron C., Nathaniel D. Daw, and David S. Touretzky. “Similarity and Discrimination in Classical Conditioning: A Latent Variable Account.” Neural Information Processing Systems Conference (2004).

Rehder, Bob. “Essentialism as a Generative Theory of Classification.” To appear in Gopnik, A., and L. Schulz (Eds.) Causal learning: Psychology, philosophy, and computation. New York, NY: Oxford University Press.

15 Non-parametric Classification: Exemplar Models and Neural Networks 1  
16 Non-parametric Classification: Exemplar Models and Neural Networks 2

Pearl, Judea. Causality: Models, Reasoning, and Inference. Cambridge, MA: Cambridge University Press, 2000, chapter 1. ISBN: 0521773628.

Cooper, Gregory F. “An Overview of the Representation and Discovery of Causal Relationships using Bayesian Networks.” In Computation, Causation, and Discovery. Edited by Clark Glymour and Gregory F. Cooper. Cambridge, MA: MIT Press, June 1999, pp. 3-62. ISBN: 0262571242.

Gopnik, Alison, and Laura Schulz. “Mechanisms of Theory Formation in Young Children.” Trends in Cognitive Sciences 8, no. 8 (August 2004): 371-377. 

17 Controlling Complexity and Occam’s Razor 1

Wellman, Henry M., and Susan A. Gelman. “Cognitive Development: Foundational Theories of Core Domains.” Annu Rev Psychol 43 (1992): 337-75.

Tenenbaum, J. B., and T. L. Griffiths. “The Place of Intuitive Theories in Rational Causal Inference.” To appear in Gopnik, A., and L. Schulz (Eds.) Causal learning: Psychology, philosophy, and computation. New York, NY: Oxford University Press.

Kemp, Charles, Thomas L. Griffiths, and Joshua B. Tenenbaum. “Discovering Latent Classes in Relational Data.” MIT Computer Science and Artificial Intelligence Laboratory. AI Memo 2004-019, September 2004.

Optional Readings

Rehder, Bob. “Essentialism as a Generative Theory of Classification.” To appear in Gopnik, A., and L. Schulz (Eds.) Causal learning: Psychology, philosophy, and computation. New York, NY: Oxford University Press.

18 Controlling Complexity and Occam’s Razor 2

AIMA. 14.6.

Milch, Brian, Bhaskara Marthi, and Stuart Russell. “BLOG: Relational Modeling with Unknown Objects.” Workshop on Statistical Relational Learning and Its Connections to Other Fields. The Twenty-First International Conference on Machine Learning (ICML). Banff, Alberta: July 2004.

19 Intuitive Biology and the Role of Theories

Osherson, Daniel N., O. Wilkie, E. E. Smith, A. Lopez, and E. Shafir. “Category-Based Induction.” Psychological Review 97, no. 2 (1990): 185-200.

Atran, Scott. “Classifying Nature Across Cultures.” In Thinking: An Invitation to Cognitive Science. Edited by Edward E. Smith, Daniel N. Osherson, et al. 2nd ed. Vol. 3. Cambridge, MA: MIT Press, 1995, pp. 131-174. ISBN: 0262150433.

Kemp, C., and J. B. Tenenbaum. “Theory-based Induction.” In Proceedings of the 25th Annual Conference of the Cognitive Science Society (2003).

Optional Reading

Kemp, C., T. L. Griffiths, S. Stromsten, and J. B. Tenenbaum. “Semi-supervised Learning with Trees.” In Advances in Neural Information Processing Systems (2003).

20 Learning Domain Structures 1

Keil, Frank C. “Constraints on Knowledge and Cognitive Development.” Psychological Review 88, no. 3 (May 1981): 197-227.

McClelland, and Rogers. “The Parallel Distributed Processing Approach to Semantic Cognition.” Nature Reviews Neuroscience 4 (April 2003): 1-14.

Kemp, Charles, Amy Perfors, and Joshua B. Tenenbaum. Learning Domain Structures. Department of Brain and Cognitive Sciences, MIT. Internal Memo.

Optional Readings

Geman, Stuart, and E. Bienenstock. “Neural Networks and the Bias/Variance Dilemma.” Neural Computation 4 (1992): 1-58. [Especially pp. 46-48]

21 Learning Domain Structures 2

Scholl, Brian J., and Patrice D. Tremoulet. “Perceptual Causality and Animacy.” Trends in Cognitive Sciences 4, no. 8 (August 2000): 299-309.

Feldman, Jacob, and Patrice D. Tremoulet. The Computation of Intention. (Forthcoming)

22 Causal Learning  
23 Causal Theories 1  
24 Causal Theories 2  
25 Project Presentations