NETS 7976 - Information Theory & Bayesian Inference for Complex Systems - Fall 2022
Tuesdays: 10:00 – 11:00am
September 13 – December 13, 2022
177 Huntington, room 1005
Summary
This directed study is designed to introduce central ideas of information theory in the context of complex adaptive systems and their relation to Bayesian inference and machine learning. The class was designed in collaboration with Moritz Laber and Sagar Kumar, first-year PhD students in Network Science.
PDF of this syllabus: here.
Coursework, Class Structure, Grading
The course is intended foremost as a reading course, but it also includes solving selected exercises that accompany the reading material and writing chapter summaries to track students' progress. Grading is based on a final project in which central ideas of the course are applied. Each class opens with a 10-15 minute segment in which students share the deliverable they produced that week; this can be in the form of notes, code, homework exercises, interesting connections, etc. We then spend the remaining time discussing that week's reading assignment.
Learning Objectives and Outcomes
By the end of this course students should have a deep familiarity with the central concepts of information theory and Bayesian inference; the overall goal is to be able to describe and apply these core concepts in both theoretical and application-oriented ways. Furthermore, students should leave this course with the ability to find and explain the connections between these topics and connections to domains outside mathematics. This course hopes to achieve these learning objectives while being:
Current: Through the discussion of research papers in the course, students should learn to identify concepts from information theory and inference in the literature.
Practical: Upon completion of this course, students should have gained practical experience in applying methods from information theory to various (textbook) problems, especially those originating in the context of complex systems and network science.
Novel & Actionable: Students should be able to transfer this knowledge to their specific research interests in ways that advance their research. Through hands-on activities (problem sets from the textbooks, active note-taking, coding, etc.) and the final project, students will leave this course with documentation of their knowledge. This can take the form of a research paper, a review article, a software repository, or a collection of notes organized pedagogically for teaching this material in the future.
Evaluation
The course evaluation is based on two components:
Completion and presentation of the final project: The project gives students the chance to show that they have familiarized themselves with the central concepts of information theory and Bayesian inference. It can take the form of a research paper, literature review, software repository, etc. The presentation of the results in the final week of the semester is part of the evaluation.
Active participation in the weekly meetings: This includes engaging in discussions on the content of the main readings as well as the presentation of solved problems or pieces of code.
Materials
The primary source of reading is Information Theory, Inference and Learning Algorithms by David J.C. MacKay (McK) and the accompanying course on information theory, pattern recognition and machine learning that he taught at the University of Cambridge. This directed study also draws on material from Information Theory for Complex Systems by Kristian Lindgren (L) and the classic Elements of Information Theory by Thomas M. Cover and Joy A. Thomas (C&T). Additionally, we will read current and historical papers from the literature to show the continued relevance of these ideas in the study of complex systems. Course materials can be found at: https://www.dropbox.com/sh/wt9rj5u3clib7r4/AACfRyPajNbwFxr6KiAeKfmRa?dl=0.
McK: MacKay, David J.C. (2003). Information Theory, Inference and Learning Algorithms. Cambridge University Press. Available at: https://www.dropbox.com/s/lg7eg2y4drviz76/MacKay-InfoTheory_Inference_Learning.pdf?dl=0. Video lectures also available at: https://www.youtube.com/playlist?list=PLruBu5BI5n4aFpG32iMbdWoRVAA-Vcso6.
C&T: Cover, Thomas M. & Thomas, Joy A. (2006). Elements of Information Theory. John Wiley & Sons. Available at: https://www.dropbox.com/s/j6bmddtg5bvfxt5/CoverThomas-Elements_InfoTheory.pdf?dl=0.
L: Lindgren, Kristian. (2014). Information Theory for Complex Systems. Lecture Notes from the Department of Physical Resource Theory, Chalmers and Göteborg University. Available at: https://www.dropbox.com/s/hsv5onmcuuqjg3j/Lindgren-InfoTheory_ComplexSystems.pdf?dl=0.
Other useful resources
Introduction to Information Theory Tutorial, from Cris Moore at the Santa Fe Institute. Available at: https://www.youtube.com/playlist?list=PLF0b3ThojznQAEXlZQmbTTFNH96i2iZPC.
Chodrow, Philip. (2017) Divergence, Entropy, Information: An Opinionated Introduction to Information Theory. arXiv. Available at: https://arxiv.org/abs/1708.07459.
Olah, Christopher. (2015). Visual Information Theory. Available at: http://colah.github.io/posts/2015-09-Visual-Information/.
Information Processing and Learning -- Course from the Machine Learning Department at Carnegie Mellon. Available at http://www.cs.cmu.edu/~aarti/Class/10704/lecs.html.
Instructor
My name is Brennan Klein, and I am a postdoctoral researcher at the Network Science Institute at Northeastern, which is also where I received my PhD in 2020. I am broadly interested in foundational questions in Network Science and Complex Systems, from emergence and higher order structure to information theory and agency. My current research looks at how complex systems are able to represent, predict, and intervene on their surroundings across a number of different scales---all in ways that appear to minimize surprising states in the future. I believe that scientists have an obligation towards openness and curiosity, and this commitment often leads me into surprising collaborations on a wide range of topics. If you would like to learn more about my research, you can visit my website http://brennanklein.com/.
Week 1: Sep. 13
Introduction to Information Theory I: Course Introduction
Primary Readings:
McK: Chapter 1-3: Introduction to Information, Probability & Inference (page 3-64).
Supplementary Readings:
C&T: Chapter 2: Entropy, Relative Entropy, and Mutual Information (page 12-49).
L: Chapter 2: Information Theory (page 5-17).
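To make the week's central quantities concrete before diving into MacKay's exercises, here is a small, self-contained Python sketch (illustrative only; not part of the assigned reading) computing Shannon entropy and mutual information for toy distributions:

```python
import math

def entropy(p):
    """Shannon entropy H(X) in bits of a discrete distribution p."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def mutual_information(joint):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), from a joint probability table."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    pxy = [p for row in joint for p in row]
    return entropy(px) + entropy(py) - entropy(pxy)

print(entropy([0.5, 0.5]))  # 1.0: a fair coin carries exactly one bit
# Perfectly correlated pair: I(X;Y) equals the full entropy H(X).
print(mutual_information([[0.5, 0.0], [0.0, 0.5]]))  # 1.0
```

An independent pair, e.g. the uniform joint table [[0.25, 0.25], [0.25, 0.25]], gives zero mutual information, matching the intuition from McK Chapter 2.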
Week 2: Sep. 20
Introduction to Information Theory II: Data Compression
Primary Readings:
McK: Chapter 2 & 4-7: Probability, Entropy & Inference (page 22-47) and Data Compression (page 65-136).
Part I (page 1-18) of: Shannon, C.E. (1948). A mathematical theory of communication. The Bell System Technical Journal, 27(3), 379-423. (Full text available at: https://people.math.harvard.edu/~ctm/home/text/others/shannon/entropy/entropy.pdf).
Supplementary Readings:
C&T: Chapter 5.1-5.6 Data Compression (page 78-93).
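As a hands-on companion to the data-compression chapters, the following sketch builds a Huffman code for a toy source; the symbol frequencies are made up for illustration, and because they are dyadic the expected codeword length exactly matches the source entropy of 1.75 bits:

```python
import heapq

def huffman_code(freqs):
    """Build a Huffman code (symbol -> bitstring) from symbol frequencies."""
    # Heap entries carry a unique counter so ties never compare the dicts.
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        # Prefix 0 to one subtree's codewords and 1 to the other's.
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, count, merged))
        count += 1
    return heap[0][2]

freqs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
code = huffman_code(freqs)
avg_len = sum(freqs[s] * len(code[s]) for s in freqs)
print(code, avg_len)  # avg_len == 1.75, the entropy of this dyadic source
```

For non-dyadic sources the Huffman average length exceeds the entropy by less than one bit per symbol, which is exactly the source coding theorem's guarantee.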
Week 3: Sep. 27
Noisy Channel Coding Theorem
Primary Readings:
McK: Chapter 8-10: Noisy Channel Coding (page 137-176).
Part II-IV (page 18-51) of: Shannon, C.E. (1948). A mathematical theory of communication. The Bell System Technical Journal, 27(3), 379-423. (Full text available at: https://people.math.harvard.edu/~ctm/home/text/others/shannon/entropy/entropy.pdf).
Supplementary Readings:
C&T: Chapter 8: Channel Capacity and Noisy Channel Coding (page 183-202).
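The week's headline result can be made tangible with one line of arithmetic: the capacity of the binary symmetric channel with flip probability f is C = 1 - H_2(f). A minimal sketch (illustrative only):

```python
import math

def binary_entropy(f):
    """H_2(f): entropy in bits of a Bernoulli(f) variable."""
    if f in (0.0, 1.0):
        return 0.0
    return -f * math.log2(f) - (1 - f) * math.log2(1 - f)

def bsc_capacity(f):
    """Capacity (bits per channel use) of the binary symmetric channel."""
    return 1.0 - binary_entropy(f)

print(bsc_capacity(0.0))            # 1.0: noiseless channel
print(bsc_capacity(0.5))            # 0.0: pure noise, nothing gets through
print(round(bsc_capacity(0.1), 3))  # 0.531: most of the bit survives 10% noise
```

The noisy-channel coding theorem says rates below this capacity are achievable with vanishing error probability, the central claim of McK Chapters 8-10.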
Week 4: Oct. 4
Asymptotic Equipartition Property & Stochastic Processes
Primary Readings:
C&T: Chapter 3: Asymptotic Equipartition Property (page 50-59).
C&T: Chapter 4: Entropy Rate of a Stochastic Process (page 60-77).
Supplementary Readings:
L: Chapter 3: Information Theory for Lattice Systems (page 21-46).
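A possible weekly deliverable along the lines of these chapters: estimate the entropy rate of a Markov chain numerically. The sketch below (toy transition matrix chosen for illustration) finds the stationary distribution by power iteration and averages the per-state transition entropies:

```python
import math

def entropy_rate(P, iters=1000):
    """Entropy rate (bits/step) of a stationary Markov chain with transition
    matrix P (rows sum to 1): find the stationary distribution mu by power
    iteration, then return H = sum_i mu_i * H(P[i])."""
    n = len(P)
    mu = [1.0 / n] * n
    for _ in range(iters):
        mu = [sum(mu[i] * P[i][j] for i in range(n)) for j in range(n)]
    row_entropy = lambda row: -sum(p * math.log2(p) for p in row if p > 0)
    return sum(mu[i] * row_entropy(P[i]) for i in range(n))

# A sticky two-state chain: the more deterministic, the lower the rate.
P = [[0.9, 0.1],
     [0.1, 0.9]]
print(round(entropy_rate(P), 3))  # 0.469, well below an i.i.d. coin's 1 bit
```

By the AEP, typical length-n trajectories of this chain number roughly 2^(0.469 n), far fewer than the 2^n possible binary strings.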
Week 5: Oct. 11
Geometric Information Theory
Primary Readings:
L: Chapter 6: Geometric Information Theory & Spatial Coarse Graining (page 91-104).
Watch online: Introduction to Renormalization, by Simon DeDeo; Lectures 1, 2, 3, 17, 18, 19, 20, 23, & 24. Available at: https://www.youtube.com/playlist?list=PLF0b3ThojznTzAA7bfLWh4RKzRrwNF4L0.
Supplementary Readings:
Eriksson, Karl-Erik & Lindgren, Kristian (1987). Structural information in self-organizing systems. Physica Scripta, 35(3), 388. Available at: https://www.dropbox.com/s/d39aomu2bdzr2fn/Eriksson-StructuralInformation_SelfOrg.pdf?dl=0.
Week 6: Oct. 18
Inference, Clustering, and Communities: Part I
Primary Readings:
Rosvall, M. & Bergstrom, C.T. (2008). Maps of random walks on complex networks reveal community structure. Proceedings of the National Academy of Sciences, 105(4), 1118-1123. Available at https://www.dropbox.com/s/kmq8ybjodveiwxa/RosvallBergstrom-Infomap.pdf?dl=0.
McK: Chapter 20-23: Probabilities and Inference (page 281-318).
Supplementary Readings:
Look through recent developments and selected extensions of the map-equation framework: https://www.mapequation.org/publications.html.
Week 7: Oct. 25
Inference, Clustering, and Communities: Part II
Primary Readings:
Peixoto, Tiago (2021). Descriptive vs. inferential community detection. arXiv. Available at: https://www.dropbox.com/s/n7vctfetlh3pwpl/Peixoto-DescriptiveInferential.pdf?dl=0.
McK: Chapter 24 & 28: Exact Marginalization (page 319-323) & Model Comparison and Occam's Razor (page 343-356).
Supplementary Readings:
Download and reproduce some of the graph-tool tutorials. Available at: https://graph-tool.skewed.de/.
Peixoto, Tiago. (2019). Bayesian Stochastic Blockmodeling. arXiv. Available at: https://arxiv.org/pdf/1705.10225.pdf.
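MacKay's Chapter 28 argument can be reproduced in a few lines: compare a fair-coin model against a biased-coin model with a uniform prior on its bias, using the marginal likelihood (evidence) of each. The data below are a toy illustration of the automatic Occam's razor, not taken from the readings:

```python
from math import comb

def evidence_fair(k, n):
    """P(D | fair coin): every length-n sequence has probability 2^-n."""
    return 0.5 ** n

def evidence_biased(k, n):
    """P(D | biased coin, uniform prior on its bias p):
    integral of p^k (1-p)^(n-k) dp = 1 / ((n+1) * C(n, k))."""
    return 1.0 / ((n + 1) * comb(n, k))

# Bayes factor (biased vs. fair) for k heads in 10 flips.
for k in (5, 8):
    bf = evidence_biased(k, 10) / evidence_fair(k, 10)
    print(k, round(bf, 2))  # prints "5 0.37" then "8 2.07"
```

For unremarkable data (5/10) the simpler fair-coin model wins despite the biased model containing it as a special case: the flexible model spreads its prior probability over many datasets it never sees, which is the Occam penalty.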
Week 8: Nov. 1
Neural Networks I
Primary Readings:
McK: Chapter 38-40: Neural Networks (page 467-491).
Supplementary Readings:
Hopfield, John J. (1982). Neural networks and physical systems with emergent collective computational abilities. Proceedings of the National Academy of Sciences, 79(8), 2554-2558. Available at: https://www.dropbox.com/s/lb0htzwn1hvck93/Hopfield-NeuralNetworks.pdf?dl=0.
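To accompany the Hopfield paper, here is a minimal pure-Python sketch of Hebbian storage and recall of a single pattern (the network size and noise level are arbitrary choices for illustration):

```python
import random

def train_hopfield(patterns):
    """Hebbian rule: W[i][j] = (1/N) * sum over patterns of p[i]*p[j],
    with zero self-coupling on the diagonal."""
    N = len(patterns[0])
    W = [[0.0] * N for _ in range(N)]
    for p in patterns:
        for i in range(N):
            for j in range(N):
                if i != j:
                    W[i][j] += p[i] * p[j] / N
    return W

def recall(W, state, steps=10):
    """Synchronous sign updates of +/-1 states; for small memory loads
    the dynamics flow to the nearest stored pattern."""
    for _ in range(steps):
        state = [1 if sum(w * s for w, s in zip(row, state)) >= 0 else -1
                 for row in W]
    return state

random.seed(0)
pattern = [random.choice([-1, 1]) for _ in range(16)]
W = train_hopfield([pattern])
noisy = [-s if i < 3 else s for i, s in enumerate(pattern)]  # flip three bits
print(recall(W, noisy) == pattern)  # True: the stored memory is recovered
```

The stored pattern acts as an attractor of the dynamics, which is Hopfield's "emergent collective computation" in miniature.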
Week 9: Nov. 8
Neural Networks II
Primary Readings:
McK: Chapter 41-44: Hopfield & Boltzmann Machines (page 505-526).
L: Chapter 5: Physics and Information Theory (page 71-86).
Supplementary Readings:
Heins, C., Klein, B., Demekas, D., Aguilera, M., & Buckley, C. (2022). Spin glass systems as collective active inference. International Workshop on Active Inference. Available at: https://arxiv.org/abs/2207.06970.
Week 10: Nov. 15
Neural Networks III
Primary Readings:
McK: Chapter 41-44: Hopfield & Boltzmann Machines (page 505-526).
L: Chapter 5: Physics and Information Theory (page 71-86).
Supplementary Readings:
Heins, C., Klein, B., Demekas, D., Aguilera, M., & Buckley, C. (2022). Spin glass systems as collective active inference. International Workshop on Active Inference. Available at: https://arxiv.org/abs/2207.06970.
Week 11: Nov. 22
Class postponed.
Week 12: Nov. 29
Emergence and Information Decomposition: I
Primary Readings:
Mediano, P.A., Rosas, F.E., et al. (2022). Greater than the parts: A review of the information decomposition approach to causal emergence. Philosophical Transactions of the Royal Society A, 380(2227), 20210246. Available at: https://royalsocietypublishing.org/doi/full/10.1098/rsta.2021.0246.
Rosas et al. (2020). Reconciling emergences: An information-theoretic approach to identify causal emergence in multivariate data. PLOS Computational Biology 16(12): e1008289. Available at: https://doi.org/10.1371/journal.pcbi.1008289.
Week 13: Dec. 6
Emergence and Information Decomposition: II
Primary Readings:
Rosas et al. (2020). Reconciling emergences: An information-theoretic approach to identify causal emergence in multivariate data. PLOS Computational Biology 16(12): e1008289. Available at: https://doi.org/10.1371/journal.pcbi.1008289.
Borge-Holthoefer, J., Perra, N., Gonçalves, B., González-Bailón, S., Arenas A., Moreno, Y. & Vespignani, A. (2016). The dynamics of information-driven coordination phenomena: A transfer entropy analysis. Science Advances, 2(4). Available at: https://www.science.org/doi/pdf/10.1126/sciadv.1501158.
Supplementary Readings:
Supporting Materials for: The dynamics of information-driven coordination phenomena: A transfer entropy analysis. Available at: https://www.dropbox.com/s/vxgwi008ze0weru/BorgeHolthoefer_TransferEntropy_SM.pdf?dl=0.
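A small plug-in estimator of transfer entropy, the quantity at the heart of the Borge-Holthoefer et al. paper, can serve as a starting point for a weekly deliverable. This sketch (history length 1, binary series, synthetic data) measures how much y's past tells us about x's next state beyond x's own past:

```python
import math
import random
from collections import Counter

def transfer_entropy(x, y):
    """Plug-in estimate (bits) of transfer entropy T_{Y->X} with history 1:
    sum over (x1, x0, y0) of p(x1,x0,y0) * log2[ p(x1|x0,y0) / p(x1|x0) ]."""
    triples = Counter(zip(x[1:], x[:-1], y[:-1]))
    pairs_xy = Counter(zip(x[:-1], y[:-1]))
    pairs_xx = Counter(zip(x[1:], x[:-1]))
    singles = Counter(x[:-1])
    n = len(x) - 1
    te = 0.0
    for (x1, x0, y0), c in triples.items():
        p_cond_full = c / pairs_xy[(x0, y0)]
        p_cond_self = pairs_xx[(x1, x0)] / singles[x0]
        te += (c / n) * math.log2(p_cond_full / p_cond_self)
    return te

# Synthetic data where y drives x with a one-step lag: x copies y's last value.
random.seed(1)
y = [random.randint(0, 1) for _ in range(10000)]
x = [0] + y[:-1]
print(transfer_entropy(x, y))  # close to 1 bit: y's past determines x's next state
print(transfer_entropy(y, x))  # close to 0: x's past adds nothing about y
```

The asymmetry of the two estimates is the point: unlike mutual information, transfer entropy is directional and can reveal who drives whom.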
Week 14: Dec. 13
Information Decomposition, Emergence, & the Anatomy of a Bit
Primary Readings:
Timme, N., Alford, W., Flecker, B., et al. (2014). Synergy, redundancy, and multivariate information measures: An experimentalist's perspective. Journal of Computational Neuroscience, 36, 119-140. Available at: https://doi.org/10.1007/s10827-013-0458-4.
Supplementary Readings:
Vijayaraghavan, V.S., James, R.G. & Crutchfield, J.P. (2017). Anatomy of a spin: The information-theoretic structure of classical spin systems. Entropy, 19(5), 214. Available at: https://doi.org/10.3390/e19050214.
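The synergy/redundancy distinction in the Timme et al. review has a canonical toy case: XOR, where neither input alone carries information about the output but the pair determines it completely. A short sketch (illustrative only):

```python
import math

def H(p):
    """Entropy in bits of a distribution given as {outcome: probability}."""
    return -sum(v * math.log2(v) for v in p.values() if v > 0)

def marginal(p, idx):
    """Marginalize a joint distribution over tuple outcomes onto indices idx."""
    out = {}
    for states, v in p.items():
        key = tuple(states[i] for i in idx)
        out[key] = out.get(key, 0.0) + v
    return out

def mi(p, a, b):
    """I(A;B) = H(A) + H(B) - H(A,B) for index groups a and b."""
    return H(marginal(p, a)) + H(marginal(p, b)) - H(marginal(p, a + b))

# XOR: Z = X ^ Y with X, Y independent fair bits.
p = {(x, y, x ^ y): 0.25 for x in (0, 1) for y in (0, 1)}
print(mi(p, (0,), (2,)))    # 0.0: X alone says nothing about Z
print(mi(p, (1,), (2,)))    # 0.0: neither does Y alone
print(mi(p, (0, 1), (2,)))  # 1.0: together they determine Z, pure synergy
```

The whole bit of I(X,Y;Z) here is synergistic; partial information decomposition, as surveyed in the reading, is the attempt to make that accounting general.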