9  References

Arthur, W. B. (1994). Inductive reasoning and bounded rationality. The American Economic Review, 84(2), 406–411.
Arthur, W. B. (2014). Complexity and the Economy. Oxford University Press.
Axelrod, R. (1984). The Evolution of Cooperation. Basic Books.
Axelrod, R., & Hamilton, W. D. (1981). The Evolution of Cooperation. Science, 211(4489), 1390–1396. https://doi.org/10.1126/science.7466396
Barfuss, W. (2020). Reinforcement Learning Dynamics in the Infinite Memory Limit. Proceedings of the 19th International Conference on Autonomous Agents and MultiAgent Systems, 1768–1770.
Barfuss, W. (2022). Dynamical systems as a level of cognitive analysis of multi-agent learning. Neural Computing and Applications, 34(3), 1653–1671. https://doi.org/10.1007/s00521-021-06117-0
Barfuss, W., Donges, J. F., & Kurths, J. (2019). Deterministic limit of temporal difference reinforcement learning for stochastic games. Physical Review E, 99(4), 043305. https://doi.org/10.1103/PhysRevE.99.043305
Barfuss, W., Donges, J. F., Vasconcelos, V. V., Kurths, J., & Levin, S. A. (2020). Caring for the future can turn tragedy into comedy for long-term collective action under risk of collapse. Proceedings of the National Academy of Sciences, 117(23), 12915–12922. https://doi.org/10.1073/pnas.1916545117
Berner, C., Brockman, G., Chan, B., Cheung, V., Dębiak, P., Dennison, C., Farhi, D., Fischer, Q., Hashme, S., Hesse, C., Józefowicz, R., Gray, S., Olsson, C., Pachocki, J., Petrov, M., Pinto, H. P. d. O., Raiman, J., Salimans, T., Schlatter, J., … Zhang, S. (2019). Dota 2 with Large Scale Deep Reinforcement Learning (arXiv:1912.06680). arXiv. https://doi.org/10.48550/arXiv.1912.06680
Bialek, W., Cavagna, A., Giardina, I., Mora, T., Silvestri, E., Viale, M., & Walczak, A. M. (2012). Statistical mechanics for natural flocks of birds. Proceedings of the National Academy of Sciences, 109(13), 4786–4791. https://doi.org/10.1073/pnas.1118633109
Botvinick, M., Wang, J. X., Dabney, W., Miller, K. J., & Kurth-Nelson, Z. (2020). Deep Reinforcement Learning and Its Neuroscientific Implications. Neuron, 107(4), 603–616. https://doi.org/10.1016/j.neuron.2020.06.014
Brush, E. R., Krakauer, D. C., & Flack, J. C. (2018). Conflicts of interest improve collective computation of adaptive social structures. Science Advances, 4(1), e1603311. https://doi.org/10.1126/sciadv.1603311
Buckley, C. L., Kim, C. S., McGregor, S., & Seth, A. K. (2017). The free energy principle for action and perception: A mathematical review. Journal of Mathematical Psychology, 81, 55–79. https://doi.org/10.1016/j.jmp.2017.09.004
Buhl, J., Sumpter, D. J. T., Couzin, I. D., Hale, J. J., Despland, E., Miller, E. R., & Simpson, S. J. (2006). From Disorder to Order in Marching Locusts. Science, 312(5778), 1402–1406. https://doi.org/10.1126/science.1125142
Bush, R. R., & Mosteller, F. (1951). A mathematical model for simple learning. Psychological Review, 58, 313–323. https://doi.org/10.1037/h0054388
Busoniu, L., Babuska, R., & De Schutter, B. (2008). A comprehensive survey of multiagent reinforcement learning. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 38(2), 156–172.
Camerer, C. F. (2011). Behavioral game theory: Experiments in strategic interaction. Princeton University Press.
Carroll, M., Shah, R., Ho, M. K., Griffiths, T., Seshia, S., Abbeel, P., & Dragan, A. (2019). On the utility of learning about humans for human-AI coordination. Advances in Neural Information Processing Systems, 32.
Christoffersen, P. J., Haupt, A. A., & Hadfield-Menell, D. (2022). Get it in writing: Formal contracts mitigate social dilemmas in multi-agent RL. arXiv Preprint arXiv:2208.10469.
Cohen, J. E. (1998). Cooperation and self-interest: Pareto-inefficiency of Nash equilibria in finite random games. Proceedings of the National Academy of Sciences, 95(17), 9724–9731. https://doi.org/10.1073/pnas.95.17.9724
Cross, J. G. (1973). A Stochastic Learning Model of Economic Behavior. The Quarterly Journal of Economics, 87(2), 239–266. https://doi.org/10.2307/1882186
Dafoe, A., Bachrach, Y., Hadfield, G., Horvitz, E., Larson, K., & Graepel, T. (2021). Cooperative AI: Machines must learn to find common ground. Nature, 593(7857), 33–36. https://doi.org/10.1038/d41586-021-01170-0
Daniels, B. C., Ellison, C. J., Krakauer, D. C., & Flack, J. C. (2016). Quantifying collectivity. Current Opinion in Neurobiology, 37, 106–113. https://doi.org/10.1016/j.conb.2016.01.012
Daniels, B. C., Krakauer, D. C., & Flack, J. C. (2017). Control of finite critical behaviour in a small-scale social system. Nature Communications, 8(1), 14301. https://doi.org/10.1038/ncomms14301
Daniels, B. C., Laubichler, M. D., & Flack, J. C. (2021). Introduction to the special issue: Quantifying collectivity. Theory in Biosciences, 140(4), 321–323. https://doi.org/10.1007/s12064-021-00358-2
Darriba, Á., & Waszak, F. (2018). Predictions through evidence accumulation over time. Scientific Reports, 8(1), 494. https://doi.org/10.1038/s41598-017-18802-z
Dawes, R. M. (1980). Social Dilemmas. Annual Review of Psychology, 31(1), 169–193. https://doi.org/10.1146/annurev.ps.31.020180.001125
Dayan, P., & Niv, Y. (2008). Reinforcement learning: The Good, The Bad and The Ugly. Current Opinion in Neurobiology, 18(2), 185–196. https://doi.org/10.1016/j.conb.2008.08.003
De Marzo, G., Gabrielli, A., Zaccaria, A., & Pietronero, L. (2022). Quantifying the unexpected: A scientific approach to Black Swans. Physical Review Research, 4(3), 033079. https://doi.org/10.1103/PhysRevResearch.4.033079
DeDeo, S., Krakauer, D. C., & Flack, J. C. (2010). Inductive game theory and the dynamics of animal conflict. PLoS Computational Biology, 6(5), e1000782.
Epstein, J. M., & Axtell, R. L. (1996). Growing Artificial Societies: Social Science From the Bottom Up (First Edition). Brookings Institution Press.
Erev, I., & Roth, A. E. (1998). Predicting How People Play Games: Reinforcement Learning in Experimental Games with Unique, Mixed Strategy Equilibria. The American Economic Review, 88(4), 848–881.
Meta Fundamental AI Research Diplomacy Team (FAIR), Bakhtin, A., Brown, N., Dinan, E., Farina, G., Flaherty, C., Fried, D., Goff, A., Gray, J., Hu, H., et al. (2022). Human-level play in the game of Diplomacy by combining language models with strategic reasoning. Science, 378(6624), 1067–1074.
Fehr, E., & Gächter, S. (2000). Cooperation and Punishment in Public Goods Experiments. American Economic Review, 90(4), 980–994. https://doi.org/10.1257/aer.90.4.980
Flack, J. C. (2017). Coarse-graining as a downward causation mechanism. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 375(2109), 20160338. https://doi.org/10.1098/rsta.2016.0338
Foerster, J., Assael, I. A., de Freitas, N., & Whiteson, S. (2016). Learning to Communicate with Deep Multi-Agent Reinforcement Learning. Advances in Neural Information Processing Systems, 29.
Franci, A., Golubitsky, M., Stewart, I., Bizyaeva, A., & Leonard, N. E. (2022). Breaking indecision in multi-agent, multi-option dynamics. arXiv Preprint arXiv:2206.14893.
Friston, K. (2018). Does predictive coding have a future? Nature Neuroscience, 21(8), 1019–1021. https://doi.org/10.1038/s41593-018-0200-7
Fudenberg, D., & Levine, D. K. (1998). The Theory of Learning in Games (K. Binmore, Ed.). MIT Press.
Grupen, N., Jaques, N., Kim, B., & Omidshafiei, S. (2022). Concept-based understanding of emergent multi-agent behavior. Deep Reinforcement Learning Workshop NeurIPS 2022. https://openreview.net/forum?id=zt5JpGQ8WhH
Gunawardena, J. (2022). Learning Outside the Brain: Integrating Cognitive Science and Systems Biology. Proceedings of the IEEE, 1–23. https://doi.org/10.1109/JPROC.2022.3162791
Hauert, C. (2002). Effects of space in 2 × 2 games. International Journal of Bifurcation and Chaos, 12(07), 1531–1548. https://doi.org/10.1142/S0218127402005273
Hauert, C., & Doebeli, M. (2004). Spatial structure often inhibits the evolution of cooperation in the snowdrift game. Nature, 428(6983), 643–646. https://doi.org/10.1038/nature02360
Hauert, C., Michor, F., Nowak, M. A., & Doebeli, M. (2006). Synergy and discounting of cooperation in social dilemmas. Journal of Theoretical Biology, 239(2), 195–202. https://doi.org/10.1016/j.jtbi.2005.08.040
Hauser, O. P., Hilbe, C., Chatterjee, K., & Nowak, M. A. (2019). Social dilemmas among unequals. Nature, 572(7770), 524–527.
Heins, C., Millidge, B., Costa, L. da, Mann, R., Friston, K., & Couzin, I. (2023). Collective behavior from surprise minimization. arXiv. http://arxiv.org/abs/2307.14804
Hernandez-Leal, P., Kartal, B., & Taylor, M. E. (2019). A survey and critique of multiagent deep reinforcement learning. Autonomous Agents and Multi-Agent Systems, 33(6), 750–797. https://doi.org/10.1007/s10458-019-09421-1
Hilbe, C., Chatterjee, K., & Nowak, M. A. (2018). Partners and rivals in direct reciprocity. Nature Human Behaviour. https://doi.org/10.1038/s41562-018-0320-9
Hilbe, C., Šimsa, Š., Chatterjee, K., & Nowak, M. A. (2018). Evolution of cooperation in stochastic games. Nature, 559(7713), 246–249. https://doi.org/10.1038/s41586-018-0277-x
Hofbauer, J., & Sigmund, K. (1998). Evolutionary Games and Population Dynamics (First). Cambridge University Press. https://doi.org/10.1017/CBO9781139173179
Hofbauer, J., & Sigmund, K. (2003). Evolutionary game dynamics. Bulletin of the American Mathematical Society, 40(4), 479–519. https://doi.org/10.1090/S0273-0979-03-00988-1
Holland, J. H., & Miller, J. H. (1991). Artificial Adaptive Agents in Economic Theory. The American Economic Review, 81(2), 365–370.
Hughes, E., Anthony, T. W., Eccles, T., Leibo, J. Z., Balduzzi, D., & Bachrach, Y. (2020). Learning to Resolve Alliance Dilemmas in Many-Player Zero-Sum Games. New Zealand, 10.
Jaynes, E. T., & Bretthorst, G. L. (2003). Probability theory: The logic of science. Cambridge University Press. http://www5.unitn.it/Biblioteca/it/Web/LibriElettroniciDettaglio/50847
Jhawar, J., Morris, R. G., Amith-Kumar, U. R., Danny Raj, M., Rogers, T., Rajendran, H., & Guttal, V. (2020). Noise-induced schooling of fish. Nature Physics, 16(4), 488–493. https://doi.org/10.1038/s41567-020-0787-y
Kempes, C. P., Wolpert, D., Cohen, Z., & Pérez-Mercader, J. (2017). The thermodynamic efficiency of computations made in cells across the range of life. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 375(2109), 20160343. https://doi.org/10.1098/rsta.2016.0343
Kleshnina, M., Hilbe, C., Šimsa, Š., Chatterjee, K., & Nowak, M. A. (2023). The effect of environmental information on evolution of cooperation in stochastic games. Nature Communications, 14(1), 4153. https://doi.org/10.1038/s41467-023-39625-9
Krakauer, D. C., Flack, J. C., Dedeo, S., Farmer, D., & Rockmore, D. (2010). Intelligent Data Analysis of Intelligent Systems. In P. R. Cohen, N. M. Adams, & M. R. Berthold (Eds.), Advances in Intelligent Data Analysis IX (pp. 8–17). Springer. https://doi.org/10.1007/978-3-642-13062-5_3
Krakauer, D., Bertschinger, N., Olbrich, E., Flack, J. C., & Ay, N. (2020). The information theory of individuality. Theory in Biosciences, 139(2), 209–223. https://doi.org/10.1007/s12064-020-00313-7
Leibo, J. Z., Dueñez-Guzman, E. A., Vezhnevets, A., Agapiou, J. P., Sunehag, P., Koster, R., Matyas, J., Beattie, C., Mordatch, I., & Graepel, T. (2021). Scalable evaluation of multi-agent reinforcement learning with Melting Pot. International Conference on Machine Learning, 6187–6199.
Leibo, J. Z., Zambaldi, V., Lanctot, M., Marecki, J., & Graepel, T. (2017). Multi-agent Reinforcement Learning in Sequential Social Dilemmas. Proceedings of the 16th Conference on Autonomous Agents and MultiAgent Systems, 464–473.
Leonardos, S., & Piliouras, G. (2021). Exploration-Exploitation in Multi-Agent Learning: Catastrophe Theory Meets Game Theory. Proceedings of the AAAI Conference on Artificial Intelligence, 35(13), 11263–11271. https://doi.org/10.1609/aaai.v35i13.17343
Levin, S. (2002). Complex adaptive systems: Exploring the known, the unknown and the unknowable. Bulletin of the American Mathematical Society, 40(1), 3–19. https://doi.org/10.1090/S0273-0979-02-00965-5
Littman, M. L. (1994). Markov games as a framework for multi-agent reinforcement learning. In W. W. Cohen & H. Hirsh (Eds.), Machine Learning Proceedings 1994 (pp. 157–163). Morgan Kaufmann. https://doi.org/10.1016/B978-1-55860-335-6.50027-1
Lovering, C., Forde, J., Konidaris, G., Pavlick, E., & Littman, M. (2022). Evaluation beyond task performance: Analyzing concepts in AlphaZero in Hex. In S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, & A. Oh (Eds.), Advances in Neural Information Processing Systems (Vol. 35, pp. 25992–26006). Curran Associates, Inc. https://proceedings.neurips.cc/paper_files/paper/2022/file/a705747417d32ebf1916169e1a442274-Paper-Conference.pdf
Lupu, A., & Precup, D. (2020). Gifting in multi-agent reinforcement learning. Proceedings of the 19th International Conference on Autonomous Agents and Multiagent Systems, 789–797.
Marden, J. R., & Shamma, J. S. (2018). Game theory and control. Annual Review of Control, Robotics, and Autonomous Systems, 1, 105–134.
McAvoy, A., Mori, Y., & Plotkin, J. B. (2022). Selfish optimization and collective learning in populations. Physica D: Nonlinear Phenomena, 439, 133426. https://doi.org/10.1016/j.physd.2022.133426
McGrath, T., Kapishnikov, A., Tomašev, N., Pearce, A., Wattenberg, M., Hassabis, D., Kim, B., Paquet, U., & Kramnik, V. (2022). Acquisition of chess knowledge in AlphaZero. Proceedings of the National Academy of Sciences, 119(47), e2206625119.
McNamara, J. M. (2013). Towards a richer evolutionary game theory. Journal of The Royal Society Interface, 10(88), 20130544. https://doi.org/10.1098/rsif.2013.0544
McNamara, J. M., Houston, A. I., & Leimar, O. (2021). Learning, exploitation and bias in games. PLOS ONE, 16(2), e0246588. https://doi.org/10.1371/journal.pone.0246588
Mora, T., & Bialek, W. (2011). Are Biological Systems Poised at Criticality? Journal of Statistical Physics, 144(2), 268–302. https://doi.org/10.1007/s10955-011-0229-4
Newman, M. E. J. (2003). The Structure and Function of Complex Networks. SIAM Review, 45(2), 167–256. https://doi.org/10.1137/S003614450342480
Nowak, M. A. (2006). Evolutionary dynamics: Exploring the equations of life. Harvard University Press.
Ostrom, E., Walker, J., & Gardner, R. (1992). Covenants with and without a Sword: Self-Governance Is Possible. American Political Science Review, 86(2), 404–417. https://doi.org/10.2307/1964229
Park, S., Bizyaeva, A., Kawakatsu, M., Franci, A., & Leonard, N. E. (2021). Tuning cooperative behavior in games with nonlinear opinion dynamics. IEEE Control Systems Letters, 6, 2030–2035.
Poundstone, W. (2011). Prisoner’s Dilemma. Knopf Doubleday Publishing Group.
Press, W. H., & Dyson, F. J. (2012). Iterated prisoner’s dilemma contains strategies that dominate any evolutionary opponent. Proceedings of the National Academy of Sciences, 109(26), 10409–10413. https://doi.org/10.1073/pnas.1206569109
Ramos-Fernandez, G., Smith Aguilar, S. E., Krakauer, D. C., & Flack, J. C. (2020). Collective Computation in Animal Fission-Fusion Dynamics. Frontiers in Robotics and AI, 7. https://www.frontiersin.org/article/10.3389/frobt.2020.00090
Rao, R. P. N., & Ballard, D. H. (1999). Predictive coding in the visual cortex: A functional interpretation of some extra-classical receptive-field effects. Nature Neuroscience, 2(1), 79–87. https://doi.org/10.1038/4580
Rosas, F. E., Mediano, P. A. M., Gastpar, M., & Jensen, H. J. (2019). Quantifying high-order interdependencies via multivariate extensions of the mutual information. Physical Review E, 100(3), 032305. https://doi.org/10.1103/PhysRevE.100.032305
Roth, A. E., & Erev, I. (1995). Learning in extensive-form games: Experimental data and simple dynamic models in the intermediate term. Games and Economic Behavior, 8(1), 164–212. https://doi.org/10.1016/S0899-8256(05)80020-X
Sarfati, R., Hayes, J. C., & Peleg, O. (2021). Self-organization in natural swarms of Photinus carolinus synchronous fireflies. Science Advances, 7(28), eabg9259. https://doi.org/10.1126/sciadv.abg9259
Schultz, W., Dayan, P., & Montague, P. R. (1997). A Neural Substrate of Prediction and Reward. Science, 275(5306), 1593–1599. https://doi.org/10.1126/science.275.5306.1593
Schultz, W., Stauffer, W. R., & Lak, A. (2017). The phasic dopamine signal maturing: From reward via behavioural activation to formal economic utility. Current Opinion in Neurobiology, 43, 139–148. https://doi.org/10.1016/j.conb.2017.03.013
Shoham, Y., Powers, R., & Grenager, T. (2007). If multi-agent learning is the answer, what is the question? Artificial Intelligence, 171(7), 365–377. https://doi.org/10.1016/j.artint.2006.02.006
Sigmund, K. (2010). The Calculus of Selfishness. Princeton University Press. https://doi.org/10.1515/9781400832255
Silver, D., Huang, A., Maddison, C. J., Guez, A., Sifre, L., van den Driessche, G., Schrittwieser, J., Antonoglou, I., Panneershelvam, V., Lanctot, M., Dieleman, S., Grewe, D., Nham, J., Kalchbrenner, N., Sutskever, I., Lillicrap, T., Leach, M., Kavukcuoglu, K., Graepel, T., & Hassabis, D. (2016). Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587), 484–489. https://doi.org/10.1038/nature16961
Skyrms, B. (2004). The Stag Hunt and the Evolution of Social Structure. Cambridge University Press.
Stone, P., Kaminka, G., Kraus, S., & Rosenschein, J. (2010). Ad hoc autonomous agent teams: Collaboration without pre-coordination. Proceedings of the AAAI Conference on Artificial Intelligence, 24, 1504–1509.
Strouse, D., McKee, K. R., Botvinick, M. M., Hughes, E., & Everett, R. (2021). Collaborating with humans without human data. CoRR, abs/2110.08176. https://arxiv.org/abs/2110.08176
Sugden, R. (2004). The Economics of Rights, Co-operation and Welfare. Springer.
Sutton, R. S. (1988). Learning to predict by the methods of temporal differences. Machine Learning, 3(1), 9–44. https://doi.org/10.1007/BF00115009
Sutton, R. S., & Barto, A. G. (2018). Reinforcement learning: An introduction (Second edition). The MIT Press.
Cultural General Intelligence Team, Bhoopchand, A., Brownfield, B., Collister, A., Lago, A. D., Edwards, A., Everett, R., Frechette, A., Oliveira, Y. G., Hughes, E., Mathewson, K. W., Mendolicchio, P., Pawar, J., Pislar, M., Platonov, A., Senter, E., Singh, S., Zacherl, A., & Zhang, L. M. (2022). Learning robust real-time cultural transmission without human data. https://arxiv.org/abs/2203.00715
Tekin, E., Savage, V. M., & Yeh, P. J. (2017). Measuring higher-order drug interactions: A review of recent approaches. Current Opinion in Systems Biology, 4, 16–23. https://doi.org/10.1016/j.coisb.2017.05.015
Tekin, E., Yeh, P. J., & Savage, V. M. (2018). General Form for Interaction Measures and Framework for Deriving Higher-Order Emergent Effects. Frontiers in Ecology and Evolution, 6. https://www.frontiersin.org/articles/10.3389/fevo.2018.00166
Tuyls, K., Perolat, J., Lanctot, M., Hughes, E., Everett, R., Leibo, J. Z., Szepesvári, C., & Graepel, T. (2019). Bounds and dynamics for empirical game theoretic analysis. Autonomous Agents and Multi-Agent Systems, 34(1), 7. https://doi.org/10.1007/s10458-019-09432-y
Vinyals, O., Babuschkin, I., Czarnecki, W. M., Mathieu, M., Dudzik, A., Chung, J., Choi, D. H., Powell, R., Ewalds, T., Georgiev, P., Oh, J., Horgan, D., Kroiss, M., Danihelka, I., Huang, A., Sifre, L., Cai, T., Agapiou, J. P., Jaderberg, M., … Silver, D. (2019). Grandmaster level in StarCraft II using multi-agent reinforcement learning. Nature, 575(7782), 350–354. https://doi.org/10.1038/s41586-019-1724-z
Wang, W. Z., Beliaev, M., Bıyık, E., Lazar, D. A., Pedarsani, R., & Sadigh, D. (2021). Emergent Prosociality in Multi-Agent Games Through Gifting. Twenty-Ninth International Joint Conference on Artificial Intelligence, 1, 434–442. https://doi.org/10.24963/ijcai.2021/61
Wang, X., & Fu, F. (2020). Eco-evolutionary dynamics with environmental feedback: Cooperation in a changing world. Europhysics Letters, 132(1), 10001. https://doi.org/10.1209/0295-5075/132/10001
Wolfram, S. (1994). Cellular Automata and Complexity: Collected Papers (1st edition). Westview Press.
Wolpert, D. H. (2006). Information Theory - The Bridge Connecting Bounded Rational Game Theory and Statistical Physics. In D. Braha, A. A. Minai, & Y. Bar-Yam (Eds.), Complex Engineered Systems: Science Meets Technology (pp. 262–290). Springer. https://doi.org/10.1007/3-540-32834-3_12
Wolpert, D. H., Harré, M., Olbrich, E., Bertschinger, N., & Jost, J. (2012). Hysteresis effects of changing the parameters of noncooperative games. Physical Review E, 85(3), 036102. https://doi.org/10.1103/PhysRevE.85.036102
Zinkevich, M., Greenwald, A., & Littman, M. (2005). Cyclic Equilibria in Markov Games. Advances in Neural Information Processing Systems, 18. https://proceedings.neurips.cc/paper/2005/hash/9752d873fa71c19dc602bf2a0696f9b5-Abstract.html