Ex | Learning Dynamics

import numpy as np
import matplotlib.pyplot as plt

Task 1 | Social dilemma flows

Visualize the flow plots for all four social dilemma environments we discussed in the course: Tragedy (Prisoner’s Dilemma), Divergence (Chicken), Coordination (Stag Hunt), and Comedy (Harmony).

You can use the pyCRLD environment SocialDilemma by importing

from pyCRLD.Environments.SocialDilemma import SocialDilemma

# ...
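As a starting point, the flow in each of the four dilemmas can be sketched without pyCRLD. The block below is a minimal stand-in that uses two-population replicator dynamics rather than pyCRLD's learners or flow-plot utilities. The payoff names `R`, `T`, `S`, `P` (reward, temptation, sucker's payoff, punishment), the specific values in `DILEMMAS`, and the helper `flow` are illustrative assumptions, not part of the exercise.

```python
import numpy as np

# Assumed illustrative parameters, normalized to R = 1 and P = 0.
DILEMMAS = {
    "Tragedy (Prisoner's Dilemma)": dict(T=1.2, S=-0.5),
    "Divergence (Chicken)": dict(T=1.2, S=0.5),
    "Coordination (Stag Hunt)": dict(T=0.8, S=-0.5),
    "Comedy (Harmony)": dict(T=0.8, S=0.5),
}

def flow(T, S, R=1.0, P=0.0, n=10):
    """Replicator flow on a grid of cooperation probabilities (hypothetical helper).

    x is agent 1's probability to cooperate, y is agent 2's.
    """
    grid = np.linspace(0.05, 0.95, n)
    x, y = np.meshgrid(grid, grid)
    # Payoff advantage of cooperating over defecting, given the other agent's policy
    dx = x * (1 - x) * (y * (R - T) + (1 - y) * (S - P))
    dy = y * (1 - y) * (x * (R - T) + (1 - x) * (S - P))
    return x, y, dx, dy

flows = {name: flow(**params) for name, params in DILEMMAS.items()}
```

Each entry of `flows` can be passed to `plt.quiver(x, y, dx, dy)` on its own axis to obtain a rough flow plot for comparison with the pyCRLD result.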

Task 2 | Critical transition

We consider the following model: Two agents can either cooperate or defect. A cooperator contributes a benefit \(b\), which all agents receive. However, a cooperator must pay \(c\) for the contribution. A defector does not contribute and does not pay a cost. Thus, the payoff matrix is

	Cooperate	Defect
Cooperate	\(2b-c\) , \(2b-c\)	\(b-c\), \(b\)
Defect	\(b\), \(b-c\)	\(0, 0\)

Let us re-normalize the payoffs: divide all payoffs by \(b\) and express them in terms of the cost-to-benefit ratio \(r = c/b\).

	Cooperate	Defect
Cooperate	\(2-r\) , \(2-r\)	\(1-r\), \(1\)
Defect	\(1\), \(1-r\)	\(0, 0\)
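The normalized matrix can be written down directly in NumPy. The following is a small sketch, with `payoff_matrix` being a hypothetical helper name, that also computes the payoff advantage of cooperating over defecting against each opponent action.

```python
import numpy as np

def payoff_matrix(r):
    """Row agent's normalized payoffs; rows and columns are ordered (Cooperate, Defect)."""
    return np.array([[2.0 - r, 1.0 - r],
                     [1.0, 0.0]])

def cooperation_advantage(r):
    """Payoff of cooperating minus payoff of defecting, per opponent action."""
    A = payoff_matrix(r)
    return A[0] - A[1]

for r in (0.5, 1.5):
    print(r, cooperation_advantage(r))
```

The advantage equals \(1 - r\) against either opponent action, so cooperation is the dominant action for \(r < 1\) and defection for \(r > 1\). This suggests looking for the transition around \(r = 1\).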

Simulate the reinforcement learning dynamics in the game from 25 random initial joint policies for values of \(r\) in the range \([0.5, 1.5]\). Record the final joint policy for each initial policy and plot the critical transition from defection to cooperation as a function of \(r\). Also, visualize how long, on average, it takes the agents to reach the final joint policy, and show that the dynamics exhibit a critical slowing down near the transition.

# ...
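One possible shape for such an experiment is sketched below. It is not the pyCRLD solution: it substitutes Euler-integrated two-population replicator dynamics for the reinforcement learning dynamics, and the function name `simulate`, the step size `dt`, the convergence threshold `eps`, and the step cap `max_steps` are all assumptions chosen for illustration.

```python
import numpy as np

def simulate(r, n_init=25, dt=0.1, eps=0.01, max_steps=5000, seed=0):
    """Run replicator dynamics from n_init random joint policies (hypothetical helper).

    Returns the final cooperation probabilities, shape (n_init, 2), and the
    number of steps each run needed to get within eps of a pure joint policy.
    Runs that never get there are reported with max_steps.
    """
    rng = np.random.default_rng(seed)
    # Column 0: agent 1's cooperation probability, column 1: agent 2's
    p = rng.uniform(0.05, 0.95, size=(n_init, 2))
    steps = np.full(n_init, max_steps)
    for t in range(max_steps):
        done = np.all((p < eps) | (p > 1 - eps), axis=1)
        steps[done & (steps == max_steps)] = t  # record first arrival only
        if done.all():
            break
        q = p[:, ::-1]  # the other agent's cooperation probability
        u_coop = q * (2 - r) + (1 - q) * (1 - r)
        u_defect = q * 1.0
        p = np.clip(p + dt * p * (1 - p) * (u_coop - u_defect), 0.0, 1.0)
    return p, steps

r_values = np.linspace(0.5, 1.5, 21)
final_coop, mean_steps = [], []
for r in r_values:
    p, steps = simulate(r)
    final_coop.append(p.mean())        # average final cooperation probability
    mean_steps.append(steps.mean())    # average time to reach a pure policy
final_coop, mean_steps = np.array(final_coop), np.array(mean_steps)
```

Plotting `final_coop` and `mean_steps` against `r_values` with `plt.plot` gives the two requested curves. In this stand-in, the convergence time grows as \(r\) approaches 1 and hits the `max_steps` cap at \(r = 1\), where the payoff advantage vanishes.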