Authors
Maryam Solaiman1, Theodore Mui1, Qi Wang2, and Phil Mui3
1Aspiring Scholars Directed Research Program, Fremont, USA; 2University of Texas at Austin, Texas, USA; 3Salesforce, San Francisco, USA
Abstract
We model unlearning by simulating a Q-agent (trained with the reinforcement learning Q-learning algorithm), representing a real-world learner, that plays the game of Nim against different adversarial agents in order to learn the optimal Nim strategy. When the Q-agent plays against sub-optimal agents, its percentage of optimal moves decreases, analogous to a person forgetting ("unlearning") what they have previously learned. To mitigate this "unlearning", we experimented with modulating the Q-learning updates so that minimal learning occurs when playing untrusted opponents. This trust-based modulation is driven by observing how often an opponent's moves differ from those the Q-agent has already learned; the model parallels human trust, which tends to increase toward those with whom one agrees. With this modulated learning, we observe that a Q-agent with a baseline optimal strategy robustly retains its previously learned strategy, in some cases achieving an optimal-move accuracy 0.3 higher than that of the unmodulated unlearning model. We then ran a three-phase simulation in which the Q-agent played against optimal agents in the first phase, sub-optimal agents in the second "unlearning" phase, and optimal or random agents in the third phase. We found that even after unlearning, the Q-agent was quickly able to relearn most of its knowledge of the optimal strategy for Nim.
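The trust-modulated Q-learning described above can be illustrated with a minimal sketch. The Python below is not the authors' implementation: the pile configuration, learning rate, trust increment (delta), and the specific rule that trust rises when an opponent's move matches the agent's own greedy move are illustrative assumptions inferred from the description in the abstract.

```python
import random
from collections import defaultdict

PILES = (1, 3, 5, 7)            # assumed normal-play Nim starting position
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1

Q = defaultdict(float)          # Q[(state, action)] -> estimated value

def legal_actions(state):
    # An action removes k stones (1 <= k <= pile size) from pile i.
    return [(i, k) for i, n in enumerate(state) for k in range(1, n + 1)]

def apply_move(state, action):
    i, k = action
    piles = list(state)
    piles[i] -= k
    return tuple(piles)

def greedy_action(state):
    acts = legal_actions(state)
    return max(acts, key=lambda a: Q[(state, a)]) if acts else None

def q_update(state, action, reward, next_state, trust):
    # Trust scales the effective learning rate, so an untrusted opponent
    # induces only minimal changes to the learned Q-values.
    future = max((Q[(next_state, a)] for a in legal_actions(next_state)), default=0.0)
    Q[(state, action)] += trust * ALPHA * (reward + GAMMA * future - Q[(state, action)])

def update_trust(trust, opp_state, opp_action, delta=0.05):
    # Assumed rule: trust rises when the opponent's move matches the move
    # the Q-agent itself would play in that state, and falls otherwise.
    if opp_action == greedy_action(opp_state):
        return min(1.0, trust + delta)
    return max(0.0, trust - delta)

def play_episode(opponent_policy, trust):
    """One game of normal-play Nim; the Q-agent moves first."""
    state = PILES
    while True:
        acts = legal_actions(state)
        action = random.choice(acts) if random.random() < EPSILON else greedy_action(state)
        mid_state = apply_move(state, action)
        if sum(mid_state) == 0:                     # Q-agent took the last stone: win
            q_update(state, action, 1.0, mid_state, trust)
            return trust
        opp_action = opponent_policy(mid_state)
        trust = update_trust(trust, mid_state, opp_action)
        next_state = apply_move(mid_state, opp_action)
        reward = -1.0 if sum(next_state) == 0 else 0.0
        q_update(state, action, reward, next_state, trust)
        if sum(next_state) == 0:                    # opponent took the last stone: loss
            return trust
        state = next_state

# Example: play against a random (sub-optimal, hence increasingly untrusted) opponent.
trust = 1.0
for _ in range(5000):
    trust = play_episode(lambda s: random.choice(legal_actions(s)), trust)
```

Under these assumptions, setting trust to a constant 1.0 recovers ordinary Q-learning and reproduces the "unlearning" behaviour against sub-optimal opponents, while the adaptive trust term suppresses updates once the opponent's moves diverge from the agent's learned policy.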
Keywords
Reinforcement learning, Q-learning, Nim Game, Unlearning, Learned Memory, Misinformation